Big Data Analytics with PySpark

Start Date: August 14, 2022
Last date to register: August 9, 2022
Course duration: 15 hours
Cost: donation
Skill level: intermediate

Course Description

Who this course is for

Spark is a “lightning-fast cluster computing” framework for Big Data. It provides a general-purpose data processing engine and can run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.

This course is for data science enthusiasts who want to use PySpark, the Python API for Spark, along with its powerful higher-level libraries such as Spark SQL and MLlib (for machine learning).

At the end of this course, you will have gained an in-depth understanding of PySpark and its application to general Big Data analysis.

What you will learn

This course covers the following topics:

  • PySpark installation
  • Introduction to Big Data analysis with Spark
  • Programming with PySpark RDDs
  • PySpark SQL & DataFrames
  • Machine Learning with PySpark MLlib

Prerequisites

  • Python
  • Deep Learning basics
  • SQL
  • pandas (DataFrames)

Syllabus

  • Introduction to Big Data analysis with Spark
  • Programming with PySpark RDDs
  • PySpark SQL & DataFrames
  • Machine Learning with PySpark MLlib

Course Features

Lectures: Hands-on
Duration: 15 hours
Students: 100
Certificate: yes
Cost: donation
Skill: intermediate

Instructor

QASIM HASSAN
Machine Learning Engineer @Omdena
