Omdena Academy Courses

Big Data Analytics with PySpark

December 29, 2023



For whom is this course?

Apache Spark is a “lightning-fast cluster computing” framework for Big Data. It provides a general-purpose data processing engine and can run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.

This course is for data science enthusiasts who want to use PySpark, the Python API for Spark, together with Spark’s powerful higher-level libraries such as Spark SQL and MLlib (for machine learning).

At the end of this course, you will have gained an in-depth understanding of PySpark and its application to general Big Data analysis.


What will you learn?

You will learn the following topics in this course:

  • PySpark Installation
  • Introduction to Big Data analysis with Spark
  • Programming with PySpark RDDs
  • PySpark SQL & DataFrames
  • Machine Learning with PySpark MLlib

Prerequisites

  • Python
  • Deep Learning Basics
  • SQL
  • Pandas (DataFrame)

Syllabus

  • Introduction to Big Data analysis with Spark
  • Programming with PySpark RDDs
  • PySpark SQL & DataFrames
  • Machine Learning with PySpark MLlib

Course Info

Certificate: Yes
Duration: 15 hours
Start Date: August 14, 2022
Last Registration Date: August 9, 2022
No. of Students: 100
Skill Level: Intermediate
