Omdena Academy Courses
Big Data Analytics with PySpark
December 29, 2023
For whom is this course?
Spark is a “lightning-fast cluster computing” framework for Big Data. It provides a general-purpose data processing engine and lets you run programs up to 100x faster in memory, or 10x faster on disk, than Hadoop MapReduce.
This course is for data science enthusiasts who want to learn PySpark, the Python API for Spark programming, and its powerful higher-level libraries such as Spark SQL and MLlib (for machine learning).
At the end of this course, you will have gained an in-depth understanding of PySpark and its application to general Big Data analysis.
What will you learn?
You will learn the following topics in this course:
- PySpark Installation
- Introduction to Big Data analysis with Spark
- Programming with PySpark RDDs
- PySpark SQL & Data Frames
- Machine Learning with PySpark MLlib
Prerequisites
- Python
- Deep Learning Basics
- SQL
- Pandas (Data Frame)
Syllabus
- Introduction to Big Data analysis with Spark
- Programming with PySpark RDDs
- PySpark SQL & Data Frames
- Machine Learning with PySpark MLlib
Instructors
Course Info
Certificate: Yes
Duration: 15 hours
Start Date: August 14, 2022
Last Registration Date: August 9, 2022
No. of Students: 100
Skill Level: Intermediate