PySpark – Apache Spark Programming in Python for beginners
PySpark – Apache Spark Programming in Python for beginners is priced at $94.99. It has an average rating of 4.55 based on 9933 reviews, with 94 lectures, 9 quizzes, and 56853 subscribers.
You will learn about Apache Spark Foundation and Spark Architecture; Data Engineering and Data Processing in Spark; working with Data Sources and Sinks; working with Data Frames and Spark SQL; using the PyCharm IDE for Spark development and debugging; and unit testing, managing application logs, and cluster deployment. The course is aimed at software engineers and architects who want to design and develop big data engineering projects using Apache Spark, and at programmers and developers who want to grow into data engineering using Apache Spark.
Summary
Title: PySpark – Apache Spark Programming in Python for beginners
Price: $94.99
Average Rating: 4.55
Number of Lectures: 94
Number of Quizzes: 9
Number of Published Lectures: 93
Number of Published Quizzes: 9
Number of Curriculum Items: 103
Number of Published Curriculum Objects: 102
Original Price: $19.99
Quality Status: approved
Status: Live
What You Will Learn
- Apache Spark Foundation and Spark Architecture
- Data Engineering and Data Processing in Spark
- Working with Data Sources and Sinks
- Working with Data Frames and Spark SQL
- Using PyCharm IDE for Spark Development and Debugging
- Unit Testing, Managing Application Logs and Cluster Deployment
Who Should Attend
- Software engineers and architects who want to design and develop big data engineering projects using Apache Spark
- Programmers and developers who want to grow into data engineering using Apache Spark
This course does not require any prior knowledge of Apache Spark or Hadoop. It explains Spark Architecture and the fundamental concepts in enough detail to help you come up to speed and follow the rest of the content.
About the Course
I created the PySpark – Apache Spark Programming in Python for beginners course to help you understand Spark programming and apply that knowledge to build data engineering solutions. This course is example-driven and follows a working-session approach. We take a live-coding approach and explain the necessary concepts along the way.
Who should take this Course?
I designed this course for software engineers who want to develop data engineering pipelines and applications using Apache Spark. It is also for data architects and data engineers who are responsible for designing and building their organization's data-centric infrastructure. A third group is managers and architects who do not work on Spark implementation directly but work with the people who implement Apache Spark at the ground level.
Spark Version used in the Course
This course uses Apache Spark 3.5. I have tested all the source code and examples used in this course on Apache Spark 3.5 in the Databricks environment.
Course Curriculum
Chapter 1: Understanding Big Data and Data Lake
Lecture 1: Section Overview
Lecture 2: What is Big Data and How it Started
Lecture 3: Hadoop Architecture, History, and Evolution
Lecture 4: What is Data Lake and How it works
Lecture 5: Introducing Apache Spark and Databricks Cloud
Chapter 2: Installing and Using Apache Spark
Lecture 1: Section Overview
Lecture 2: Spark Development Environments
Lecture 3: Setup your Databricks Community Cloud Environment
Lecture 4: Introduction to Databricks Workspace
Lecture 5: Create your First Spark Application in Databricks Cloud
Lecture 6: Setup your Local Development IDE
Lecture 7: Mac Users – Setup your Local Development IDE
Lecture 8: Create your First Spark Application using IDE
Lecture 9: Source Code and Other Resources
Chapter 3: Getting Started with Apache Spark
Lecture 1: Micro Project – Problem Statement
Lecture 2: Introduction to Spark Data Frames
Lecture 3: Creating Spark Dataframe
Lecture 4: Creating Spark Tables
Lecture 5: Common problem with Databricks Community
Lecture 6: Working with Spark SQL
Lecture 7: Dataframe Transformations and Actions
Lecture 8: Applying Transformations
Lecture 9: Querying Spark Dataframe
Lecture 10: More Dataframe Transformations
Chapter 4: Spark Execution Model and Architecture
Lecture 1: Execution Methods – How to Run Spark Programs?
Lecture 2: Spark Distributed Processing Model – How your program runs?
Lecture 3: Spark Execution Modes and Cluster Managers
Lecture 4: Summarizing Spark Execution Models – When to use What?
Lecture 5: Working with PySpark Shell – Demo
Lecture 6: Installing Multi-Node Spark Cluster – Demo
Lecture 7: Working with Notebooks in Cluster – Demo
Lecture 8: Working with Spark Submit – Demo
Lecture 9: Section Summary
Chapter 5: Spark Programming Model and Developer Experience
Lecture 1: Creating Spark Project Build Configuration
Lecture 2: Configuring Spark Project Application Logs
Lecture 3: Creating Spark Session
Lecture 4: Configuring Spark Session
Lecture 5: Data Frame Introduction
Lecture 6: Data Frame Partitions and Executors
Lecture 7: Spark Transformations and Actions
Lecture 8: Spark Jobs Stages and Task
Lecture 9: Understanding your Execution Plan
Lecture 10: Unit Testing Spark Application
Lecture 11: Rounding off Summary
Chapter 6: Spark Structured API Foundation
Lecture 1: Introduction to Spark APIs
Lecture 2: Introduction to Spark RDD API
Lecture 3: Working with Spark SQL
Lecture 4: Spark SQL Engine and Catalyst Optimizer
Lecture 5: Section Summary
Chapter 7: Spark Data Sources and Sinks
Lecture 1: Spark Data Sources and Sinks
Lecture 2: Spark DataFrameReader API
Lecture 3: Reading CSV, JSON and Parquet files
Lecture 4: Creating Spark DataFrame Schema
Lecture 5: Spark DataFrameWriter API
Lecture 6: Writing Your Data and Managing Layout
Lecture 7: Spark Databases and Tables
Lecture 8: Working with Spark SQL Tables
Chapter 8: Spark Dataframe and Dataset Transformations
Lecture 1: Introduction to Data Transformation
Lecture 2: Working with Dataframe Rows
Lecture 3: DataFrame Rows and Unit Testing
Lecture 4: Dataframe Rows and Unstructured data
Lecture 5: Working with Dataframe Columns
Lecture 6: Creating and Using UDF
Lecture 7: Misc Transformations
Chapter 9: Aggregations in Apache Spark
Lecture 1: Aggregating Dataframes
Lecture 2: Grouping Aggregations
Lecture 3: Windowing Aggregations
Chapter 10: Spark Dataframe Joins
Lecture 1: Dataframe Joins and column name ambiguity
Lecture 2: Outer Joins in Dataframe
Lecture 3: Internals of Spark Join and shuffle
Lecture 4: Optimizing your joins
Lecture 5: Implementing Bucket Joins
Chapter 11: Capstone Project
Lecture 1: Project Scope and Background
Lecture 2: Data Transformation Requirement
Lecture 3: Setup your starter project
Lecture 4: Test your starter project
Lecture 5: Setup your source control and process
Lecture 6: Creating your Project CI CD Pipeline
Lecture 7: Develop Code
Lecture 8: Write Test Cases
Lecture 9: Working with Kafka integration
Lecture 10: Estimating resources for your application
Chapter 12: Keep Learning
Lecture 1: Final Word
Instructors
- Prashant Kumar Pandey: Architect, Author, Consultant, Trainer @ Learning Journal
- Learning Journal: Online Training Company
Rating Distribution
- 1 star: 95 votes
- 2 stars: 110 votes
- 3 stars: 815 votes
- 4 stars: 3561 votes
- 5 stars: 5352 votes
Frequently Asked Questions
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!