Spark SQL and Spark 3 using Scala Hands-On with Labs
Spark SQL and Spark 3 using Scala Hands-On with Labs, available at $79.99, has an average rating of 4.32, with 232 lectures, based on 2858 reviews, and has 21723 subscribers.
You will learn about: all the HDFS commands relevant to validating files and folders in HDFS; enough Scala to work on Data Engineering projects using Scala as the programming language; Spark Dataframe APIs, including basic transformations such as projection, filtering, total aggregations, and aggregations by key, as well as inner and outer joins; and Spark SQL, covering the same transformations and joins in SQL-style syntax, basic DDL to create and manage tables, basic DML or CRUD operations, creating and managing partitioned tables, manipulating data using Spark SQL functions, and advanced analytical or windowing functions to perform aggregations and ranking. This course is ideal for any IT aspirant or professional willing to learn Data Engineering using Apache Spark, Python developers who want to learn Spark using Scala as an additional Data Engineering skill, and Java or Scala developers who want to learn Spark using Scala to add Data Engineering skills to their profile.
Summary
Title: Spark SQL and Spark 3 using Scala Hands-On with Labs
Price: $79.99
Average Rating: 4.32
Number of Lectures: 232
Number of Published Lectures: 232
Number of Curriculum Items: 232
Number of Published Curriculum Objects: 232
Original Price: $22.99
Quality Status: approved
Status: Live
What You Will Learn
- All the HDFS Commands that are relevant to validate files and folders in HDFS.
- Enough Scala to work Data Engineering Projects using Scala as Programming Language
- Spark Dataframe APIs to solve the problems using Dataframe style APIs.
- Basic Transformations such as Projection, Filtering, Total as well as Aggregations by Keys using Spark Dataframe APIs
- Inner as well as outer joins using Spark Data Frame APIs
- Ability to use Spark SQL to solve the problems using SQL style syntax.
- Basic Transformations such as Projection, Filtering, Total as well as Aggregations by Keys using Spark SQL
- Inner as well as outer joins using Spark SQL
- Basic DDL to create and manage tables using Spark SQL
- Basic DML or CRUD Operations using Spark SQL
- Create and Manage Partitioned Tables using Spark SQL
- Manipulating Data using Spark SQL Functions
- Advanced Analytical or Windowing Functions to perform aggregations and ranking using Spark SQL
Who Should Attend
- Any IT aspirant/professional willing to learn Data Engineering using Apache Spark
- Python Developers who want to learn Spark using Scala to add additional skill to be a Data Engineer
- Java or Scala Developers to learn Spark using Scala to add Data Engineering Skills to their profile
As part of this course, you will learn all the key skills to build Data Engineering Pipelines using Spark SQL and Spark Data Frame APIs using Scala as a Programming language. This course used to be a CCA 175 Spark and Hadoop Developer course for the preparation of the Certification Exam. As of 10/31/2021, the exam is sunset and we have renamed it to Spark SQL and Spark 3 using Scala as it covers industry-relevant topics beyond the scope of certification.
About Data Engineering
Data Engineering is nothing but processing data according to our downstream needs. As part of Data Engineering, we need to build different kinds of pipelines, such as batch pipelines and streaming pipelines. All roles related to data processing are consolidated under Data Engineering. Conventionally, they are known as ETL Development, Data Warehouse Development, etc. Apache Spark has evolved into a leading technology for Data Engineering at scale.
I have prepared this course for anyone who would like to transition into a Data Engineer role using Spark (Scala). I myself am a Data Engineering Solution Architect with proven experience in designing solutions using Apache Spark.
Let us go through the details about what you will be learning in this course. Keep in mind that the course is created with a lot of hands-on tasks which will give you enough practice using the right tools. Also, there are tons of tasks and exercises to evaluate yourself.
Setup of Single Node Big Data Cluster
Many of you would like to transition to Big Data from conventional technologies such as Mainframes, Oracle PL/SQL, etc., and you might not have access to Big Data clusters. It is very important for you to set up the environment in the right manner. Don’t worry if you do not have a cluster handy; we will guide you through support via Udemy Q&A.
- Setup Ubuntu-based AWS Cloud9 Instance with the right configuration
- Ensure Docker is set up
- Setup Jupyter Lab and other key components
- Setup and Validate Hadoop, Hive, YARN, and Spark
Are you feeling a bit overwhelmed about setting up the environment? Don’t worry! We will provide complimentary lab access for up to 2 months. Here are the details.
- Training using an interactive environment. You will get 2 weeks of lab access to begin with. If you like the environment and acknowledge it by providing a 5* rating and feedback, the lab access will be extended by an additional 6 weeks (2 months in total). Feel free to send an email to support@itversity.com to get complimentary lab access. Also, if your employer provides a multi-node environment, we will help you set up the material for practice as part of the live session. On top of Q&A support, we also provide the required support via live sessions.
A quick recap of Scala
This course requires a decent knowledge of Scala. To make sure you understand Spark from a Data Engineering perspective, we added a module to quickly warm up with Scala. If you are not familiar with Scala, then we suggest you go through relevant courses on Scala as Programming Language.
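As a rough gauge of the Scala level assumed, here is a minimal sketch using a case class and the standard collection APIs to filter, total, and aggregate by key. The `Order` type and the sample values are made up for illustration and are not taken from the course material.

```scala
// Hypothetical record type, used only for this illustration.
final case class Order(orderId: Int, customerId: String, amount: Double)

object ScalaWarmUp {
  // Made-up sample data.
  val orders: Seq[Order] = Seq(
    Order(1, "c1", 100.0),
    Order(2, "c1", 50.0),
    Order(3, "c2", 75.0)
  )

  // Filtering followed by projection: ids of orders worth at least 75.
  val largeOrderIds: Seq[Int] = orders.filter(_.amount >= 75.0).map(_.orderId)

  // Total aggregation over the whole collection.
  val total: Double = orders.map(_.amount).sum

  // Aggregation by key: revenue per customer.
  val revenueByCustomer: Map[String, Double] =
    orders.groupBy(_.customerId).map { case (id, os) => id -> os.map(_.amount).sum }

  def main(args: Array[String]): Unit = {
    println(largeOrderIds)
    println(total)
    println(revenueByCustomer)
  }
}
```

If code like this reads comfortably, the warm-up module should be enough; otherwise a dedicated Scala course first is likely the better route.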
Data Engineering using Spark SQL
Let us deep-dive into Spark SQL to understand how it can be used to build Data Engineering pipelines. Spark SQL gives us the ability to leverage the distributed computing capabilities of Spark coupled with an easy-to-use, developer-friendly SQL-style syntax.
- Getting Started with Spark SQL
- Basic Transformations using Spark SQL
- Managing Spark Metastore Tables – Basic DDL and DML
- Managing Spark Metastore Tables – DML and Partitioning
- Overview of Spark SQL Functions
- Windowing Functions using Spark SQL
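To give a flavour of the SQL-style syntax these modules refer to, here is a minimal sketch that registers a temporary view and runs an aggregation by key and a ranking window function. It assumes Spark 3 with the `spark-sql` module on the classpath and a local master; the `orders` view, its columns, and the sample rows are hypothetical and not taken from the course data sets.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SparkSqlSketch {
  // Registers a small, made-up orders data set as a temporary view.
  def registerOrders(spark: SparkSession): Unit = {
    import spark.implicits._
    Seq((1, "c1", 100.0), (2, "c1", 50.0), (3, "c2", 75.0))
      .toDF("order_id", "customer_id", "amount")
      .createOrReplaceTempView("orders")
  }

  // Aggregation by key using SQL-style syntax.
  def revenueByCustomer(spark: SparkSession): DataFrame =
    spark.sql(
      """SELECT customer_id, sum(amount) AS revenue
        |FROM orders
        |GROUP BY customer_id""".stripMargin)

  // Ranking within each customer using a windowing function.
  def rankedOrders(spark: SparkSession): DataFrame =
    spark.sql(
      """SELECT order_id, customer_id, amount,
        |       rank() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rnk
        |FROM orders""".stripMargin)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-sql-sketch")
      .master("local[*]") // assumption: local mode, not a YARN cluster
      .getOrCreate()
    registerOrders(spark)
    revenueByCustomer(spark).show()
    rankedOrders(spark).show()
    spark.stop()
  }
}
```

A temporary view is used here so the sketch does not depend on a configured metastore; the course modules on metastore tables cover persistent tables instead.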
Data Engineering using Spark Data Frame APIs
Spark Data Frame APIs are an alternative way of building Data Engineering applications at scale leveraging distributed computing capabilities of Spark. Data Engineers from application development backgrounds might prefer Data Frame APIs over Spark SQL to build Data Engineering applications.
- Data Processing Overview using Spark Data Frame APIs leveraging Scala as Programming Language
- Processing Column Data using Spark Data Frame APIs leveraging Scala as Programming Language
- Basic Transformations using Spark Data Frame APIs leveraging Scala as Programming Language – Filtering, Aggregations, and Sorting
- Joining Data Sets using Spark Data Frame APIs leveraging Scala as Programming Language
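The same kind of processing can be expressed with Data Frame APIs instead of SQL strings. The sketch below filters, aggregates by key, and performs an inner and a left outer join. It assumes Spark 3 with `spark-sql` on the classpath and a local master; the data sets, column names, and the `DataFrameSketch` object are hypothetical and not part of the course material.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, sum}

object DataFrameSketch {
  // Made-up orders: (order_id, customer_id, amount).
  def orders(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq((1, "c1", 100.0), (2, "c1", 50.0), (3, "c2", 75.0), (4, "c3", 20.0))
      .toDF("order_id", "customer_id", "amount")
  }

  // Made-up customers: (customer_id, name). Customer c3 is deliberately missing.
  def customers(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("c1", "Alice"), ("c2", "Bob")).toDF("customer_id", "name")
  }

  // Filter, aggregate by key, then inner join to pick up customer names.
  def revenueWithNames(spark: SparkSession): DataFrame =
    orders(spark)
      .filter(col("amount") >= 50.0)
      .groupBy("customer_id")
      .agg(sum("amount").alias("revenue"))
      .join(customers(spark), Seq("customer_id"), "inner")

  // Left outer join keeps orders whose customer has no match (name is null).
  def ordersWithNames(spark: SparkSession): DataFrame =
    orders(spark).join(customers(spark), Seq("customer_id"), "left_outer")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dataframe-sketch")
      .master("local[*]") // assumption: local mode, not a YARN cluster
      .getOrCreate()
    revenueWithNames(spark).orderBy("customer_id").show()
    ordersWithNames(spark).orderBy("order_id").show()
    spark.stop()
  }
}
```

Joining on `Seq("customer_id")` rather than on a column expression keeps a single `customer_id` column in the result, which tends to make follow-up projections simpler.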
All the demos are given on our state-of-the-art Big Data cluster. You can avail of one-month complimentary lab access by reaching out to support@itversity.com with a Udemy receipt.
Course Curriculum
Chapter 1: Introduction
Lecture 1: CCA 175 Spark and Hadoop Developer – Curriculum
Chapter 2: Setting up Environment using AWS Cloud9
Lecture 1: Getting Started with Cloud9
Lecture 2: Creating Cloud9 Environment
Lecture 3: Warming up with Cloud9 IDE
Lecture 4: Overview of EC2 related to Cloud9
Lecture 5: Opening ports for Cloud9 Instance
Lecture 6: Associating Elastic IPs to Cloud9 Instance
Lecture 7: Increase EBS Volume Size of Cloud9 Instance
Lecture 8: Setup Jupyter Lab on Cloud9
Lecture 9: [Commands] Setup Jupyter Lab on Cloud9
Chapter 3: Setting up Environment – Overview of GCP and Provision Ubuntu VM
Lecture 1: Signing up for GCP
Lecture 2: Overview of GCP Web Console
Lecture 3: Overview of GCP Pricing
Lecture 4: Provision Ubuntu VM from GCP
Lecture 5: Setup Docker
Lecture 6: Why we are setting up Python and Jupyter Lab for Scala related course?
Lecture 7: Validating Python
Lecture 8: Setup Jupyter Lab
Chapter 4: Setup Hadoop on Single Node Cluster
Lecture 1: Introduction to Single Node Hadoop Cluster
Lecture 2: Setup Prerequisites
Lecture 3: [Commands] – Setup Prerequisites
Lecture 4: Setup Password less login
Lecture 5: [Commands] – Setup Password less login
Lecture 6: Download and Install Hadoop
Lecture 7: [Commands] – Download and Install Hadoop
Lecture 8: Configure Hadoop HDFS
Lecture 9: [Commands] – Configure Hadoop HDFS
Lecture 10: Start and Validate HDFS
Lecture 11: [Commands] – Start and Validate HDFS
Lecture 12: Configure Hadoop YARN
Lecture 13: [Commands] – Configure Hadoop YARN
Lecture 14: Start and Validate YARN
Lecture 15: [Commands] – Start and Validate YARN
Lecture 16: Managing Single Node Hadoop
Lecture 17: [Commands] – Managing Single Node Hadoop
Chapter 5: Setup Hive and Spark on Single Node Cluster
Lecture 1: Setup Data Sets for Practice
Lecture 2: [Commands] – Setup Data Sets for Practice
Lecture 3: Download and Install Hive
Lecture 4: [Commands] – Download and Install Hive
Lecture 5: Setup Database for Hive Metastore
Lecture 6: [Commands] – Setup Database for Hive Metastore
Lecture 7: Configure and Setup Hive Metastore
Lecture 8: [Commands] – Configure and Setup Hive Metastore
Lecture 9: Launch and Validate Hive
Lecture 10: [Commands] – Launch and Validate Hive
Lecture 11: Scripts to Manage Single Node Cluster
Lecture 12: [Commands] – Scripts to Manage Single Node Cluster
Lecture 13: Download and Install Spark 2
Lecture 14: [Commands] – Download and Install Spark 2
Lecture 15: Configure Spark 2
Lecture 16: [Commands] – Configure Spark 2
Lecture 17: Validate Spark 2 using CLIs
Lecture 18: [Commands] – Validate Spark 2 using CLIs
Lecture 19: Validate Jupyter Lab Setup
Lecture 20: [Commands] – Validate Jupyter Lab Setup
Lecture 21: Integrate Spark 2 with Jupyter Lab
Lecture 22: [Commands] – Integrate Spark 2 with Jupyter Lab
Lecture 23: Download and Install Spark 3
Lecture 24: [Commands] – Download and Install Spark 3
Lecture 25: Configure Spark 3
Lecture 26: [Commands] – Configure Spark 3
Lecture 27: Validate Spark 3 using CLIs
Lecture 28: [Commands] – Validate Spark 3 using CLIs
Lecture 29: Integrate Spark 3 with Jupyter Lab
Lecture 30: [Commands] – Integrate Spark 3 with Jupyter Lab
Chapter 6: Scala Fundamentals
Lecture 1: Introduction and Setting up of Scala
Lecture 2: Setup Scala on Windows
Lecture 3: Basic Programming Constructs
Lecture 4: Functions
Lecture 5: Object Oriented Concepts – Classes
Lecture 6: Object Oriented Concepts – Objects
Lecture 7: Object Oriented Concepts – Case Classes
Lecture 8: Collections – Seq, Set and Map
Lecture 9: Basic Map Reduce Operations
Lecture 10: Setting up Data Sets for Basic I/O Operations
Lecture 11: Basic I/O Operations and using Scala Collections APIs
Lecture 12: Tuples
Lecture 13: Development Cycle – Create Program File
Lecture 14: Development Cycle – Compile source code to jar using SBT
Lecture 15: Development Cycle – Setup SBT on Windows
Lecture 16: Development Cycle – Compile changes and run jar with arguments
Lecture 17: Development Cycle – Setup IntelliJ with Scala
Lecture 18: Development Cycle – Develop Scala application using SBT in IntelliJ
Chapter 7: Overview of Hadoop HDFS Commands
Lecture 1: Getting help or usage of HDFS Commands
Lecture 2: Listing HDFS Files
Lecture 3: Managing HDFS Directories
Lecture 4: Copying files from local to HDFS
Lecture 5: Copying files from HDFS to local
Lecture 6: Getting File Metadata
Lecture 7: Previewing Data in HDFS File
Lecture 8: HDFS Block Size
Lecture 9: HDFS Replication Factor
Lecture 10: Getting HDFS Storage Usage
Instructors
- Durga Viswanatha Raju Gadiraju, CEO at ITVersity and CTO at Analytiqs, Inc
- Madhuri Gadiraju
- Sathvika Dandu
- Pratik Kumar
- Sai Varma
- Phani Bhushan Bozzam
Rating Distribution
- 1 star: 81 votes
- 2 stars: 124 votes
- 3 stars: 400 votes
- 4 stars: 1038 votes
- 5 stars: 1215 votes
Frequently Asked Questions
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!