Data Engineering Master Course: Spark/Hadoop/Kafka/MongoDB
Data Engineering Master Course: Spark/Hadoop/Kafka/MongoDB is available for $74.99. It has an average rating of 4.56 based on 1700 reviews, 159 lectures, and 14364 subscribers.
You will learn the Hadoop ecosystem (Sqoop, Flume, Hive), gain expertise in writing code with Apache Spark, learn Kafka fundamentals and how to use Kafka connectors, learn to write queries and clients in MongoDB, and get to know the core data engineering technologies. This course is ideal for anyone who wants to learn big data technologies or become a data engineer.
Summary
Title: Data Engineering Master Course: Spark/Hadoop/Kafka/MongoDB
Price: $74.99
Average Rating: 4.56
Number of Lectures: 159
Number of Published Lectures: 151
Number of Curriculum Items: 161
Number of Published Curriculum Objects: 153
Original Price: $19.99
Quality Status: approved
Status: Live
What You Will Learn
- Hadoop Ecosystem, Sqoop, Flume, Hive
- Expertise on writing code with Apache Spark
- Learn Kafka Fundamentals and using Kafka Connectors
- Learn writing queries and client in MongoDB
- Learn Data Engineering technologies
Who Should Attend
- Anyone who wants to learn big data technologies
- Anyone who wants to become a data engineer
In this course, you will start by learning what the Hadoop Distributed File System (HDFS) is and the most common Hadoop commands required to work with it.
Then you will be introduced to Sqoop Import:
- Understand the lifecycle of a Sqoop command.
- Use the sqoop import command to migrate data from MySQL to HDFS.
- Use the sqoop import command to migrate data from MySQL to Hive.
- Use various file formats, compressions, field delimiters, where clauses, and queries while importing the data.
- Understand split-by and boundary queries.
- Use incremental mode to migrate data from MySQL to HDFS.
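As a rough illustration of the split-by and boundary-query topic: by default a Sqoop import finds the minimum and maximum of the split-by column and divides that range among the mappers. The plain-Python sketch below models that idea only. The function name `split_ranges` and the exact rounding are assumptions made for illustration; this is not Sqoop's actual splitter.

```python
def split_ranges(lo, hi, num_mappers):
    """Divide the inclusive key range [lo, hi] into contiguous (start, end) ranges.

    lo and hi stand in for the result of a boundary query such as
    SELECT MIN(id), MAX(id) FROM some_table (table and column are hypothetical).
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if hi < lo:
        raise ValueError("hi must not be smaller than lo")
    total = hi - lo + 1
    # Never create more ranges than there are distinct key values.
    num = min(num_mappers, total)
    base, extra = divmod(total, num)
    ranges = []
    start = lo
    for i in range(num):
        # The first `extra` ranges take one additional key each.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# Each range would become one mapper's WHERE condition on the split-by column.
print(split_ranges(1, 100, 4))
```

A skewed split-by column makes some ranges hold far more rows than others, which is one reason a custom boundary query or a different split-by column can be worth choosing.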
Further, you will learn Sqoop Export to migrate data:
- What Sqoop export is.
- Use sqoop export to migrate data from HDFS to MySQL.
- Use sqoop export to migrate data from Hive to MySQL.
Further, you will learn about Apache Flume:
- Understand the Flume architecture.
- Use Flume to ingest data from Twitter and save it to HDFS.
- Use Flume to ingest data from netcat and save it to HDFS.
- Use Flume to ingest data from an exec source and show it on the console.
- Describe Flume interceptors and see examples of using them.
- Flume multiple agents.
- Flume consolidation.
In the next section, we will learn about Apache Hive:
- Hive Intro
- External & Managed Tables
- Working with Different File Formats: Parquet, Avro
- Compressions
- Hive Analysis
- Hive String Functions
- Hive Date Functions
- Partitioning
- Bucketing
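To give a feel for the bucketing topic: a bucketed Hive table assigns each row to a bucket by hashing the bucketing column and taking the result modulo the number of buckets. The sketch below is a minimal plain-Python model of that rule. It assumes an integer key whose hash is the value itself, and the names `bucket_for` and `bucketize` are hypothetical helpers, not Hive APIs.

```python
def bucket_for(key, num_buckets):
    """Return the bucket index for an integer key (assumed hash: the value itself)."""
    return key % num_buckets


def bucketize(rows, key_field, num_buckets):
    """Group rows (dicts) into buckets keyed by bucket index."""
    buckets = {b: [] for b in range(num_buckets)}
    for row in rows:
        buckets[bucket_for(row[key_field], num_buckets)].append(row)
    return buckets


# Hypothetical rows; in Hive each bucket would be stored as a separate file.
rows = [{"id": i, "name": f"user{i}"} for i in range(1, 7)]
for index, members in bucketize(rows, "id", 3).items():
    print(index, [r["id"] for r in members])
```

Because equal keys always land in the same bucket, two tables bucketed the same way on the join column can be joined bucket by bucket.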
You will learn about Apache Spark:
- Spark Intro
- Cluster Overview
- RDD
- DAG/Stages/Tasks
- Actions & Transformations
- Transformation & Action Examples
- Spark DataFrames
- Spark DataFrames: Working with Different File Formats & Compression
- DataFrame APIs
- Spark SQL
- DataFrame Examples
- Spark with Cassandra Integration
- Running Spark on IntelliJ IDE
- Running Spark on EMR
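The transformation examples listed above (flatMap, map, reduceByKey) are often introduced through a word count. The sketch below mimics that pipeline in plain Python so it runs without a Spark installation. `flat_map` and `reduce_by_key` are hypothetical stand-ins written for this illustration, not Spark functions, and unlike real Spark transformations they run eagerly rather than lazily.

```python
def flat_map(func, items):
    """Apply func to every item and flatten the resulting lists into one list."""
    return [out for item in items for out in func(item)]


def reduce_by_key(func, pairs):
    """Combine the values of each key with func, one pair at a time."""
    acc = {}
    for key, value in pairs:
        acc[key] = func(acc[key], value) if key in acc else value
    return acc


# Made-up sample input standing in for the lines of a text file.
lines = ["spark makes big data simple", "big data needs spark"]

words = flat_map(str.split, lines)                   # one word per element
pairs = [(word, 1) for word in words]                # map each word to (word, 1)
counts = reduce_by_key(lambda a, b: a + b, pairs)    # sum the ones per word

print(counts)
```

In Spark, the final step that returns results to the driver (for example collect) is the action that triggers execution of the preceding transformations.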
You will learn about Apache Kafka:
- Kafka Architecture
- Partitions and Offsets
- Kafka Producers and Consumers
- Kafka SerDes
- Kafka Messages
- Kafka Connector
- Ingesting Data using Kafka Connector
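The partitions and offsets topic can be pictured with a small in-memory model: a keyed record is routed to a partition by hashing its key, and each partition is an append-only log in which a record's position is its offset. The sketch below is a toy under stated assumptions. `Topic` and `choose_partition` are hypothetical names, and `zlib.crc32` is only a stand-in for the hash used by Kafka's default partitioner.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition index; crc32 is a stand-in hash for illustration."""
    return zlib.crc32(key) % num_partitions


class Topic:
    """Toy topic: a list of append-only partition logs held in memory."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def send(self, key: bytes, value: str):
        """Append value to the key's partition and return (partition, offset)."""
        partition = choose_partition(key, len(self.partitions))
        log = self.partitions[partition]
        log.append(value)
        return partition, len(log) - 1


topic = Topic(num_partitions=3)
first = topic.send(b"user-1", "login")
second = topic.send(b"user-1", "logout")
print(first, second)  # same partition both times, offsets 0 and then 1
```

Records that share a key stay in order because they share a partition; no ordering is implied between different partitions.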
You will learn about MongoDB:
- MongoDB Use Cases
- CRUD Operations
- MongoDB Operators
- Working with Arrays
- MongoDB with Spark
Data Engineering Interview Preparation:
- Sqoop Interview Questions
- Hive Interview Questions
- Spark Interview Questions
- Data Engineering Common Questions
- Data Engineering Real Project Questions
Course Curriculum
Chapter 1: Big Data Introduction
Lecture 1: Meet your Instructor
Lecture 2: Course Intro
Lecture 3: Big Data Intro
Lecture 4: Understanding Big Data Ecosystem
Chapter 2: Google Cloud Cluster Setup
Lecture 1: Google Cloud Account Setup
Lecture 2: Dataproc Cluster Setup – Part1
Lecture 3: DataProc Cluster Setup – Part2
Lecture 4: Upload Files on Google Cloud
Lecture 5: Sqoop Setup
Lecture 6: Environment Update
Chapter 3: Hadoop & Yarn
Lecture 1: HDFS and Hadoop Commands
Lecture 2: Yarn Cluster Overview
Chapter 4: Sqoop Import
Lecture 1: Sqoop Introduction
Lecture 2: Managing Target Directories
Lecture 3: Working with Different Compressions
Lecture 4: Conditional Imports
Lecture 5: Split-by and Boundary Queries
Lecture 6: Field delimiters
Lecture 7: Incremental Appends
Lecture 8: Sqoop-Hive Cluster Fix
Lecture 9: Access Hive on Google Cloud
Lecture 10: Sqoop Hive Import
Lecture 11: Sqoop List Tables/Database
Lecture 12: Sqoop Import Practice1
Lecture 13: Sqoop Import Practice2
Chapter 5: Sqoop Export
Lecture 1: Export from Hdfs to Mysql
Lecture 2: Export from Hive to Mysql
Lecture 3: Export Avro Compressed to Mysql
Lecture 4: Bonus Lecture: Sqoop with Airflow
Chapter 6: Apache Flume
Lecture 1: Flume Setup
Lecture 2: Flume Introduction & Architecture
Lecture 3: Exec Source and Logger Sink
Lecture 4: Moving data from Twitter to HDFS
Lecture 5: Moving data from NetCat to HDFS
Lecture 6: Flume Interceptors
Lecture 7: Flume Interceptor Example
Lecture 8: Flume Multi-Agent Flow
Lecture 9: Flume Consolidation
Chapter 7: Apache Hive
Lecture 1: Access Hive Shell on Google Cloud
Lecture 2: Hive Introduction
Lecture 3: Hive Database
Lecture 4: Hive Managed Tables
Lecture 5: Hive External Tables
Lecture 6: Hive Inserts
Lecture 7: Hive Analytics
Lecture 8: Working with Parquet
Lecture 9: Compressing Parquet
Lecture 10: Working with Fixed File Format
Lecture 11: Alter Command
Lecture 12: Hive String Functions
Lecture 13: Hive Date Functions
Lecture 14: Hive Partitioning
Lecture 15: Hive Bucketing
Chapter 8: Spark with Yarn & HDFS
Lecture 1: What is Apache Spark
Lecture 2: Understanding Cluster Manager (Yarn)
Lecture 3: Understanding Distributed Storage (HDFS)
Lecture 4: Running Spark on Yarn/HDFS
Lecture 5: Understanding Deploy Modes
Chapter 9: GCS Cluster
Lecture 1: Spark on GCS Cluster
Lecture 2: Upload Data files for Spark
Chapter 10: Spark Internals
Lecture 1: Drivers & Executors
Lecture 2: RDDs & Dataframes
Lecture 3: Transformation & Actions
Lecture 4: Wide & Narrow Transformations
Lecture 5: Understanding Execution Plan
Lecture 6: Different Plans by Driver
Chapter 11: Spark RDD : Transformation & Actions
Lecture 1: Map/FlatMap Transformation
Lecture 2: Filter/Intersection
Lecture 3: Union/Distinct Transformation
Lecture 4: GroupByKey/ Group people based on Birthday months
Lecture 5: ReduceByKey / Total Number of students in each Subject
Lecture 6: SortByKey / Sort students based on their rollno
Lecture 7: MapPartition / MapPartitionWithIndex
Lecture 8: Change number of Partitions
Lecture 9: Join / join email address based on customer name
Lecture 10: Spark Actions
Chapter 12: Spark RDD Practice
Lecture 1: Upload Files
Lecture 2: Scala Tuples
Lecture 3: Filter Error Logs
Lecture 4: Frequency of word in Text File
Lecture 5: Population of each city
Lecture 6: Orders placed by Customers
Lecture 7: average rating of movie
Chapter 13: Spark Dataframes & Spark SQL
Lecture 1: Dataframe Intro
Lecture 2: Dataframe from JSON Files
Instructors
- Navdeep Kaur, Premium Instructor
Rating Distribution
- 1 stars: 22 votes
- 2 stars: 39 votes
- 3 stars: 177 votes
- 4 stars: 633 votes
- 5 stars: 829 votes
Frequently Asked Questions
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!