The Ultimate Hands-On Hadoop: Tame your Big Data!
The Ultimate Hands-On Hadoop: Tame your Big Data!, available at $129.99, has an average rating of 4.54, with 109 lectures, based on 30267 reviews, and has 182671 subscribers.
You will learn to design distributed systems that manage "big data" using Hadoop and related data engineering technologies; use HDFS and MapReduce to store and analyze data at scale; use Pig and Spark to write scripts that process data on a Hadoop cluster in more complex ways; analyze relational data using Hive and MySQL; analyze non-relational data using HBase, Cassandra, and MongoDB; query data interactively with Drill, Phoenix, and Presto; choose an appropriate data storage technology for your application; understand how Hadoop clusters are managed by YARN, Tez, Mesos, Zookeeper, Zeppelin, Hue, and Oozie; publish data to your Hadoop cluster using Kafka, Sqoop, and Flume; and consume streaming data using Spark Streaming, Flink, and Storm.
This course is ideal for software engineers and programmers who want to understand the larger Hadoop ecosystem and use it to store, analyze, and vend "big data" at scale; project, program, or product managers who want to understand the lingo and high-level architecture of Hadoop; data analysts and database administrators who are curious about Hadoop and how it relates to their work; and system architects who need to understand the components available in the Hadoop ecosystem and how they fit together.
Summary
Title: The Ultimate Hands-On Hadoop: Tame your Big Data!
Price: $129.99
Average Rating: 4.54
Number of Lectures: 109
Number of Published Lectures: 104
Number of Curriculum Items: 109
Number of Published Curriculum Objects: 104
Original Price: $24.99
Quality Status: approved
Status: Live
What You Will Learn
- Design distributed systems that manage "big data" using Hadoop and related data engineering technologies.
- Use HDFS and MapReduce for storing and analyzing data at scale.
- Use Pig and Spark to create scripts to process data on a Hadoop cluster in more complex ways.
- Analyze relational data using Hive and MySQL.
- Analyze non-relational data using HBase, Cassandra, and MongoDB.
- Query data interactively with Drill, Phoenix, and Presto.
- Choose an appropriate data storage technology for your application.
- Understand how Hadoop clusters are managed by YARN, Tez, Mesos, Zookeeper, Zeppelin, Hue, and Oozie.
- Publish data to your Hadoop cluster using Kafka, Sqoop, and Flume.
- Consume streaming data using Spark Streaming, Flink, and Storm.
Who Should Attend
- Software engineers and programmers who want to understand the larger Hadoop ecosystem, and use it to store, analyze, and vend "big data" at scale.
- Project, program, or product managers who want to understand the lingo and high-level architecture of Hadoop.
- Data analysts and database administrators who are curious about Hadoop and how it relates to their work.
- System architects who need to understand the components available in the Hadoop ecosystem, and how they fit together.
The world of Hadoop and “Big Data” can be intimidating – hundreds of different technologies with cryptic names form the Hadoop ecosystem. With this Hadoop tutorial, you’ll not only understand what those systems are and how they fit together – but you’ll go hands-on and learn how to use them to solve real business problems!
Learn and master the most popular data engineering technologies in this comprehensive course, taught by a former engineer and senior manager from Amazon and IMDb. We’ll go way beyond Hadoop itself, and dive into all sorts of distributed systems you may need to integrate with.
- Install and work with a real Hadoop installation right on your desktop with Hortonworks (now part of Cloudera) and the Ambari UI
- Manage big data on a cluster with HDFS and MapReduce
- Write programs to analyze data on Hadoop with Pig and Spark
- Store and query your data with Sqoop, Hive, MySQL, HBase, Cassandra, MongoDB, Drill, Phoenix, and Presto
- Design real-world systems using the Hadoop ecosystem
- Learn how your cluster is managed with YARN, Mesos, Zookeeper, Oozie, Zeppelin, and Hue
- Handle streaming data in real time with Kafka, Flume, Spark Streaming, Flink, and Storm
Spark and Hadoop developers are hugely valued at companies with large amounts of data; these are very marketable skills to learn.
Almost every large company you might want to work at uses Hadoop in some way, including Amazon, Ebay, Facebook, Google, LinkedIn, IBM, Spotify, Twitter, and Yahoo! And it’s not just technology companies that need Hadoop; even the New York Times uses Hadoop for processing images.
This course is comprehensive, covering over 25 different technologies in over 14 hours of video lectures. It’s filled with hands-on activities and exercises, so you get some real experience in using Hadoop – it’s not just theory.
You’ll find a range of activities in this course for people at every level. If you’re a project manager who just wants to learn the buzzwords, there are web UIs for many of the activities in the course that require no programming knowledge. If you’re comfortable with command lines, we’ll show you how to work with them too. And if you’re a programmer, I’ll challenge you with writing real scripts on a Hadoop system using Scala, Pig Latin, and Python.
You’ll walk away from this course with a real, deep understanding of Hadoop and its associated distributed systems, and you can apply Hadoop to real-world problems. Plus a valuable completion certificate is waiting for you at the end!
Please note that the focus of this course is on application development, not Hadoop administration, although you will pick up some administration skills along the way.
Knowing how to wrangle “big data” is an incredibly valuable skill for today’s top tech employers. Don’t be left behind – enroll now!
- “The Ultimate Hands-On Hadoop… was a crucial discovery for me. I supplemented your course with a bunch of literature and conferences until I managed to land an interview. I can proudly say that I landed a job as a Big Data Engineer around a year after I started your course. Thanks so much for all the great content you have generated and the crystal clear explanations.” – Aldo Serrano
- “I honestly wouldn’t be where I am now without this course. Frank makes the complex simple by helping you through the process every step of the way. Highly recommended and worth your time especially the Spark environment. This course helped me achieve a far greater understanding of the environment and its capabilities.” – Tyler Buck
Course Curriculum
Chapter 1: Learn all the buzzwords! And install the Hortonworks Data Platform Sandbox.
Lecture 1: Udemy 101: Getting the Most From This Course
Lecture 2: Tips for Using This Course
Lecture 3: If you have trouble downloading Hortonworks Data Platform…
Lecture 4: Warning for Apple M1 users
Lecture 5: Installing Hadoop [Step by Step]
Lecture 6: The Hortonworks and Cloudera Merger, and how it affects this course.
Lecture 7: Hadoop Overview and History
Lecture 8: Overview of the Hadoop Ecosystem
Lecture 9: Important note
Chapter 2: Using Hadoop's Core: HDFS and MapReduce
Lecture 1: HDFS: What it is, and how it works
Lecture 2: Alternate MovieLens download location
Lecture 3: Installing the MovieLens Dataset
Lecture 4: [Activity] Install the MovieLens dataset into HDFS using the command line
Lecture 5: MapReduce: What it is, and how it works
Lecture 6: How MapReduce distributes processing
Lecture 7: MapReduce example: Break down movie ratings by rating score
Lecture 8: [Activity] Install Python, MRJob, and nano
Lecture 9: [Activity] Code up the ratings histogram MapReduce job and run it
Lecture 10: [Exercise] Rank movies by their popularity
Lecture 11: Note: Sorting will only work by partition.
Lecture 12: [Activity] Check your results against mine!
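To give a flavor of the ratings-histogram job named in this chapter, here is a minimal sketch in plain Python. It is not the course's MRJob code: the function names `mapper`, `shuffle`, `reducer`, and `ratings_histogram` are hypothetical, the sample rows are illustrative, and the tab-separated layout (user ID, movie ID, rating, timestamp) is an assumption about the input file. On a real cluster the framework performs the shuffle step for you.

```python
from collections import defaultdict

# Illustrative rows only; assumed layout: user_id, movie_id, rating, timestamp.
SAMPLE_LINES = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "22\t377\t1\t878887116",
    "244\t51\t2\t880606923",
    "166\t346\t1\t886397596",
    "298\t474\t4\t884182806",
]


def mapper(line):
    """Map step: emit (rating, 1) for one input line."""
    user_id, movie_id, rating, timestamp = line.split("\t")
    yield rating, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle-and-sort."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(rating, counts):
    """Reduce step: sum the 1s emitted for a single rating value."""
    yield rating, sum(counts)


def ratings_histogram(lines):
    """Run map, shuffle, and reduce in-process and return {rating: count}."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    result = {}
    for key in sorted(grouped):
        for rating, total in reducer(key, grouped[key]):
            result[rating] = total
    return result


if __name__ == "__main__":
    print(ratings_histogram(SAMPLE_LINES))
```

Each mapper call sees only one line and each reducer call sees only one key, which is what lets a cluster spread the work across machines.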
Chapter 3: Programming Hadoop with Pig
Lecture 1: Introducing Ambari
Lecture 2: Introducing Pig
Lecture 3: Example: Find the oldest movie with a 5-star rating using Pig
Lecture 4: [Activity] Find old 5-star movies with Pig
Lecture 5: More Pig Latin
Lecture 6: [Exercise] Find the most-rated one-star movie
Lecture 7: Pig Challenge: Compare Your Results to Mine!
Chapter 4: Programming Hadoop with Spark
Lecture 1: Why Spark?
Lecture 2: The Resilient Distributed Dataset (RDD)
Lecture 3: [Activity] Find the movie with the lowest average rating – with RDD's
Lecture 4: Datasets and Spark 2.0
Lecture 5: [Activity] Find the movie with the lowest average rating – with DataFrames
Lecture 6: [Activity] Movie recommendations with MLLib
Lecture 7: [Exercise] Filter the lowest-rated movies by number of ratings
Lecture 8: [Activity] Check your results against mine!
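The lowest-average-rating activities and the follow-up filtering exercise in this chapter can be pictured with a small plain-Python sketch. It does not use Spark and is not the course's solution: `lowest_rated`, `min_ratings`, and the sample pairs are hypothetical names and data chosen for illustration. The loop plays the role that a per-key aggregation would play on an RDD or DataFrame.

```python
from collections import defaultdict

# Illustrative (movie_id, rating) pairs; not real course data.
RATINGS = [
    (1, 5.0), (1, 4.0), (1, 3.0),
    (2, 1.0), (2, 2.0), (2, 3.0),
    (3, 1.0),
]


def lowest_rated(ratings, min_ratings=1):
    """Return (movie_id, average) pairs sorted from lowest average upward.

    Movies with fewer than `min_ratings` ratings are dropped, so a single
    bad review cannot put a movie at the top of the list.
    """
    # Accumulate [sum, count] per movie, as a per-key reduce would.
    totals = defaultdict(lambda: [0.0, 0])
    for movie_id, rating in ratings:
        totals[movie_id][0] += rating
        totals[movie_id][1] += 1

    averages = [
        (movie_id, total / count)
        for movie_id, (total, count) in totals.items()
        if count >= min_ratings
    ]
    return sorted(averages, key=lambda pair: pair[1])


if __name__ == "__main__":
    print(lowest_rated(RATINGS))                 # every movie
    print(lowest_rated(RATINGS, min_ratings=3))  # only movies with 3+ ratings
```

Without the threshold, movie 3 ranks lowest on the strength of one rating; with `min_ratings=3` it is excluded and movie 2 takes its place.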
Chapter 5: Using relational data stores with Hadoop
Lecture 1: What is Hive?
Lecture 2: [Activity] Use Hive to find the most popular movie
Lecture 3: How Hive works
Lecture 4: [Exercise] Use Hive to find the movie with the highest average rating
Lecture 5: Compare your solution to mine.
Lecture 6: Integrating MySQL with Hadoop
Lecture 7: Cheat sheet for the following lecture
Lecture 8: [Activity] Install MySQL and import our movie data
Lecture 9: [Activity] Use Sqoop to import data from MySQL to HDFS/Hive
Lecture 10: [Activity] Use Sqoop to export data from Hadoop to MySQL
Chapter 6: Using non-relational data stores with Hadoop
Lecture 1: Why NoSQL?
Lecture 2: What is HBase
Lecture 3: [Activity] Import movie ratings into HBase
Lecture 4: [Activity] Use HBase with Pig to import data at scale.
Lecture 5: Cassandra overview
Lecture 6: If you have trouble installing Cassandra…
Lecture 7: [Activity] Installing Cassandra
Lecture 8: [Activity] Write Spark output into Cassandra
Lecture 9: MongoDB overview
Lecture 10: [Activity] Install MongoDB, and integrate Spark with MongoDB
Lecture 11: [Activity] Using the MongoDB shell
Lecture 12: Choosing a database technology
Lecture 13: [Exercise] Choose a database for a given problem
Chapter 7: Querying your Data Interactively
Lecture 1: Overview of Drill
Lecture 2: [Activity] Setting up Drill
Lecture 3: [Activity] Querying across multiple databases with Drill
Lecture 4: Overview of Phoenix
Lecture 5: [Activity] Install Phoenix and query HBase with it
Lecture 6: [Activity] Integrate Phoenix with Pig
Lecture 7: Overview of Presto
Lecture 8: [Activity] Install Presto, and query Hive with it.
Lecture 9: [Activity] Query both Cassandra and Hive using Presto.
Chapter 8: Managing your Cluster
Lecture 1: YARN explained
Lecture 2: Tez explained
Lecture 3: [Activity] Use Hive on Tez and measure the performance benefit
Lecture 4: Mesos explained
Lecture 5: ZooKeeper explained
Lecture 6: [Activity] Simulating a failing master with ZooKeeper
Lecture 7: Oozie explained
Lecture 8: [Activity] Set up a simple Oozie workflow
Lecture 9: Zeppelin overview
Lecture 10: [Activity] Use Zeppelin to analyze movie ratings, part 1
Lecture 11: [Activity] Use Zeppelin to analyze movie ratings, part 2
Lecture 12: Hue overview
Lecture 13: Other technologies worth mentioning
Chapter 9: Feeding Data to your Cluster
Lecture 1: Kafka explained
Lecture 2: [Activity] Setting up Kafka, and publishing some data.
Lecture 3: [Activity] Publishing web logs with Kafka
Lecture 4: Flume explained
Lecture 5: [Activity] Set up Flume and publish logs with it.
Lecture 6: [Activity] Set up Flume to monitor a directory and store its data in HDFS
Chapter 10: Analyzing Streams of Data
Lecture 1: Spark Streaming: Introduction
Lecture 2: [Activity] Analyze web logs published with Flume using Spark Streaming
Lecture 3: [Exercise] Monitor Flume-published logs for errors in real time
Instructors
- Sundog Education by Frank Kane: Join over 800K students learning ML, AI, AWS, and Data Eng.
- Frank Kane: Ex-Amazon Sr. Engineer and Sr. Manager, CEO Sundog Education
- Sundog Education Team
Rating Distribution
- 1 star: 301 votes
- 2 stars: 349 votes
- 3 stars: 2302 votes
- 4 stars: 10131 votes
- 5 stars: 17187 votes
Frequently Asked Questions
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!