Solving 10 Hadoop'able Problems
Solving 10 Hadoop'able Problems is available for $19.99. It has 40 lectures, 91 subscribers, and an average rating of 3.67 based on 3 reviews.
Summary
Title: Solving 10 Hadoop'able Problems
Price: $19.99
Average Rating: 3.67
Number of Lectures: 40
Number of Published Lectures: 40
Number of Curriculum Items: 40
Number of Published Curriculum Objects: 40
Original Price: $109.99
Quality Status: approved
Status: Live
What You Will Learn
- Explore the Hadoop big data Ecosystem in a nutshell
- Process payment data from an event stream using the streaming API: Payment Analyzer
- Detect BOT traffic using Spark Streaming, make log data queryable, and investigate customer data
- Supply Chain analysis – find top-seller items in a streaming way and enrich them with additional information
- Analyze Customer churn amounts quantitatively with DataFrame queries
- Perform IoT sensor data analysis with device response to system failures and data streams
- High-performance computation with neighborhood aggregations
- Page ranking using Spark GraphX
- Threat Analysis – Analyzing weblogs for suspicious activity and anomalies in network traffic
- Extract information from unstructured text via Spark DataFrames
- Perform sentiment analysis of posts using Logistic Regression, and find the author of a post
- Find what product users want to buy using Cloudera Sandbox Toolkit
- Use movie history to suggest content, and test and experiment with a Recommendation Engine
Who Should Attend
- Data Engineers, and Machine Learning and Data analysts
The Apache Hadoop ecosystem is a popular and powerful tool to solve big data problems. With so many competing tools to process data, many users want to know which particular problems are well suited to Hadoop, and how to implement those solutions.
To know which types of problems are Hadoop-able, it helps to start with a basic understanding of the core components of Hadoop. You will learn about the ecosystem designed to run on top of Hadoop as well as the software deployed alongside it. These tools are the building blocks of data processing applications. This course covers the core parts of the Hadoop ecosystem to give you a broad understanding and get you up and running fast. Next, it presents a number of common problems that Hadoop can solve as case-study projects. Each project has its own section and serves as a specific use case for solving big data problems.
By the end of this course, you will have been exposed to a wide variety of Hadoop software and examples of how it is used to solve common big data problems.
About the Author
Tomasz Lelek is a Software Engineer who programs mostly in Java and Scala. He is a fan of microservice architectures and functional programming, and he dedicates considerable time and effort to becoming better every day. Recently, he has been delving into big data technologies such as Apache Spark and Hadoop. He is passionate about nearly everything associated with software development.
Tomasz thinks that we should always consider different solutions and approaches to solving a problem. Recently, he spoke at several conferences in Poland, including Confitura and JDD (Java Developer’s Day), and at the Krakow Scala User Group. He also conducted a live coding session at the Geecon Conference.
Course Curriculum
Chapter 1: Core Components
Lecture 1: The Course Overview
Lecture 2: Hadoop Distributed File System (HDFS)
Lecture 3: Distributed Compute Capability YARN
Chapter 2: Downstream Ecosystem
Lecture 1: Apache Hive for ETL and SQL Like
Lecture 2: Message Queuing and Data Ingestion Kafka
Lecture 3: NoSQL Datastores – Hadoop HBase, Accumulo
Lecture 4: Machine Learning – Spark and Spark MLlib
Lecture 5: Stream Processing – Spark Streaming
Chapter 3: Financial, Trade, and Time Series Applications – Trade Surveillance
Lecture 1: Processing Payment Data from an Event Stream
Lecture 2: Advanced Aggregations Using Streaming API – PaymentAnalyzer
Lecture 3: Storing Time Series Data in HBase
Chapter 4: AdTech – Ad Targeting
Lecture 1: Detecting BOT Traffic Using Spark Streaming
Lecture 2: Make Web Log Data Queryable – Hive Sink
Lecture 3: Investigating Customers Data in Hive
Chapter 5: Business/Point of Sale – Transaction Analysis
Lecture 1: Trending Supply Chain – Finding Top Seller Item in a Streaming Way
Lecture 2: Enriching Top Sellers with Additional Information
Chapter 6: Customer Churn Analysis
Lecture 1: Analyzing Customer Churn (Quantitative) Using DataFrame Queries
Lecture 2: Analyzing Customer Churn (Amounts) Using DataFrame Queries
Chapter 7: Internet of Things
Lecture 1: Storing Low Granularity Structured Sensor Data in HBase
Lecture 2: Consuming Sensor Data Stored in HBase – Scan and Count
Lecture 3: Building Summaries on Data Streaming from Devices
Chapter 8: Scientific and High Performance Computing
Lecture 1: Introducing Spark GraphX – How to Represent a Graph?
Lecture 2: Perform Graph Operations Using GraphX
Lecture 3: Counting Degree of Vertices
Lecture 4: Neighborhood Aggregations – Collecting Neighbors
Lecture 5: Structural Operators – Connected Components
Lecture 6: Page Rank Using Spark GraphX
Chapter 9: Security Concerns Intrusion Detection – Threat Analysis
Lecture 1: Anomaly Detection
Lecture 2: Analyzing Web Logs for Suspicious Activity and Loading into Spark
Lecture 3: Implementing Clustering – Choosing Number of Clusters
Lecture 4: Detecting Anomalies in Network Traffic
Chapter 10: Text Analysis
Lecture 1: Analyzing Post for an Author
Lecture 2: Extracting Information from Unstructured Text
Lecture 3: Extracting Information Via Spark DataFrame
Lecture 4: Sentiment Analysis of Posts Using Logistic Regression
Lecture 5: Finding an Author of a Post
Chapter 11: Data Warehouse/Data Lake/ Data Sandbox
Lecture 1: Downloading and Setting Cloudera Sandbox
Lecture 2: Finding What Products Users Want to Buy Using Cloudera Sandbox Toolkit
Chapter 12: Personalization
Lecture 1: Using Movies History to Suggest Interesting Content
Lecture 2: Testing and Experimenting with Recommendation Engine
Instructors
- Packt Publishing: Tech Knowledge in Motion
Rating Distribution
- 1 star: 0 votes
- 2 stars: 1 vote
- 3 stars: 0 votes
- 4 stars: 1 vote
- 5 stars: 1 vote
Frequently Asked Questions
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!