Apache Spark for Big Data Analytics and Data Processing
Apache Spark for Big Data Analytics and Data Processing is available for $19.99. It has an average rating of 4 based on 3 reviews, and includes 90 lectures and 3 quizzes. It has 77 subscribers.
Summary
Title: Apache Spark for Big Data Analytics and Data Processing
Price: $19.99
Average Rating: 4
Number of Lectures: 90
Number of Quizzes: 3
Number of Published Lectures: 90
Number of Published Quizzes: 3
Number of Curriculum Items: 93
Number of Published Curriculum Objects: 93
Original Price: $199.99
Quality Status: approved
Status: Live
What You Will Learn
- Query your structured data using Spark SQL and work with the Dataset API
- Analyze and process graph structures using Spark’s GraphX module
- Train machine learning models with streaming data, and use them for making real-time predictions
- Implement high-velocity streaming and data processing use cases with the streaming API
- Dive into MLlib, Spark's machine learning library, and its highly scalable algorithms
- See how SparkR lets you create and transform RDDs in R
- See analytical use case implementations using MLlib, GraphX, and Spark Streaming
- Examine a number of real-world use cases with hands-on projects
- Build Hadoop and Apache Spark jobs that process data quickly and effectively
Who Should Attend
- This course is for software engineers, data scientists, big data developers, and big data analysts who are interested in big data processing and data analytics with Apache Spark.
Massive amounts of data are generated every day, everywhere. As a result, many organizations are turning to Big Data processing to handle large volumes of data in real time with maximum efficiency, and Apache Spark has rapidly gained popularity in the Big Data market. If you want to get the most out of this framework for your data processing needs, this Learning Path is for you.
This comprehensive 3-in-1 course focuses on data streaming and data analytics with Apache Spark. You will learn to load data from a variety of structured sources such as JSON, Hive, and Parquet using Spark SQL and schema RDDs. You will also build streaming applications and learn best practices for managing high-velocity streaming and external data sources. Next, you will explore Spark's machine learning libraries and GraphX, where you will perform graph processing and analysis. Finally, you will build projects that help you put what you have learned into practice and gain a firm grasp of the topic.
Contents and Overview
This training program includes 3 complete courses, carefully chosen to give you the most comprehensive training possible.
The first course, Spark Analytics for Real-Time Data Processing, starts by explaining Spark SQL. You will learn how to use the Spark SQL API and built-in functions with Apache Spark. You will also go through some interactive analysis and look at some integrations between Spark and Java/Scala/Python. Next, you will explore Spark Streaming, StreamingContext, and DStreams. You will learn how Spark Streaming works on top of the Spark core, thus inheriting its features. Finally, you will stream data and also learn best practices for managing high-velocity streaming and external data sources.
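To make the DStream idea concrete: in Spark Streaming, a DStream is a sequence of small batches, one per batch interval, and windowed operations combine the most recent batches. The sketch below imitates that model in plain Python so it runs without a cluster. It is not the Spark API and is not taken from the course; the function name `windowed_word_counts` and the sample batches are assumptions made for illustration.

```python
from collections import Counter, deque


def windowed_word_counts(batches, window_size):
    """Yield word counts over the last `window_size` micro-batches.

    Plain-Python stand-in for a windowed word count over a stream of
    batches; each batch is a list of text lines.
    """
    window = deque(maxlen=window_size)  # the oldest batch drops out automatically
    for batch in batches:
        # Per-batch step: count the words that arrived in this interval.
        window.append(Counter(word for line in batch for word in line.split()))
        # Window step: merge the per-batch counts currently inside the window.
        total = Counter()
        for counts in window:
            total.update(counts)
        yield dict(total)


# Hypothetical input: three micro-batches of text lines.
batches = [["spark streaming"], ["spark sql", "spark"], ["graphx"]]
results = list(windowed_word_counts(batches, window_size=2))
```

With a window of two batches, the third result no longer includes words that appeared only in the first batch, which is the behavior a sliding window is meant to provide.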
In the second course, Advanced Analytics and Real-Time Data Processing in Apache Spark, you will use the various components of the Spark framework to efficiently process, analyze, and visualize your data. You will then learn how to implement high-velocity streaming operations in order to perform efficient analytics on your real-time data. You will also analyze data using machine learning techniques and graphs. Next, you will learn to solve problems using machine learning techniques and find out about the tools available in the MLlib toolkit. Finally, you will see some useful machine learning algorithms with the help of Spark MLlib and will integrate Spark with R.
The third course, Big Data Analytics Projects with Apache Spark, consists of projects built on real-world examples. The first project finds the top-selling products for an e-commerce business by efficiently joining data sets in the MapReduce paradigm. Next, a Market Basket Analysis will help you identify items likely to be purchased together and find correlations between items in a set of transactions. Moving on, you will learn about probabilistic logistic regression by finding the author of a post. Next, you will build a content-based recommendation system for movies to predict whether an action will happen, which you will do by building a trained model. Finally, you will use a MapReduce-style Spark program to calculate mutual friends on a social network.
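As a rough illustration of the mutual-friends project, the map and reduce steps can be sketched in plain Python. This is one common way to phrase the problem and may differ from the approach taken in the course. The function name `mutual_friends` and the sample network are hypothetical, and the sketch assumes friendships are symmetric.

```python
from collections import defaultdict
from itertools import combinations


def mutual_friends(friend_lists):
    """Return {(a, b): set of mutual friends} for pairs sharing at least one friend."""
    # Map phase: a person is a common friend of every pair of their own
    # friends, so emit one (pair, person) record per such pair.
    emitted = []
    for person, friends in friend_lists.items():
        for a, b in combinations(sorted(friends), 2):
            emitted.append(((a, b), person))

    # Shuffle and reduce phase: group records by pair and collect the
    # common friends for each key.
    grouped = defaultdict(set)
    for pair, common in emitted:
        grouped[pair].add(common)
    return dict(grouped)


# Hypothetical, symmetric friendship data used only for illustration.
network = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"ann", "cat"},
}
result = mutual_friends(network)
```

Keys are alphabetically ordered pairs, so `result[("ann", "cat")]` holds the friends that ann and cat share. In a Spark job, the map phase would correspond to a flat-map over the friend lists and the reduce phase to a group-by-key, but that mapping is left as an assumption here.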
By the end of this course, you will have a sound understanding of the Spark framework, which will help you in analyzing and processing big data in real time.
Meet Your Expert(s):
We have the best work of the following esteemed author(s) to ensure that your learning journey is smooth:
- Nishant Garg has over 17 years of software architecture and development experience in various technologies, such as Java Enterprise Edition, SOA, Spring, Hadoop, Hive, Flume, Sqoop, Oozie, Spark, Shark, YARN, Impala, Kafka, Storm, Solr/Lucene, NoSQL databases (such as HBase, Cassandra, and MongoDB), and MPP databases (such as GreenPlum). He received his MS in software systems from the Birla Institute of Technology and Science, Pilani, India, and is currently working as a technical architect for the Big Data R&D Group with Impetus Infotech Pvt. Ltd. Previously, Nishant has enjoyed working with some of the most recognizable names in IT services and financial industries, employing full software life cycle methodologies such as Agile and SCRUM. Nishant has also undertaken many speaking engagements on big data technologies and is also the author of Apache Kafka and HBase Essentials (Packt Publishing).
- Tomasz Lelek is a Software Engineer and Co-Founder of InitLearn. He mostly programs in Java and Scala. He dedicates his time and effort to getting better at everything. He is currently diving into Big Data technologies. Tomasz is very passionate about everything associated with software development. He has been a speaker at a few conferences in Poland (Confitura and JDD) and at the Krakow Scala User Group. He has also conducted a live coding session at the GeeCON conference and spoken at an international event in Dhaka. He is very enthusiastic and loves to share his knowledge.
Course Curriculum
Chapter 1: Spark Analytics for Real-Time Data Processing
Lecture 1: The course overview
Lecture 2: Spark SQL Introduction
Lecture 3: Spark SQL – Core Abstractions
Lecture 4: Creating DataFrames from RDD
Lecture 5: Creating DataFrames from Files
Lecture 6: Creating DataFrames from Data Sources
Lecture 7: DataFrame API – Common Operations
Lecture 8: DataFrame API – Query Operations
Lecture 9: DataFrame API – Actions
Lecture 10: DataFrame API – Built-In Functions
Lecture 11: Spark Streaming – Introduction
Lecture 12: Spark Streaming – Quick Example
Lecture 13: Spark Streaming – Architecture
Lecture 14: Spark Streaming – Transformations
Lecture 15: Spark Streaming – Input Sources
Lecture 16: Spark Streaming – Performance Considerations
Lecture 17: Best Practices for High Velocity Streams
Lecture 18: Best Practices for External Data Sources
Lecture 19: Design Patterns
Chapter 2: Advanced Analytics and Real-Time Data Processing in Apache Spark
Lecture 1: The Course Overview
Lecture 2: Introducing Spark Streaming
Lecture 3: Streaming Context
Lecture 4: Processing Streaming Data
Lecture 5: Use Cases
Lecture 6: Spark Streaming Word Count Hands-On
Lecture 7: Spark Streaming – Understanding Master URL
Lecture 8: Integrating Spark Streaming with Apache Kafka
Lecture 9: mapWithState Operation
Lecture 10: Transform and Window Operation
Lecture 11: Join and Output Operations
Lecture 12: Output Operations – Saving Results to Kafka Sink
Lecture 13: Handling Time in High Velocity Streams
Lecture 14: Connecting External Systems That Work with an At-Least-Once Guarantee – Deduplication
Lecture 15: Building Streaming Application – Handling Events That Are Not in Order
Lecture 16: Filtering Bots from Stream of Page View Events
Lecture 17: Introducing Machine Learning with Spark
Lecture 18: Feature Extraction and Transformation
Lecture 19: Transforming Text into Vector of Numbers – ML Bag-of-Words Technique
Lecture 20: Logistic Regression
Lecture 21: Model Evaluation
Lecture 22: Clustering
Lecture 23: Gaussian Mixture Models
Lecture 24: Principal Component Analysis and Distributing the Singular Value Decomposition
Lecture 25: Collaborative Filtering – Building Recommendation Engine
Lecture 26: Introducing Spark GraphX – How to Represent a Graph?
Lecture 27: Limitations of Graph-Parallel System – Why Spark GraphX?
Lecture 28: Importing GraphX
Lecture 29: Create a Graph Using GraphX and Property Graph
Lecture 30: List of Operators
Lecture 31: Perform Graph Operations Using GraphX
Lecture 32: Triplet View
Lecture 33: Perform Subgraph Operations
Lecture 34: Neighbourhood Aggregations – Collecting Neighbours
Lecture 35: Counting Degree of Vertex
Lecture 36: Caching and Uncaching
Lecture 37: GraphBuilder
Lecture 38: Vertex and Edge RDD
Lecture 39: Structural Operators – Connected Components
Lecture 40: Introduction to SparkR and How It’s Used?
Lecture 41: Setting Up from RStudio
Lecture 42: Creating Spark DataFrames from Data Sources
Lecture 43: SparkDataFrames Operations – Grouping, Aggregation
Lecture 44: Run a Given Function on a Large Dataset Using dapply or dapplyCollect
Lecture 45: Running Large Dataset by Input Column(s) and Using gapply or gapplyCollect
Lecture 46: Run Local R Functions Distributed Using spark.lapply
Lecture 47: Running SQL Queries from SparkR
Lecture 48: PageRank Using Spark GraphX
Lecture 49: Sending Real-Time Notification to User on an E-Commerce site
Chapter 3: Big Data Analytics Projects with Apache Spark
Lecture 1: The Course Overview
Lecture 2: Explaining Ways of Joining Datasets
Lecture 3: Developing Spark Algorithm for Joining/Windowing Datasets
Lecture 4: Testing Logic in MapReduce Spark — Finding Top Sellers
Lecture 5: Drawing Conclusions from Top Sellers Data
Lecture 6: Market Basket Analysis Goals
Lecture 7: Where MBA Algorithms Are Useful?
Lecture 8: Implementing MBA MapReduce Algorithm in Spark
Lecture 9: Finding Association Rules Between Products
Lecture 10: Analyzing Post for an Author
Lecture 11: Extracting Information from Unstructured Text
Lecture 12: Extracting Information via Spark DataFrame
Lecture 13: Sentiment Analysis of Posts Using Logistic Regression
Lecture 14: Finding an Author of a Post
Lecture 15: Content-Based Recommendation Systems Explanation
Lecture 16: Finding Correlation Between Movies and Users
Lecture 17: Testing Logic in MapReduce Spark
Lecture 18: Finding Recommendation for Given User
Lecture 19: Finding Common Friends Problem — Graph Approach
Lecture 20: Creating a Graph Using GraphX and Property Graph
Lecture 21: Solution — Examining Available Methods
Lecture 22: Finding Closest Friend for Given User Using Page Rank
Instructors
- Packt Publishing: Tech Knowledge in Motion
Rating Distribution
- 1 star: 0 votes
- 2 stars: 0 votes
- 3 stars: 1 vote
- 4 stars: 1 vote
- 5 stars: 1 vote
Frequently Asked Questions
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!