Learning Path: Spark: Data Science with Apache Spark
Learning Path: Spark: Data Science with Apache Spark is available at $19.99. It has 53 lectures and 2 quizzes, an average rating of 3.17 based on 9 reviews, and 98 subscribers.
You will learn to understand the Spark API and its architecture, tell the RDD and DataFrame APIs apart, write efficient jobs with Apache Spark, test Spark code correctly, work with Spark's programming model and its ecosystem of data science packages, build a simple pipeline with Spark's machine learning algorithms, apply data mining techniques to the available data sets, and build a recommendation engine. This Learning Path is for anyone who is interested in big data processing and wants to work with Spark on large and complex data sets.
Summary
Title: Learning Path: Spark: Data Science with Apache Spark
Price: $19.99
Average Rating: 3.17
Number of Lectures: 53
Number of Quizzes: 2
Number of Published Lectures: 53
Number of Published Quizzes: 2
Number of Curriculum Items: 55
Number of Published Curriculum Objects: 55
Original Price: $199.99
Quality Status: approved
Status: Live
What You Will Learn
- Understand the Spark API and its architecture
- Know the difference between the RDD and DataFrame APIs
- Discover how to write efficient jobs using Apache Spark
- Learn to test Spark code correctly
- Understand the Spark programming model and its ecosystem of data science packages
- Learn Spark's machine learning algorithms to build a simple pipeline
- Apply data mining techniques on the available data sets
- Build a recommendation engine
Who Should Attend
- This Learning Path is for anyone who is interested in big data processing and wants to work with Spark on large and complex data sets.
Every year a large amount of data is generated that needs to be stored and analyzed. Apache Spark allows you to process such big data. The real power and value proposition of Apache Spark lies in its speed and in its platform for executing data science tasks. Spark’s distinguishing feature is that it combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualization, allowing data scientists to tackle the complexities that come with raw unstructured data sets. Spark also aims to ease the transition from working on a single machine to working on a cluster, which makes data science tasks a lot more agile. So, if you’re interested in learning big data processing and executing data science tasks efficiently, this Learning Path is for you.
Packt’s Video Learning Path is a series of individual video products put together in a logical and stepwise manner such that each video builds on the skills learned in the video before it.
The highlights of this Learning Path are:
- Explore the Apache Spark architecture and delve into its API and key features
- Implement efficient big data processing
- Write code that is maintainable and easy to test
- Explore various facets of data science with Spark
- Get up and running with Apache Spark and clean, analyze, and visualize data with ease
Let’s take a quick look at your learning journey. This Learning Path starts off by explaining the basics of the Spark API and its architecture in detail. You will then learn about data mining and data cleaning. You will also learn to analyze data by writing actual jobs. Next, you will learn the steps needed to build machine learning applications. You will also explore machine learning algorithms and different machine learning techniques. Further, you will learn to collect, clean, and visualize data coming from Twitter with Spark Streaming. Finally, you will understand how to perform analysis, including graph processing.
By the end of this Learning Path, you will be able to do all your data science tasks in a visual way that is comprehensive and appealing to business and other stakeholders.
Meet Your Experts:
We have the best works of the following esteemed authors to ensure that your learning journey is smooth:
Tomasz Lelek is a software engineer who programs mostly in Java and Scala. He is a fan of microservices architecture and functional programming. He has recently delved into big data technologies such as Apache Spark and Hadoop.
Eric Charles has 10 years of experience in the field of Data Science and is the founder of Datalayer, a social network for Data Scientists. He is passionate about using software and mathematics to help companies get insights from data. His typical day includes building efficient processing with advanced machine learning algorithms, easy SQL, streaming, and graph analytics. He also focuses a lot on visualization and result sharing. He is passionate about open-source technologies and is an active Apache Member.
Course Curriculum
Chapter 1: Big Data Processing using Apache Spark
Lecture 1: The Course Overview
Lecture 2: Overview of the Apache Spark and Its Architecture
Lecture 3: Start a Project Using Apache Spark, Look at build.sbt
Lecture 4: Creating the Spark Context
Lecture 5: Looking at API of Spark
Lecture 6: Looking at the Input Data Structure
Lecture 7: Using RDD API in the Data Mining Process
Lecture 8: Loading Input Data
Lecture 9: Cleaning Input Data
Lecture 10: Logic for Counting Words
Lecture 11: Using RDD API Transformations and Actions to Solve a Problem
Lecture 12: Testing Spark Job
Lecture 13: Summary of Data Processing
Chapter 2: Data Science with Spark
Lecture 1: The Course Overview
Lecture 2: Spark: Origins & Ecosystem for Big Data Scientists, the Scala, Python & R flavor
Lecture 3: Install Spark on Your Laptop with Docker, or Scale Fast in the Cloud
Lecture 4: Apache Zeppelin, a Web-Based Notebook for Spark with matplotlib and ggplot2
Lecture 5: Manipulating Data with the Core RDD API
Lecture 6: Using Dataframe, Dataset, and SQL – Natural and Easy!
Lecture 7: Manipulating Rows and Columns
Lecture 8: Dealing with File Format
Lecture 9: Visualizing More – ggplot2, matplotlib, and Angular.js at the Rescue
Lecture 10: Discovering spark.ml and spark.mllib – and Other Libraries
Lecture 11: Wrapping Up Basic Statistics and Linear Algebra
Lecture 12: Cleansing Data and Engineering the Features
Lecture 13: Reducing the Dimensionality
Lecture 14: Pipeline for a Life
Lecture 15: Streaming Tweets to Disk
Lecture 16: Streaming Tweets on a Map
Lecture 17: Cleansing and Building Your Reference Dataset
Lecture 18: Querying and Visualizing Tweets with SQL
Lecture 19: Indicators, Correlations, and Sampling
Lecture 20: Validating Statistical Relevance
Lecture 21: Running SVD and PCA
Lecture 22: Extending the Basic Statistics for Your Needs
Lecture 23: Analyzing Free Text from the Tweets
Lecture 24: Dealing with Stemming, Syntax, Idioms and Hashtags
Lecture 25: Detecting Tweet Sentiment
Lecture 26: Identifying Topics with LDA
Lecture 27: Word Cloudify Your Dataset
Lecture 28: Locating Users and Displaying Heatmaps with GeoHash
Lecture 29: Collaborating on the Same Note with Peers
Lecture 30: Create Visual Dashboards for Your Business Stakeholders
Lecture 31: Building the Training and Test Datasets
Lecture 32: Training a Logistic Regression Model
Lecture 33: Evaluating Your Classifier
Lecture 34: Selecting Your Model
Lecture 35: Clustering Users by Followers and Friends
Lecture 36: Clustering Users by Location
Lecture 37: Running KMeans on a Stream
Lecture 38: Recommending Similar Users
Lecture 39: Analyzing Mentions with GraphX
Lecture 40: Where to Go from Here
Instructors
- Packt Publishing (Tech Knowledge in Motion)
Rating Distribution
- 1 star: 1 vote
- 2 stars: 1 vote
- 3 stars: 5 votes
- 4 stars: 2 votes
- 5 stars: 0 votes
Frequently Asked Questions
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!