Real-World Data Science with Spark 2
Real-World Data Science with Spark 2 is available at $19.99. It has an average rating of 3.8 based on 21 reviews, and includes 55 lectures and 7 quizzes. The course has 336 subscribers.
You will learn about Big Data and data science, the fundamentals of Spark 2, and Spark's ecosystem of packages for data science. You will consolidate, clean, and transform data acquired from various sources, and use Spark components for efficient data processing, machine learning, and graph processing. This course is for anyone who wants to work with Spark on large and complex datasets. Data analysts, data scientists, and Big Data architects interested in exploring the data processing power of Apache Spark will find it particularly useful.
Summary
Title: Real-World Data Science with Spark 2
Price: $19.99
Average Rating: 3.8
Number of Lectures: 55
Number of Quizzes: 7
Number of Published Lectures: 55
Number of Published Quizzes: 7
Number of Curriculum Items: 62
Number of Published Curriculum Objects: 62
Original Price: $199.99
Quality Status: approved
Status: Live
What You Will Learn
- Get an introduction to Big Data and data science
- Get to know the fundamentals of Spark 2
- Understand Spark and its ecosystem of packages in data science
- Consolidate, clean, and transform your data acquired from various data sources
- Unlock the capabilities of various Spark components to perform efficient data processing, machine learning, and graph processing
- Dive deeper and explore various facets of data science with Spark
Who Should Attend
- This course is for anyone who wants to work with Spark on large and complex datasets.
- Data analysts, data scientists, and Big Data architects interested in exploring the data processing power of Apache Spark will find this course very useful.
Are you looking to expand your knowledge of performing data science operations in Spark? Are you a data scientist who wants to understand how algorithms are implemented in Spark, or a newbie with minimal development experience who wants to learn about Big Data analytics? If so, this course is ideal for you. Let’s get on this data science journey together.
When people want a way to process Big Data at speed, Spark is invariably the solution. With its ease of development (in comparison to the relative complexity of Hadoop), it’s unsurprising that it’s becoming popular with data analysts and engineers everywhere. It is one of the most widely used large-scale data processing engines and runs extremely fast.
The aim of the course is to make you comfortable and confident in performing real-time data processing using Spark.
What is included?
This course is meticulously designed and developed to give you the right and relevant information on Spark. However, I want to highlight that the road ahead may be bumpy on occasion, and some topics may be more challenging than others, but I hope that you will embrace this opportunity and focus on the reward. Remember that throughout this course, we will add many powerful techniques to your arsenal that will help you solve problems.
Let’s take a look at the learning journey. The course begins with the basics of Spark 2 and covers the core data processing framework and API, installation, and application development setup. Then, you’ll be introduced to the Spark programming model through real-world examples. Next, you’ll learn how to collect, clean, and visualize data coming from Twitter with Spark Streaming. Then, you will get acquainted with Spark machine learning algorithms and different machine learning techniques. You will also learn to apply statistical analysis and mining operations to your dataset. The course will give you ideas on how to perform analysis, including graph processing. Finally, we will take up an end-to-end case study and apply all that we have learned so far.
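The Spark programming model mentioned above chains transformations such as flatMap, map, and reduceByKey over a distributed collection. As a rough illustration only, the sketch below mimics that flow in plain Python, without a Spark installation. The sample tweets and the helper names flat_map and reduce_by_key are assumptions made for this example; they are not the course's code or Spark's API.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample data; the course's own dataset is not shown in this description.
tweets = [
    "Spark makes Big Data fast",
    "Learning Spark with Spark SQL",
    "big data needs cleaning",
]


def flat_map(func, items):
    """Apply func to each item and flatten the results into one list."""
    return [out for item in items for out in func(item)]


def reduce_by_key(func, pairs):
    """Combine the values of all pairs that share a key using func."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(func, values) for key, values in grouped.items()}


# Step 1: split every tweet into lowercase words (a very small "clean" step).
words = flat_map(lambda text: text.lower().split(), tweets)
# Step 2: emit one (word, 1) pair per word.
pairs = [(word, 1) for word in words]
# Step 3: sum the counts for each distinct word.
word_counts = reduce_by_key(lambda a, b: a + b, pairs)

print(word_counts["spark"])
```

In Spark itself, the same three steps would run in parallel across partitions of a cluster rather than over a local list, which is what makes the model suitable for large datasets.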
By the end of the course, you should be able to put what you have learned into practice for faster, slicker Big Data projects.
Why should I choose this course?
Packt courses are carefully designed to deliver the best learning experience possible. This course is a blend of text, videos, code examples, and quizzes, which together make your learning journey more exciting and rewarding. This helps you learn a range of topics at your own speed and move towards your goal of learning the technology. We have prepared this course using extensive research and curation skills. Each section adds to the skills learned and helps you achieve mastery of Spark.
This course is an amalgamation of sections that form a sequential flow of concepts covering a focused learning path presented in a modular manner. We have combined the best of the following Packt products:
- Data Science with Spark by Eric Charles
- Spark for Data Science by Bikramaditya Singhal and Srinivas Duvvuri
- Apache Spark 2 for Beginners by Rajanarayanan Thottuvaikkatumana
Meet your expert instructors:
For this course, we have combined the best works of these esteemed authors:
Eric Charles has 10 years of experience in the field of data science and is the founder of Datalayer, a social network for data scientists. He is passionate about using software and mathematics to help companies get insights from data.
Bikramaditya Singhal is a data scientist with about 7 years of industry experience. He is an expert in statistical analysis, predictive analytics, machine learning, Bitcoin, Blockchain, and programming in C, R, and Python. He has extensive experience in building scalable data analytics solutions in many industry sectors.
Srinivas Duvvuri is currently the senior vice president of development, heading the development teams for the fixed income suite of products at Broadridge Financial Solutions (India) Pvt Ltd. He also leads the Big Data and Data Science COE and is the principal member of the Broadridge India Technology Council.
Rajanarayanan Thottuvaikkatumana, Raj, is a seasoned technologist with more than 23 years of software development experience at various multinational companies. He has worked on various technologies including major databases, application development platforms, web technologies, and Big Data technologies.
Course Curriculum
Chapter 1: Big Data and Data Science
Lecture 1: Course Introduction
Lecture 2: An introduction to Big Data
Chapter 2: The Spark Programming Model
Lecture 1: An overview of Apache Hadoop
Lecture 2: Understanding Apache Spark
Lecture 3: Install Spark on your laptop with Docker, or scale fast in the cloud
Lecture 4: Apache Zeppelin, a web-based notebook for Spark with matplotlib and ggplot2
Lecture 5: The RDD API
Chapter 3: Spark SQL and DataFrames
Lecture 1: Understanding the structure of data and the need of Spark SQL
Lecture 2: The DataFrame API and its operations
Chapter 4: Data Analysis on Spark
Lecture 1: Data analytics life cycle
Lecture 2: Basics of statistics
Lecture 3: Descriptive statistics
Lecture 4: Inferential statistics
Chapter 5: First Step with Spark Visualization
Lecture 1: Data visualization
Lecture 2: Manipulating data with the core RDD API
Lecture 3: Using DataFrame, dataset, and SQL – natural and easy!
Lecture 4: Manipulating rows and columns
Lecture 5: Dealing with file format
Lecture 6: Visualizing more – ggplot2, matplotlib, and Angular.js at the rescue
Lecture 7: References
Chapter 6: The Spark Machine Learning Algorithms
Lecture 1: An introduction to machine learning
Lecture 2: Discovering spark.ml and spark.mllib – and other libraries
Lecture 3: Wrapping up basic statistics and linear algebra
Lecture 4: Cleansing data and engineering the features
Lecture 5: Reducing the dimensionality
Lecture 6: Pipeline for a life
Lecture 7: References
Chapter 7: Collecting and Cleansing the Dirty Tweets
Lecture 1: Streaming tweets to disk
Lecture 2: Streaming tweets on a map
Lecture 3: Cleansing and building your reference dataset
Lecture 4: Querying and visualizing tweets with SQL
Chapter 8: Statistical Analysis on Tweets
Lecture 1: Indicators, correlations, and sampling
Lecture 2: Validating statistical relevance
Lecture 3: Running SVD and PCA
Lecture 4: Extending the basic statistics to your needs
Chapter 9: Extracting Features from the Tweets
Lecture 1: Analyzing free text from the tweets
Lecture 2: Dealing with stemming, syntax, idioms, and hashtags
Lecture 3: Detecting tweet sentiment
Lecture 4: Identifying topics with LDA
Chapter 10: Mine Data and Share Results
Lecture 1: Word cloudify your dataset
Lecture 2: Locating users and displaying heatmaps with GeoHash
Lecture 3: Collaborating on the same note with peers
Lecture 4: Create visual dashboards for your business stakeholders
Chapter 11: Classifying the Tweets
Lecture 1: Building the training and test datasets
Lecture 2: Training a logistic regression model
Lecture 3: Evaluating your classifier
Lecture 4: Selecting your model
Chapter 12: Clustering Users
Lecture 1: Clustering users by followers and friends
Lecture 2: Clustering users by location
Lecture 3: Running k-means on a stream
Chapter 13: Putting It All Together
Lecture 1: Case study
Chapter 14: Data Science Applications
Lecture 1: Building data science applications
Chapter 15: Your Next Data Challenges
Lecture 1: Recommending similar users
Lecture 2: Analyzing mentions with GraphX
Lecture 3: Where to go from here
Instructors
- Packt Publishing: Tech Knowledge in Motion
Rating Distribution
- 1 star: 1 vote
- 2 stars: 2 votes
- 3 stars: 4 votes
- 4 stars: 6 votes
- 5 stars: 8 votes
Frequently Asked Questions
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!