Real World Hadoop – Hands on Enterprise Distributed Storage.
Real World Hadoop – Hands on Enterprise Distributed Storage, available at $19.99, has an average rating of 4.55 based on 16 reviews, with 18 lectures and 302 subscribers.
You will learn how to navigate the HDFS file system, how to build an HDFS stack by running a single command on your desktop (go for a coffee and come back to a running distributed environment ready for cluster deployment), how to quickly build an environment where Cloudera and HDFS software can be installed, and how to automate the installation of software across multiple virtual machines. This course is ideal for software engineers who want to expand their skills into the world of distributed computing, system engineers who want to expand their skill sets beyond a single server, and developers who want to write and test their code against a valid distributed environment.
Enroll now: Real World Hadoop – Hands on Enterprise Distributed Storage.
Summary
Title: Real World Hadoop – Hands on Enterprise Distributed Storage.
Price: $19.99
Average Rating: 4.55
Number of Lectures: 18
Number of Published Lectures: 18
Number of Curriculum Items: 18
Number of Published Curriculum Objects: 18
Original Price: £89.99
Quality Status: approved
Status: Live
What You Will Learn
- Learn how to navigate the HDFS file system
- To build an HDFS stack, simply run a single command on your desktop, go for a coffee, and come back to a running distributed environment ready for cluster deployment
- Quickly build an environment where Cloudera and HDFS software can be installed.
- Ability to automate the installation of software across multiple Virtual Machines
Who Should Attend
- Software engineers who want to expand their skills into the world of distributed computing
- System engineers who want to expand their skill sets beyond a single server
- Developers who want to write and test their code against a valid distributed environment
We will be manipulating the HDFS file system, but why are enterprises interested in HDFS in the first place?
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems; however, the differences from other distributed file systems are significant.
HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware.
HDFS provides high-throughput access to application data and is suitable for applications that have large data sets.
HDFS relaxes a few POSIX requirements to enable streaming access to file system data.
HDFS is part of the Apache Hadoop Core project.
Hardware failure is the norm rather than the exception. An HDFS instance may consist of hundreds or thousands of server machines, each storing part of the file system’s data. The fact that there are a huge number of components and that each component has a non-trivial probability of failure means that some component of HDFS is always non-functional. Therefore, detection of faults and quick, automatic recovery from them is a core architectural goal of HDFS.
Applications that run on HDFS have large data sets. A typical file in HDFS is gigabytes to terabytes in size. Thus, HDFS is tuned to support large files. It should provide high aggregate data bandwidth and scale to hundreds of nodes in a single cluster. It should support tens of millions of files in a single instance.
A computation requested by an application is much more efficient if it is executed near the data it operates on. This is especially true when the size of the data set is huge. This minimizes network congestion and increases the overall throughput of the system. The assumption is that it is often better to migrate the computation closer to where the data is located rather than moving the data to where the application is running. HDFS provides interfaces for applications to move themselves closer to where the data is located.
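All of that machinery is ultimately driven from a small set of shell commands, which is what the hands-on part of this course practises. As a purely illustrative taste (the user name and file names below are assumptions for the example, not taken from the course), basic navigation looks like this:

    hdfs dfs -ls /                          # list the root of the distributed file system
    hdfs dfs -mkdir -p /user/alice          # create a user directory ("alice" is a placeholder)
    hdfs dfs -put report.csv /user/alice/   # copy a local file into HDFS
    hdfs dfs -cat /user/alice/report.csv    # read the file back from the cluster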
Here I present the curriculum covering the current state of my Cloudera courses. My Hadoop courses are based on Vagrant so that you can practice and destroy your virtual environment before applying the installation onto real servers/VMs.
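As a rough sketch of that practice-and-destroy workflow (the guest name below depends entirely on the Vagrantfile in use, so treat it as a placeholder):

    vagrant up            # build the virtual cluster defined in the Vagrantfile
    vagrant status        # list the guest machines that were created
    vagrant ssh node1     # log in to one of the guests ("node1" is a placeholder)
    vagrant destroy -f    # tear the whole environment down and start again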
For those with little or no knowledge of the Hadoop ecosystem:
Udemy course : Big Data Intro for IT Administrators, Devs and Consultants
I would first practice with Vagrant so that you can carve out a virtual environment on your local desktop. You don’t want to corrupt your physical servers if you do not understand the steps or make a mistake.
Udemy course : Real World Vagrant For Distributed Computing
I would then, on the virtual servers, deploy Cloudera Manager plus agents. Agents are the guys that will sit on all the slave nodes, ready to deploy your Hadoop services.
Udemy course : Real World Vagrant – Automate a Cloudera Manager Build
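As a hedged illustration of what that build amounts to on each machine (the package and service names follow Cloudera's standard RHEL/CentOS packaging and assume a configured Cloudera repository; your OS and repository setup may differ):

    # On the Cloudera Manager host
    yum install -y cloudera-manager-daemons cloudera-manager-server
    service cloudera-scm-server start

    # On every agent (slave) node
    yum install -y cloudera-manager-daemons cloudera-manager-agent
    service cloudera-scm-agent start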
Then deploy the Hadoop services across your cluster (via the Cloudera Manager you installed in the previous step). We look at the logic regarding the placement of master and slave services.
Udemy course : Real World Hadoop – Deploying Hadoop with Cloudera Manager
If you want to play around with HDFS commands (hands-on distributed file manipulation):
Udemy course : Real World Hadoop – Hands on Enterprise Distributed Storage.
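For a flavour of the commands that hands-on work covers (the paths below are purely illustrative):

    hdfs dfs -rm -skipTrash /user/alice/old.csv       # delete immediately, bypassing the trash
    hdfs dfs -getmerge /user/alice/parts merged.csv   # pull a directory of part files into one local file
    hdfs dfs -count /user/alice                       # directory, file and byte counts for a path
    hdfs fsck /user/alice -files -blocks              # report block health for everything under a path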
You can also automate the deployment of the Hadoop services via Python (using the Cloudera Manager Python API). But this is an advanced step, and thus I would make sure that you understand how to manually deploy the Hadoop services first.
Udemy course : Real World Hadoop – Automating Hadoop install with Python!
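The Python bindings wrap the Cloudera Manager REST API, so the flavour of that automation can be shown with a couple of hedged calls (the host name, default port 7180, admin credentials and API version are assumptions for the example):

    curl -u admin:admin http://cm-host:7180/api/version       # highest API version the server supports
    curl -u admin:admin http://cm-host:7180/api/v19/clusters  # list the clusters the automation would drive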
There is also the upgrade step. Once you have a running cluster, how do you upgrade it to a newer version (both for Cloudera Manager and the Hadoop services)?
Udemy course : Real World Hadoop – Upgrade Cloudera and Hadoop hands on
Course Curriculum
Chapter 1: Navigating the HDFS File System
Lecture 1: Justification
Lecture 2: Suggested course curriculum to follow …
Chapter 2: HDFS Theory and Installation
Lecture 1: Walking through the topology and benefits of HDFS
Lecture 2: Here we break down how HDFS can be installed
Lecture 3: We step through the installation of HDFS using Cloudera Manager
Chapter 3: Navigating the Distributed Storage using – hdfs dfs
Lecture 1: Comparing "hdfs dfs" commands to our regular "bash" commands
Lecture 2: Creating a userspace within hdfs for users to read/write files
Lecture 3: Here we upload a file into HDFS and view some details
Lecture 4: It sounds naughty, but here we look at – hdfs fsck
Lecture 5: Here we look at hdfs – ls, rm and expunge
Chapter 4: OK, so how can one Add or Remove Files within a Distributed System.
Lecture 1: We take a closer look at deleting files along with the skipTrash option
Lecture 2: Here we look at the hdfs commands – mkdir, appendToFile, cat and tail
Lecture 3: Here we learn to search for files within hdfs
Lecture 4: Here we look at the hdfs "get" and "getmerge" commands
Chapter 5: So how easy (or hard!) is it to Manage a Large Distributed Cluster
Lecture 1: Here we look at how we can count files and directories within hdfs
Lecture 2: Here we look at how we can copy and move files within hdfs
Lecture 3: Here we combine touchz and appendToFile to simulate increasing DataSet size
Chapter 6: Conclusion
Lecture 1: Conclusion
Instructors
- Toyin Akin
Big Data Engineer, Capital Markets FinTech Developer
Rating Distribution
- 1 star: 0 votes
- 2 stars: 1 vote
- 3 stars: 2 votes
- 4 stars: 2 votes
- 5 stars: 11 votes
Frequently Asked Questions
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!