Modern Reinforcement Learning: Actor-Critic Agents
Modern Reinforcement Learning: Actor-Critic Agents is available at $74.99. It has an average rating of 4.32 based on 482 reviews, with 74 lectures, 1 quiz, and 3508 subscribers.
You will learn how to code policy gradient methods, Deep Deterministic Policy Gradients (DDPG), Twin Delayed Deep Deterministic Policy Gradients (TD3), and actor critic algorithms in PyTorch, and how to implement cutting edge artificial intelligence research papers in Python. This course is ideal for advanced students of artificial intelligence who want to implement state of the art academic research papers.
Summary
Title: Modern Reinforcement Learning: Actor-Critic Agents
Price: $74.99
Average Rating: 4.32
Number of Lectures: 74
Number of Quizzes: 1
Number of Published Lectures: 74
Number of Published Quizzes: 1
Number of Curriculum Items: 75
Number of Published Curriculum Objects: 75
Original Price: $199.99
Quality Status: approved
Status: Live
What You Will Learn
- How to code policy gradient methods in PyTorch
- How to code Deep Deterministic Policy Gradients (DDPG) in PyTorch
- How to code Twin Delayed Deep Deterministic Policy Gradients (TD3) in PyTorch
- How to code actor critic algorithms in PyTorch
- How to implement cutting edge artificial intelligence research papers in Python
Who Should Attend
- Advanced students of artificial intelligence who want to implement state of the art academic research papers
In this advanced course on deep reinforcement learning, you will learn how to implement policy gradient, actor critic, deep deterministic policy gradient (DDPG), twin delayed deep deterministic policy gradient (TD3), and soft actor critic (SAC) algorithms in a variety of challenging environments from the OpenAI Gym. There will be a strong focus on environments with continuous action spaces, which are of particular interest for those looking to do research into robotic control with deep reinforcement learning.
Rather than being a course that spoon feeds the student, here you are going to learn to read deep reinforcement learning research papers on your own, and implement them from scratch. You will learn a repeatable framework for quickly implementing the algorithms in advanced research papers. Mastering the content in this course will be a quantum leap in your capabilities as an artificial intelligence engineer, and will put you in a league of your own among students who are reliant on others to break down complex ideas for them.
Fear not: if it’s been a while since your last reinforcement learning course, we will begin with a briskly paced review of core topics.
The course begins with a practical review of the fundamentals of reinforcement learning, including topics such as:
- The Bellman Equation
- Markov Decision Processes
- Monte Carlo Prediction
- Monte Carlo Control
- Temporal Difference Prediction TD(0)
- Temporal Difference Control with Q Learning
The course then moves straight into coding up our first agent: a blackjack playing artificial intelligence. From there we will progress to teaching an agent to balance the cart pole using Q learning.
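As a rough illustration of the Q learning step mentioned above, here is a minimal tabular sketch in plain Python (not the PyTorch used in the course). The function name, the dictionary-based table, and the values of alpha and gamma are assumptions made for this example only. A continuous environment such as the cart pole would also need its observations discretized before a table like this could be used.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, done,
                      n_actions, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error.

    q is a mapping from (state, action) pairs to value estimates.
    alpha (learning rate) and gamma (discount) are illustrative defaults.
    """
    # Bootstrap from the greedy value of the next state; terminal states are worth 0.
    if done:
        best_next = 0.0
    else:
        best_next = max(q[(next_state, a)] for a in range(n_actions))
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical single transition: state 0, action 1, reward 1.0, next state 1.
q_table = defaultdict(float)
first_td_error = q_learning_update(q_table, state=0, action=1, reward=1.0,
                                   next_state=1, done=False, n_actions=2)
```

Because the update bootstraps from the greedy action in the next state, it learns about the greedy policy even while the agent explores with some other behaviour.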
After mastering the fundamentals, the pace quickens, and we move straight into an introduction to policy gradient methods. We cover the REINFORCE algorithm, and use it to teach an artificial intelligence to land on the moon in the lunar lander environment from the OpenAI Gym. Next we progress to coding up the one step actor critic algorithm, to again beat the lunar lander.
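To make the REINFORCE idea a little more concrete, the sketch below computes the discounted Monte Carlo returns for one episode and the corresponding policy gradient loss. It is plain Python rather than PyTorch, and the function names and the value of gamma are assumptions for illustration. In a real agent the log probabilities would come from the policy network so that the loss can be differentiated.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each return reuses the one that follows it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs, returns):
    """Negative sum of log pi(a_t | s_t) * G_t; minimizing it ascends the policy gradient."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Hypothetical three step episode with a reward of 1 at every step.
episode_returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
```

The one step actor critic variant replaces the full return G_t with a bootstrapped estimate from a learned critic, which is why it can update after every step instead of waiting for the episode to end.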
With the fundamentals out of the way, we move on to our harder projects: implementing deep reinforcement learning research papers. We will start with Deep Deterministic Policy Gradients (DDPG), which is an algorithm for teaching robots to excel at a variety of continuous control tasks. DDPG combines many of the advances of Deep Q Learning with traditional actor critic methods to achieve state of the art results in environments with continuous action spaces.
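One of the ideas DDPG borrows from Deep Q Learning is the target network, which DDPG updates slowly by blending in the online weights. A minimal sketch of that soft update follows. It uses flat lists of floats in place of network parameters, and the function name and the value of tau are assumptions for this example.

```python
def soft_update(target_params, online_params, tau=0.001):
    """Return new target parameters: tau * online + (1 - tau) * target.

    Parameters are flat lists of floats here; a real agent would loop over
    the tensors of the actor and critic networks instead.
    """
    return [tau * w + (1.0 - tau) * w_target
            for w, w_target in zip(online_params, target_params)]


# Hypothetical two weight network, with tau exaggerated to make the blend visible.
new_target = soft_update(target_params=[0.0, 1.0], online_params=[1.0, 1.0], tau=0.1)
```

A small tau means the targets used in the critic's loss drift slowly, which tends to make the bootstrapped updates more stable.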
Next, we implement a state of the art artificial intelligence algorithm: Twin Delayed Deep Deterministic Policy Gradients (TD3). This algorithm sets a new benchmark for performance in continuous robotic control tasks, and we will demonstrate world class performance in the Bipedal Walker environment from the OpenAI Gym. TD3 is based on the DDPG algorithm, but addresses a number of approximation issues that result in poor performance in DDPG and other actor critic algorithms.
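Two of the fixes TD3 applies can be sketched in a few lines: taking the minimum of two critic estimates when forming the learning target, and adding clipped noise to the target action. The code below is a plain Python illustration with scalar actions. The function names, noise settings, and action bounds are assumptions for this example rather than the course's implementation.

```python
import random


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double Q target: bootstrap from the smaller of the two critic estimates."""
    bootstrap = 0.0 if done else min(q1_next, q2_next)
    return reward + gamma * bootstrap


def smoothed_target_action(action, noise_std=0.2, noise_clip=0.5,
                           low=-1.0, high=1.0, rng=random):
    """Target policy smoothing: add clipped Gaussian noise, then clip to the action bounds."""
    noise = rng.gauss(0.0, noise_std)
    noise = max(-noise_clip, min(noise_clip, noise))
    return max(low, min(high, action + noise))


# Hypothetical transition where the two target critics disagree.
target_value = td3_target(reward=1.0, done=False, q1_next=2.0, q2_next=3.0, gamma=0.5)
```

The third change, which gives the algorithm the "delayed" in its name, is to update the actor and the target networks less often than the critics.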
Finally, we will implement the soft actor critic algorithm (SAC). SAC approaches deep reinforcement learning from a totally different angle: by maximizing the entropy of the policy in addition to the score, rather than the score alone. This results in increased exploration by our agent, and world class performance in a number of important OpenAI Gym environments.
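The entropy term can be illustrated with a small discrete example, even though SAC itself targets continuous actions. The sketch below computes the entropy of a policy over a handful of actions and an entropy augmented state value. The function names and the temperature alpha are assumptions made for this illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def soft_state_value(probs, q_values, alpha=0.2):
    """Expected Q value under the policy plus alpha times the policy's entropy."""
    expected_q = sum(p * q for p, q in zip(probs, q_values))
    return expected_q + alpha * entropy(probs)


# Two hypothetical policies over two equally good actions.
uniform_value = soft_state_value([0.5, 0.5], [1.0, 1.0], alpha=1.0)
greedy_value = soft_state_value([1.0, 0.0], [1.0, 1.0], alpha=1.0)
```

With equal Q values the uniform policy scores higher than the deterministic one, which is the pressure toward exploration described above. The temperature alpha sets how much that pressure counts relative to the score.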
By the end of the course, you will know the answers to the following fundamental questions in Actor-Critic methods:
- Why should we bother with actor critic methods when deep Q learning is so successful?
- Can the advances in deep Q learning be used in other fields of reinforcement learning?
- How can we solve the explore-exploit dilemma with a deterministic policy?
- How does overestimation bias arise in actor-critic methods, and how do we deal with it?
- How do we deal with the inherent approximation errors in deep neural networks?
This course is for the highly motivated and advanced student. To succeed, you must have prior course work in all the following topics:
- College level calculus
- Reinforcement learning
- Deep learning
The pace of the course is brisk and the topics are at the cutting edge of deep reinforcement learning research, but the payoff is that you will come out knowing how to read research papers and turn them into functional code as quickly as possible. You’ll never have to rely on dodgy Medium blog posts again.
Course Curriculum
Chapter 1: Introduction
Lecture 1: What You Will Learn in this Course
Lecture 2: Required Background, Software, and Hardware
Lecture 3: How to Succeed in this Course
Chapter 2: Fundamentals of Reinforcement Learning
Lecture 1: Review of Fundamental Concepts
Lecture 2: Teaching an AI about Black Jack with Monte Carlo Prediction
Lecture 3: Teaching an AI How to Play Black Jack with Monte Carlo Control
Lecture 4: Review of Temporal Difference Learning Methods
Lecture 5: Teaching an AI about Balance with TD(0) Prediction
Lecture 6: Teaching an AI to Balance the Cart Pole with Q Learning
Chapter 3: Landing on the Moon with Policy Gradients & Actor Critic Methods
Lecture 1: What's so Great About Policy Gradient Methods?
Lecture 2: Combining Neural Networks with Monte Carlo: REINFORCE Policy Gradient Algorithm
Lecture 3: Introducing the Lunar Lander Environment
Lecture 4: Coding the Agent's Brain: The Policy Gradient Network
Lecture 5: Coding the Policy Gradient Agent's Basic Functionality
Lecture 6: Coding the Agent's Learn Function
Lecture 7: Coding the Policy Gradient Main Loop and Watching our Agent Land on the Moon
Lecture 8: Actor Critic Learning: Combining Policy Gradients & Temporal Difference Learning
Lecture 9: Coding the Actor Critic Networks
Lecture 10: Coding the Actor Critic Agent
Lecture 11: Coding the Actor Critic Main Loop and Watching Our Agent Land on the Moon
Chapter 4: Deep Deterministic Policy Gradients (DDPG): Actor Critic with Continuous Actions
Lecture 1: Getting up to Speed With Deep Q Learning
Lecture 2: How to Read and Understand Cutting Edge Research Papers
Lecture 3: Analyzing the DDPG Paper Abstract and Introduction
Lecture 4: Analyzing the Background Material
Lecture 5: What Algorithm Are We Going to Implement?
Lecture 6: What Results Should We Expect?
Lecture 7: What Other Solutions are Out There?
Lecture 8: What Model Architecture and Hyperparameters Do We Need?
Lecture 9: Handling the Explore-Exploit Dilemma: Coding the OU Action Noise Class
Lecture 10: Giving our Agent a Memory: Coding the Replay Memory Buffer Class
Lecture 11: Deep Q Learning for Actor Critic Methods: Coding the Critic Network Class
Lecture 12: Coding the Actor Network Class
Lecture 13: Giving our DDPG Agent Simple Autonomy: Coding the Basic Functions of Our Agent
Lecture 14: Giving our DDPG Agent a Brain: Coding the Agent's Learn Function
Lecture 15: Coding the Network Parameter Update Functionality
Lecture 16: Coding the Main Loop and Watching Our DDPG Agent Land on the Moon
Chapter 5: Twin Delayed Deep Deterministic Policy Gradients (TD3)
Lecture 1: Some Tips on Reading this Paper
Lecture 2: Analyzing the TD3 Paper Abstract and Introduction
Lecture 3: What Other Solutions Have People Tried?
Lecture 4: Reviewing the Fundamental Concepts
Lecture 5: Is Overestimation Bias Even a Problem in Actor-Critic Methods?
Lecture 6: Why is Variance a Problem for Actor-Critic Methods?
Lecture 7: What Results Can We Expect?
Lecture 8: Coding the Brains of the TD3 Agent – The Actor and Critic Network Classes
Lecture 9: Giving our TD3 Agent Simple Autonomy – Coding the Basic Agent Functionality
Lecture 10: Giving our TD3 Agent a Brain – Coding the Learn Function
Lecture 11: Coding the Network Parameter Update Functionality
Lecture 12: Coding the Main Loop And Watching our Agent Learn to Walk
Chapter 6: Soft Actor Critic
Lecture 1: A Quick Word on the Paper
Lecture 2: Getting Acquainted With a New Framework
Lecture 3: Checking Out What Has Been Done Before
Lecture 4: Inspecting the Foundation of this New Framework
Lecture 5: Digging Into the Mathematics of Soft Actor Critic
Lecture 6: Seeing How the New Algorithm Measures Up
Lecture 7: Coding the Neural Networks
Lecture 8: Coding the Soft Actor Critic Basic Functionality
Lecture 9: Coding the Soft Actor Critic Algorithm
Lecture 10: Coding the Main Loop and Evaluating Our Agent
Chapter 7: Tensorflow 2 Implementation
Lecture 1: Coding the Policy Gradient Network in Tensorflow 2
Lecture 2: Coding the REINFORCE Agent in Tensorflow 2
Lecture 3: Coding the REINFORCE Main Loop and Evaluating our Agent
Lecture 4: Coding the Actor Critic Network in Tensorflow 2
Lecture 5: Coding the Actor Critic Agent in Tensorflow 2
Lecture 6: Coding the Actor Critic Main Program and Evaluating our Agent
Lecture 7: Coding the DDPG Networks in Tensorflow 2
Lecture 8: Coding the DDPG Agent in Tensorflow 2
Lecture 9: Coding the DDPG Main Program and Evaluating our Agent
Lecture 10: Coding the TD3 Agent in Tensorflow 2
Lecture 11: Coding the TD3 Main Program and Evaluating our Agent
Lecture 12: Coding the SAC Networks in Tensorflow 2
Lecture 13: Coding the SAC Agent in Tensorflow 2
Lecture 14: Coding the SAC Main Function and Evaluating our Agent
Chapter 8: Appendix
Lecture 1: Setting Up Our Virtual Environment for the New OpenAI Gym
Lecture 2: Making our Agents Compliant With the New Gym Interface
Instructors
- Phil Tabor, Machine Learning Engineer
Rating Distribution
- 1 star: 8 votes
- 2 stars: 18 votes
- 3 stars: 39 votes
- 4 stars: 152 votes
- 5 stars: 265 votes
Frequently Asked Questions
How long do I have access to the course materials?
You can view and review the lecture materials indefinitely, like an on-demand channel.
Can I take my courses with me wherever I go?
Definitely! If you have an internet connection, courses on Udemy are available on any device at any time. If you don’t have an internet connection, some instructors also let their students download course lectures. That’s up to the instructor though, so make sure you get on their good side!