Fundamentals of Reinforcement Learning
AI, But Simple Issue #9
Reinforcement Learning (RL) is a subset of machine learning that enables an agent to learn in an interactive environment by trial and error using feedback from its own actions and experiences.
The fundamental idea behind RL is to allow computers to learn from their mistakes.
This is very similar to how humans learn: through trial and error.
Some common applications of RL include playing games (AlphaGo, chess), robotics (Boston Dynamics robots), autonomous vehicles (self-driving cars), finance-related activities (algorithmic trading), and other optimization problems.
In RL, the agent is the decision-maker who interacts with the environment to achieve a certain goal.
Unlike supervised learning, where the feedback provided to the agent (or model) consists of correct labels, reinforcement learning uses rewards and punishments as signals for positive and negative behavior.
In RL, there is no direct indication of the correct action; the agent only receives a positive or negative signal.
In reinforcement learning, the agent's objective is to learn a policy (which defines the agent's behavior) that maximizes the amount of reward it collects.
A policy can be thought of as a strategy that determines the action an agent will take in a given situation (state).
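To make the idea of a policy concrete, here is a minimal sketch in Python of a policy as a simple lookup table from states to actions. The state and action names are made up for illustration; in practice, policies are usually learned functions rather than hand-written tables.

```python
# A minimal sketch of a policy: a lookup table mapping each state to an action.
# The state and action names here are hypothetical, purely for illustration.
policy = {
    "ghost_nearby": "move_away",
    "pellet_ahead": "move_forward",
    "power_pellet_nearby": "move_toward_power_pellet",
}

def choose_action(state):
    # The policy tells the agent what to do in each state;
    # fall back to a default action for states it hasn't seen.
    return policy.get(state, "move_forward")

print(choose_action("ghost_nearby"))  # -> "move_away"
```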
Some key terms that describe the basic elements of an RL problem are listed below (the sketch after the list shows how they fit together):
Environment — Physical world in which the agent operates
State — Current situation of the agent
Reward — Feedback from the environment
Policy — Method to map agent’s state to actions
Value — Expected future reward an agent would receive by taking an action in a particular state
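The sketch below ties these elements together in the standard interaction loop: the agent observes a state, chooses an action with its policy, and the environment responds with the next state and a reward. The environment dynamics, reward values, and the random policy here are all made up for illustration.

```python
import random

# A minimal sketch of the agent-environment interaction loop.
# The environment, states, actions, and rewards are hypothetical;
# the point is just to show how the basic elements fit together.

ACTIONS = ["up", "down", "left", "right"]

def environment_step(state, action):
    """Return the next state and the reward for taking `action` in `state`."""
    next_state = (state + 1) % 10                    # toy state transition
    reward = 1.0 if action == "right" else -0.1      # toy reward signal
    return next_state, reward

def policy(state):
    """A random policy, used here only as a placeholder for a learned one."""
    return random.choice(ACTIONS)

state = 0
total_reward = 0.0
for t in range(20):                                   # one short episode
    action = policy(state)                            # agent acts according to its policy
    state, reward = environment_step(state, action)   # environment responds
    total_reward += reward                            # reward is the feedback signal
print("Total reward for the episode:", total_reward)
```

A learning algorithm would use the rewards collected in loops like this one to improve the policy over time, which is exactly what "maximizing reward" means in practice.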
Let's use a game to model an RL problem. Suppose we are playing Pac-Man, where the goal is to eat the pellets without losing all of your lives.
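As a rough illustration of how this game could be cast as an RL problem, the sketch below defines a tiny, made-up grid version of Pac-Man: the state is the agent's position, the actions are the four movement directions, eating a pellet gives a positive reward, and running into a ghost gives a large negative one. This is a simplification for illustration, not a faithful implementation of the game.

```python
# A toy, Pac-Man-inspired environment sketch (hypothetical and heavily simplified).
# State: the agent's (row, col) position on a small grid.
# Actions: the four movement directions.
# Reward: +1 for eating a pellet, -10 for running into a ghost, 0 otherwise.

GRID_SIZE = 5
PELLETS = {(0, 1), (2, 3), (4, 4)}   # positions of pellets
GHOSTS = {(2, 2)}                    # position of a (stationary) ghost

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def step(position, action, pellets):
    """Apply one action; return the new position, the reward, and remaining pellets."""
    dr, dc = MOVES[action]
    row = min(max(position[0] + dr, 0), GRID_SIZE - 1)   # stay inside the grid
    col = min(max(position[1] + dc, 0), GRID_SIZE - 1)
    new_position = (row, col)

    if new_position in GHOSTS:
        return new_position, -10.0, pellets                   # caught by a ghost
    if new_position in pellets:
        return new_position, 1.0, pellets - {new_position}    # ate a pellet
    return new_position, 0.0, pellets

# Example: starting at (0, 0), moving right eats the pellet at (0, 1).
position, reward, remaining = step((0, 0), "right", PELLETS)
print(position, reward, len(remaining))  # (0, 1) 1.0 2
```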