Reinforcement Learning Concepts

15 flashcards

Environment

The external system the agent interacts with, which provides states and rewards as feedback.

State

A representation of the situation that the agent is in at a specific time. The state space includes all possible states.

Temporal Difference (TD) Learning

A method where the value function is updated incrementally based on the difference between consecutive predictions, without waiting for a final outcome.
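
As a minimal sketch of this idea, the one-step variant usually called TD(0) can be written as below. The function name `td0_update`, the step size `alpha`, the discount factor `gamma`, and the two-state example are illustrative assumptions, not part of the card.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one TD(0) update to `values` in place and return the TD error.

    alpha (step size) and gamma (discount factor) are assumed hyperparameters.
    """
    # Difference between the new one-step prediction and the old prediction.
    td_error = reward + gamma * values[next_state] - values[state]
    # Move the old estimate a fraction alpha toward the new prediction.
    values[state] += alpha * td_error
    return td_error


# Hypothetical two-state example: the update happens after a single
# transition, without waiting for the episode to end.
values = {"A": 0.0, "B": 1.0}
error = td0_update(values, "A", reward=0.5, next_state="B")
```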

Markov Decision Process (MDP)

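A mathematical framework for modeling sequential decision-making in which outcomes are partly random and partly under the control of a decision maker. It is specified by a set of states, a set of actions, transition probabilities, and rewards; the Markov property means the next state depends only on the current state and action.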
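
One possible way to make this concrete is to write a tiny MDP as a lookup table. The states, actions, probabilities, and rewards below are invented for illustration, and `step` is a hypothetical helper that samples one transition.

```python
import random

# transitions[state][action] -> list of (probability, next_state, reward).
# All values here are made up for the example.
transitions = {
    "cool": {
        "slow": [(1.0, "cool", 1.0)],
        "fast": [(0.5, "cool", 2.0), (0.5, "warm", 2.0)],
    },
    "warm": {
        "slow": [(0.5, "cool", 1.0), (0.5, "warm", 1.0)],
        "fast": [(1.0, "warm", -10.0)],
    },
}


def step(state, action, rng):
    """Sample (next_state, reward): the agent picks the action, chance picks the outcome."""
    outcomes = transitions[state][action]
    weights = [p for p, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


rng = random.Random(0)
next_state, reward = step("cool", "fast", rng)
```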

Exploration vs. Exploitation

The dilemma of choosing between exploring new actions to gain more information about the environment and exploiting actions already known to yield high reward.
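
A common, simple way to balance the two is an epsilon-greedy rule, sketched here. The card does not name this technique; `epsilon_greedy`, `epsilon`, and the sample estimates are assumptions for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Return an action index: random with probability epsilon, else the best-known one."""
    if rng.random() < epsilon:
        # Explore: try any action, including ones that currently look poor.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest current estimate.
    return max(range(len(q_values)), key=lambda a: q_values[a])


rng = random.Random(0)
estimates = [0.1, 0.9, 0.3]  # hypothetical reward estimates for three actions
greedy_action = epsilon_greedy(estimates, epsilon=0.0, rng=rng)
random_action = epsilon_greedy(estimates, epsilon=1.0, rng=rng)
```

With `epsilon=0.0` the rule always exploits, and with `epsilon=1.0` it always explores; values in between trade one off against the other.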

Action

A specific move or decision made by the agent that affects the state of the environment.

Reinforcement Learning

A type of machine learning where an agent learns to make decisions by performing actions and receiving rewards or penalties.

Bellman Equation

A recursive equation that expresses the value of a state under a policy as the expected immediate reward plus the discounted value of its successor states. It provides the basis for dynamic programming approaches.
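
In commonly used notation (an assumption here, since the card fixes no symbols), with policy $\pi$, transition probabilities $P$, reward function $R$, and discount factor $\gamma$, the Bellman expectation equation for the state-value function can be written as:

```latex
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a)
             \left[ R(s, a, s') + \gamma \, V^{\pi}(s') \right]
```

The value on the left is defined in terms of the values $V^{\pi}(s')$ of the states that can follow, which is what makes the definition recursive.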

Reward

The feedback from the environment that quantifies the success of an agent's actions.

Q-value (Action-Value Function)

A function that estimates the expected future reward of taking a particular action in a particular state and following the policy thereafter.
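
One way such estimates are learned in practice is tabular Q-learning, a specific algorithm the card does not mention. The sketch below shows a single update; `q_learning_update`, `alpha`, `gamma`, and the state and action names are illustrative assumptions.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q-value."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    # Move the stored estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)          # unseen (state, action) pairs start at 0.0
actions = ["left", "right"]     # hypothetical action set
q[("s1", "right")] = 2.0
new_value = q_learning_update(q, "s0", "right", reward=1.0,
                              next_state="s1", actions=actions)
```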

Deep Reinforcement Learning

Combines deep neural networks with reinforcement learning principles to learn policies directly from high-dimensional sensory inputs.

Agent

The entity that makes decisions and interacts with the environment to achieve certain goals.

Policy

A strategy, deterministic or stochastic, that the agent follows to choose its actions based on the current state.

Value Function

A function that estimates how good it is for the agent to be in a given state, in terms of expected future rewards.
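
"Expected future rewards" is often made precise as a discounted return, which a value function then averages over possible futures. The sketch below computes that return for one observed reward sequence; the helper name `discounted_return` and the discount factor `gamma` are assumptions not stated on the card.

```python
def discounted_return(rewards, gamma=0.9):
    """Return r_0 + gamma * r_1 + gamma**2 * r_2 + ... for one reward sequence."""
    g = 0.0
    # Work backwards so each step adds its reward to the discounted remainder.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical rewards collected after visiting some state.
g = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
```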

Monte Carlo Methods

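A class of computational algorithms that rely on repeated random sampling to obtain numerical results, including for problems that are deterministic in principle. In reinforcement learning, they estimate value functions by averaging the returns observed over complete sampled episodes.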
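
In reinforcement learning, repeated random sampling is typically applied to whole episodes: a state's value is estimated by averaging the returns of many sampled episodes. The sketch below assumes a made-up task in which every step pays a reward of 1 and the episode continues with probability 0.5; `sample_episode` and `mc_value_estimate` are hypothetical names.

```python
import random


def sample_episode(rng, p_continue=0.5):
    """Hypothetical task: each step pays 1.0, then continues with probability p_continue."""
    rewards = [1.0]
    while rng.random() < p_continue:
        rewards.append(1.0)
    return rewards


def mc_value_estimate(num_episodes, gamma=1.0, seed=0):
    """Estimate the start state's value as the average return over sampled episodes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_episodes):
        g = 0.0
        # Accumulate the (optionally discounted) return of one complete episode.
        for r in reversed(sample_episode(rng)):
            g = r + gamma * g
        total += g
    return total / num_episodes


estimate = mc_value_estimate(10_000)
```

For this toy task the true value is 2 (the expected number of steps), so the estimate should land close to that and tighten as `num_episodes` grows.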


© Hypatia.Tech. 2024 All rights reserved.