Reinforcement Learning Concepts
15 Flashcards
Environment
The external system the agent interacts with, which provides states and rewards as feedback.
State
A representation of the situation that the agent is in at a specific time. The state space includes all possible states.
Temporal Difference (TD) Learning
A method where the value function is updated incrementally based on the difference between consecutive predictions, without waiting for a final outcome.
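As a rough illustration of the incremental update described above, here is a minimal sketch of the tabular TD(0) rule. The function name `td0_update`, the dictionary-based value table, and the step size `alpha` and discount `gamma` are assumptions chosen for the example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one TD(0) update to values[state] in place and return the TD error."""
    # The target bootstraps from the current estimate of the next state.
    td_target = reward + gamma * values[next_state]
    # The TD error is the gap between the new target and the old prediction.
    td_error = td_target - values[state]
    # Move the old estimate a fraction alpha toward the target.
    values[state] += alpha * td_error
    return td_error


# Hypothetical two-state value table.
values = {"A": 0.0, "B": 1.0}
error = td0_update(values, "A", reward=0.5, next_state="B")
```

The update happens after a single transition, so no final outcome is needed.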
Markov Decision Process (MDP)
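A mathematical framework for modeling sequential decision-making in situations where outcomes are partly random and partly under the control of a decision maker. It is typically specified by a set of states, a set of actions, transition probabilities, rewards, and a discount factor.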
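One way to make this concrete is to write a very small MDP as a lookup table. The sketch below is illustrative only: the states ("cool", "hot"), actions ("slow", "fast"), probabilities, and rewards are all invented for the example.

```python
import random

# transitions[state][action] -> list of (probability, next_state, reward)
transitions = {
    "cool": {
        "slow": [(1.0, "cool", 1.0)],
        "fast": [(0.5, "cool", 2.0), (0.5, "hot", 2.0)],
    },
    "hot": {
        "slow": [(1.0, "cool", 1.0)],
        "fast": [(1.0, "hot", -10.0)],
    },
}


def step(state, action, rng):
    """Sample (next_state, reward) for the chosen action in the given state."""
    outcomes = transitions[state][action]
    weights = [p for p, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


rng = random.Random(0)
next_state, reward = step("cool", "fast", rng)
```

The decision maker controls the action; the random draw inside `step` supplies the part of the outcome that is left to chance.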
Exploration vs. Exploitation
The trade-off between trying new actions to learn more about the environment (exploration) and choosing the actions currently believed to be best in order to maximize reward (exploitation).
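A common, simple way to balance the two is the epsilon-greedy rule: with probability epsilon pick a random action, otherwise pick the action with the highest current estimate. The sketch below assumes the estimates are stored in a plain list indexed by action; the function name and arguments are illustrative.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Return an action index: random with probability epsilon, greedy otherwise."""
    if rng.random() < epsilon:
        # Explore: choose uniformly among all actions.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


rng = random.Random(0)
greedy_action = epsilon_greedy([0.1, 0.7, 0.3], epsilon=0.0, rng=rng)  # index 1
random_action = epsilon_greedy([0.1, 0.7, 0.3], epsilon=1.0, rng=rng)
```

Setting epsilon to 0 gives pure exploitation and setting it to 1 gives pure exploration; values in between mix the two.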
Action
A specific move or decision made by the agent that affects the state of the environment.
Reinforcement Learning
A type of machine learning where an agent learns to make decisions by performing actions and receiving rewards or penalties, with the goal of maximizing cumulative reward over time.
Bellman Equation
A recursive equation that expresses the value of a state under a policy as the expected immediate reward plus the discounted value of the successor states, providing the basis for dynamic programming approaches.
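For reference, one standard way to write this recursion is the Bellman expectation equation for a policy. The notation is an assumption, since the card defines no symbols: here \pi(a \mid s) is the policy, P(s' \mid s, a) the transition probability, R(s, a, s') the reward, and \gamma the discount factor.

```latex
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a)
             \left[ R(s, a, s') + \gamma \, V^{\pi}(s') \right]
```

The value on the left is defined in terms of the values of successor states on the right, which is what makes the definition recursive.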
Reward
The feedback from the environment that quantifies the success of an agent's actions.
Q-value (Action-Value Function)
A function that estimates the expected cumulative future reward of taking a particular action in a particular state and following the policy thereafter.
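The card does not say how such estimates are learned. One widely used option is tabular Q-learning, sketched below under assumed names: a `defaultdict` keyed by (state, action) pairs, a step size `alpha`, and a discount `gamma`.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one Q-learning update to q[(state, action)] and return the new value."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the old estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical table: unseen (state, action) pairs default to 0.0.
q = defaultdict(float)
q[("s1", "right")] = 2.0
new_q = q_learning_update(q, "s0", "right", reward=1.0,
                          next_state="s1", actions=["left", "right"])
```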
Deep Reinforcement Learning
Combines deep neural networks with reinforcement learning principles to learn policies directly from high-dimensional sensory inputs.
Agent
The entity that makes decisions and interacts with the environment to achieve certain goals.
Policy
A strategy followed by the agent to determine its actions based on the current state.
Value Function
A function that estimates how good it is for the agent to be in a given state, in terms of expected future rewards.
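"Expected future rewards" is usually made precise as the discounted return: the sum of later rewards, each weighted by a power of a discount factor. The sketch below computes that sum for one reward sequence; the function name and the `gamma` parameter are assumptions for illustration.

```python
def discounted_return(rewards, gamma=0.9):
    """Return rewards[0] + gamma * rewards[1] + gamma**2 * rewards[2] + ..."""
    g = 0.0
    # Work backwards so each step only needs the return that follows it.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


g = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25 = 1.75
```

A value function estimates the average of this quantity over the trajectories that can start from a given state.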
Monte Carlo Methods
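A class of computational algorithms that rely on repeated random sampling to obtain numerical results, including for problems that might be deterministic in principle. In reinforcement learning, they estimate value functions by averaging the returns observed over complete sampled episodes.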
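In reinforcement learning, the sampling idea usually takes the form of averaging the returns seen in complete episodes. The following is a minimal sketch of every-visit Monte Carlo value estimation, assuming each episode is a list of (state, reward) pairs; the episode data is invented for the example.

```python
from collections import defaultdict


def mc_value_estimate(episodes, gamma=1.0):
    """Every-visit Monte Carlo: average the return observed after each state visit."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for episode in episodes:
        g = 0.0
        # Walk backwards so g holds the return from each step onward.
        for state, reward in reversed(episode):
            g = reward + gamma * g
            totals[state] += g
            counts[state] += 1
    return {s: totals[s] / counts[s] for s in totals}


# Two hypothetical episodes that visit A and then B.
episodes = [[("A", 0.0), ("B", 1.0)], [("A", 0.0), ("B", 3.0)]]
values = mc_value_estimate(episodes)
```

Unlike temporal difference learning, this estimate can only be updated once an episode has finished, because it needs the full return.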
© Hypatia.Tech. 2024 All rights reserved.