Glossary term
Agentic Systems
In reinforcement learning, an algorithm that allows an agent to learn the optimal Q-function of a Markov decision process by applying the Bellman equation. The Markov decision process models an environment.
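As an illustration of the definition above, here is a minimal sketch of tabular Q-learning in Python. Everything specific in it is an assumption made for the example and does not come from the glossary source: the toy five-state chain environment, the hypothetical function names `step` and `q_learning`, and the hyperparameter values. Each update nudges Q(s, a) toward the Bellman target r + gamma * max over a' of Q(s', a').

```python
import random

# Toy environment (an assumption for this sketch): a chain of states 0..goal.
# Action 0 moves left, action 1 moves right; reaching the goal pays reward 1.
def step(state, action, goal):
    if action == 1:
        next_state = min(state + 1, goal)
    else:
        next_state = max(state - 1, 0)
    done = next_state == goal
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    goal = n_states - 1
    # Q-table: one row per state, one column per action, initialised to zero.
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap episode length
            # Epsilon-greedy behaviour policy with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, action, goal)
            # Bellman target: immediate reward plus the discounted best
            # estimated value of the next state (zero if terminal).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

if __name__ == "__main__":
    for s, row in enumerate(q_learning()):
        print(s, [round(v, 3) for v in row])
```

In this toy setting the learned values for moving right should approach 1, 0.9, 0.81, and 0.729 as the state gets further from the goal, which is the discounting in the Bellman equation showing through. Real applications usually replace the table with a function approximator.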
Created for this library
A logistics RL team uses Q-learning to train a dispatching policy that outperforms a hand-tuned heuristic in simulation.
A trading research team uses Q-learning to learn execution strategies in a market simulator before live trials.
An ad-bidding team uses Q-learning to learn the expected long-run revenue of each bid amount in real-time auctions.
Definition source: Google for Developers Machine Learning Glossary | Creative Commons Attribution 4.0 License