💡 Phase 2 — Value-Based Methods
Learn how to estimate the optimal action-value function Q*(s, a) and derive a policy from it.
Topics
- ε-Greedy Exploration
- SARSA (On-Policy TD Control)
- Q-Learning (Off-Policy TD Control)
- Experience Replay
- Function Approximation (Linear / Neural Networks)
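
As a rough illustration of the first three topics, the sketch below shows ε-greedy action selection next to the SARSA and Q-Learning update rules. It is not taken from the source folder. The function names, the list-of-lists Q-table layout, and the `alpha` / `gamma` parameter names are assumptions made for this example.

```python
import random

def epsilon_greedy(q_row, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so an all-zero row does not always yield action 0.
    return rng.choice([a for a, q in enumerate(q_row) if q == best])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma, done):
    """On-policy: bootstrap from the action actually chosen in s_next."""
    target = r if done else r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])

def q_learning_update(Q, s, a, r, s_next, alpha, gamma, done):
    """Off-policy: bootstrap from the greedy (max-value) action in s_next."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

# Hypothetical 2-state, 2-action table used only to contrast the two targets.
Q = [[0.0, 0.0], [1.0, 3.0]]
sarsa_update(Q, 0, 0, 1.0, 1, 0, alpha=0.5, gamma=0.9, done=False)
q_learning_update(Q, 0, 1, 1.0, 1, alpha=0.5, gamma=0.9, done=False)
```

The only difference between the two updates is the bootstrap term: SARSA uses the value of the next action the behaviour policy actually takes, while Q-Learning uses the maximum over next actions regardless of what is taken.
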
Mini Projects
- FrozenLake-v1 (Tabular Q-Learning)
- CartPole-v1 (Neural Q-Learning)
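
For a feel of what the tabular project involves, here is a minimal Q-Learning training loop. It is not FrozenLake-v1: to stay runnable without Gymnasium it swaps in a hypothetical 5-state corridor with the goal at the right end. The environment, the `step` and `train` helpers, and all hyperparameter values are assumptions chosen for illustration, not the project's actual code.

```python
import random

N_STATES = 5          # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right

def step(state, action):
    """Deterministic toy dynamics: reward 1.0 only when the goal is reached."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < 100:   # step cap keeps early episodes finite
            # ε-greedy behaviour policy with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(Q[state])
                action = rng.choice([a for a in range(2) if Q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-Learning target: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(Q[next_state])
            Q[state][action] += alpha * (target - Q[state][action])
            state = next_state
            steps += 1
    return Q

Q = train()
# Greedy action per non-terminal state (0 = left, 1 = right).
greedy_policy = [max(range(2), key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

On this toy problem the learned greedy policy should move right in every non-terminal state. Moving to FrozenLake-v1 would mainly mean replacing `step` with the real environment's reset/step calls and sizing the table to its state and action counts.
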
📁 Source folder: 02-Value-Based