Chapter 5: Monte Carlo Methods
June 8, 2026
Monte Carlo methods
-require only experience, not complete knowledge of the environment
-solve RL problems by averaging sampled returns
-apply only to episodic tasks, where every episode terminates and so has a well-defined return
-extend DP ideas like policy evaluation, policy improvement, and GPI using sample experience
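
A minimal sketch of "averaging sampled returns", assuming each episode is recorded as a plain list of rewards observed from the start state until termination. The function names and the discount parameter gamma are illustrative choices, not taken from these notes.

```python
def discounted_return(rewards, gamma=1.0):
    """Return G = r_1 + gamma*r_2 + gamma^2*r_3 + ... for one finished episode."""
    g = 0.0
    # Work backward so each step is a single multiply-add.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def average_return(episodes, gamma=1.0):
    """Estimate the start state's value as the mean of the sampled returns."""
    returns = [discounted_return(rewards, gamma) for rewards in episodes]
    return sum(returns) / len(returns)
```

No transition probabilities or reward model appear anywhere in the sketch: only sampled experience is used, and the average is only defined because every episode ends.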
Monte Carlo Prediction
first-visit MC - estimate v_π(s) as the average of the returns following the first visit to s in each episode
every-visit MC - estimate v_π(s) by averaging the returns following every visit to s
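
The two estimators differ only in which visits contribute a return. The sketch below is one hedged way to write both, assuming each episode is a list of (state, reward) pairs where reward is the reward received after leaving that state. The name mc_prediction and the first_visit flag are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def mc_prediction(episodes, gamma=1.0, first_visit=True):
    """Estimate state values by averaging sampled returns.

    episodes: list of episodes, each a list of (state, reward) pairs
              (assumed format: reward follows the visit to state).
    first_visit: True for first-visit MC, False for every-visit MC.
    """
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)

    for episode in episodes:
        # Compute the return following each time step, working backward.
        g = 0.0
        returns = [0.0] * len(episode)
        for t in range(len(episode) - 1, -1, -1):
            _, reward = episode[t]
            g = reward + gamma * g
            returns[t] = g

        seen = set()
        for t, (state, _) in enumerate(episode):
            # First-visit MC skips any later visit to a state already seen
            # in this episode; every-visit MC counts all of them.
            if first_visit and state in seen:
                continue
            seen.add(state)
            returns_sum[state] += returns[t]
            returns_count[state] += 1

    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}
```

For the single episode [("A", 1), ("B", 0), ("A", 2)] with gamma = 1, the returns following the three steps are 3, 2, and 2. First-visit MC therefore gives A the value 3, while every-visit MC averages both visits and gives A the value 2.5. Both give B the value 2.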