Back

03/2025

Reinforcement Learning – FrozenLake AI Agent

A comparison study of tabular Reinforcement Learning algorithms implemented on OpenAI Gym environments.

Benchmarks Q-Learning against classic planning baselines (value/policy iteration) with convergence and reward analysis.

Reinforcement Learning – FrozenLake AI Agent screenshot

Project Details

Technologies

Python OpenAI Gym Reinforcement Learning Q-Learning

The Problem

Understanding the convergence properties of different RL algorithms is fundamental to AI research. The challenge was to demonstrate the efficiency gap between value-based and policy-based approaches in stochastic environments.

The Solution

Implemented and compared Q-Learning, Value Iteration, and Policy Iteration on the FrozenLake-v1 environment. I visualized the Q-table heatmaps and reward convergence graphs to analyze how the agent handles "slippery" state transitions.

Architecture & Implementation

  • Environment: OpenAI Gym (Gymnasium).
  • Algorithms: Manual implementation of Bellman equations and temporal difference learning.
  • Visualization: Matplotlib for plotting reward curves and state-value heatmaps.

Results & Impact

The Value Iteration approach achieved 100% success on the non-slippery map, while the Q-learning agent demonstrated robust adaptation to the stochastic transition model of the slippery variant.