Implementation of Q-learning and SARSA algorithms in the Cliff Walking environment. Explore and compare reinforcement learning techniques.

Mahmood-Anaam/reinforcement-learning-cliff-walking

Reinforcement Learning: Cliff Walking

Overview

This repository implements two fundamental reinforcement learning algorithms, Q-learning and SARSA, in the Cliff Walking environment. The project explores how each algorithm learns to navigate the gridworld, avoid the cliff, and reach the goal while minimizing accumulated penalties.

Cliff Walking Environment
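
For orientation, here is a minimal sketch of the gridworld. It assumes the standard textbook layout (a 4x12 grid, start at the bottom-left, goal at the bottom-right, cliff cells between them, a reward of -1 per step, and -100 plus a reset to the start for stepping into the cliff). The names `step`, `MOVES`, and `CLIFF` are hypothetical and chosen for illustration; they are not taken from this repository's code.

```python
# Hypothetical minimal Cliff Walking gridworld (standard 4x12 layout assumed).
ROWS, COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
# Assumed cliff cells: the bottom row between start and goal.
CLIFF = {(3, c) for c in range(1, 11)}
# Assumed action encoding: 0=up, 1=right, 2=down, 3=left.
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}


def step(state, action):
    """Return (next_state, reward, done) for one move from `state`."""
    dr, dc = MOVES[action]
    # Moves that would leave the grid keep the agent at the boundary.
    row = min(max(state[0] + dr, 0), ROWS - 1)
    col = min(max(state[1] + dc, 0), COLS - 1)
    nxt = (row, col)
    if nxt in CLIFF:
        # Falling off the cliff: large penalty, agent is sent back to start.
        return START, -100, False
    # Every other move costs -1; the episode ends at the goal.
    return nxt, -1, nxt == GOAL
```

Under these assumptions, the shortest route runs along the cliff edge, while a longer route through the upper rows avoids the risk of the -100 penalty.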

Key Concepts

  • Parts:

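    • Q-learning: An off-policy algorithm that learns optimal action values by bootstrapping from the highest-valued action in the next state, regardless of which action the agent actually takes next.
    • SARSA: An on-policy algorithm that updates its action-value estimates using the action the agent actually takes next, so the cost of exploratory moves is reflected in the learned values. This tends to produce safer but less aggressive strategies.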
    • Comparison: A detailed comparison of the paths chosen by each algorithm, highlighting differences in exploration and exploitation behaviors.
  • Tasks:

    • Implement and evaluate the Q-learning algorithm.
    • Implement and evaluate the SARSA algorithm.
    • Compare and analyze the optimal policies derived from both algorithms.
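
The difference between the two algorithms described above comes down to one term in the update target. The sketch below is illustrative only and is not the repository's implementation: the function names, the list-of-lists Q-table, and the default values `alpha=0.5` and `gamma=1.0` are all assumptions.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def q_learning_update(Q, s, a, r, s_next, done, alpha=0.5, gamma=1.0):
    """Off-policy: bootstrap from the best action available in s_next."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s_next, a_next, done, alpha=0.5, gamma=1.0):
    """On-policy: bootstrap from the action a_next actually chosen in s_next."""
    target = r if done else r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


# Tiny worked example on a hypothetical two-state, two-action table.
Q_ql = [[0.0, 0.0], [1.0, 3.0]]
Q_sa = [[0.0, 0.0], [1.0, 3.0]]
q_learning_update(Q_ql, s=0, a=0, r=-1, s_next=1, done=False)
sarsa_update(Q_sa, s=0, a=0, r=-1, s_next=1, a_next=0, done=False)
```

In the worked example, Q-learning uses the larger next-state value (3.0) even though the agent's next action is 0, while SARSA uses the value of that action (1.0). With an epsilon-greedy behavior policy, this is why SARSA's values near the cliff account for occasional exploratory falls and Q-learning's do not.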
