This repository contains the implementation of my reinforcement learning final project for COMP579 at McGill University during the 2025 Winter Term. The project investigates the impact of temporal information on reinforcement learning algorithms by comparing how PPO, DQN, and A2C leverage stacked frames as input. The study focuses on the Breakout environment, analyzing training rewards, sample efficiency, and final performance across different configurations. Below is a description of the files and instructions on how to set up and run the project using Poetry.
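The frame-stacking mechanism at the heart of the study can be illustrated with a minimal standalone sketch (this is not the project's actual code — in practice an environment wrapper such as Gymnasium's frame-stack wrapper plays this role):

```python
from collections import deque

import numpy as np


class FrameStack:
    """Keep the k most recent frames and expose them as one observation,
    giving the agent short-term temporal information (e.g. the ball's
    velocity in Breakout, which a single frame cannot convey)."""

    def __init__(self, k):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, first_frame):
        # Fill the buffer with copies of the first frame so the
        # observation shape is constant from step 0.
        for _ in range(self.k):
            self.frames.append(first_frame)
        return self.observation()

    def step(self, frame):
        self.frames.append(frame)
        return self.observation()

    def observation(self):
        # Shape: (k, H, W) — channel-first stack of the last k frames.
        return np.stack(self.frames, axis=0)
```

Varying `k` (e.g. 1 vs. 4 stacked frames) is what lets the experiments isolate how much each algorithm benefits from temporal information.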
- `train_eval.py`: Contains the training and evaluation loop, with all customizable parameters defined inside.
- `callback.py`: Implements a custom callback function for logging during training.
- `visualization.py`: Provides functions to plot graphs and visualize experiment results.
- `custom_logger.py`: A logger utility that prints messages both to the console and to a file.
- `demo.py`: A demonstration script with human rendering of the agent's behavior.
This project uses Poetry for dependency management. Follow the steps below to set up the environment:
- Clone the repository:

  ```
  git clone git@github.com:Niamorine/RL_COMP579.git
  cd RL_COMP579
  ```

- Install Poetry if it is not already installed:

  ```
  pip install poetry
  ```

- Install the project dependencies:

  ```
  poetry install
  ```
You can change the training and evaluation parameters at the bottom of `train_eval.py`.
To run the training and evaluation loop:

```
poetry run python train_eval.py
```

To generate plots of experiment results:

```
poetry run python visualization.py
```

To run the demo with human rendering of the agent (first update the model class and model path in the script to match yours):

```
poetry run python demo.py
```

Logs are managed by the `custom_logger.py` utility, which writes messages to both the console and a log file for easier debugging and tracking.
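A console-plus-file logger like the one `custom_logger.py` provides can be sketched with the standard `logging` module (function name and log format here are assumptions; the repo's implementation may differ):

```python
import logging
import sys


def make_logger(name="rl_project", logfile="train.log"):
    """Create a logger that mirrors every message to stdout and to a file,
    so training progress survives a closed terminal session."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called twice

    fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    for handler in (logging.StreamHandler(sys.stdout),
                    logging.FileHandler(logfile)):
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger
```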
You can customize training parameters directly in `train_eval.py` and modify logging behavior in `callback.py` and `custom_logger.py`.
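The logging callback in `callback.py` likely follows the Stable-Baselines3 `BaseCallback` pattern; a dependency-free sketch of that pattern looks like this (class and method names are assumptions, not the project's actual code):

```python
class RewardLoggingCallback:
    """Track episode returns during training and print a running
    average every `log_every` environment steps."""

    def __init__(self, log_every=1000):
        self.log_every = log_every
        self.num_steps = 0
        self.episode_rewards = []
        self._current_return = 0.0

    def on_step(self, reward, done):
        """Called once per environment step with the latest transition."""
        self.num_steps += 1
        self._current_return += reward
        if done:
            self.episode_rewards.append(self._current_return)
            self._current_return = 0.0
        if self.num_steps % self.log_every == 0 and self.episode_rewards:
            recent = self.episode_rewards[-10:]
            print(f"step={self.num_steps} "
                  f"mean_return(last10)={sum(recent) / len(recent):.1f}")
        return True  # returning False stops training in the SB3 convention
```

In Stable-Baselines3 itself you would subclass `BaseCallback` and override `_on_step`, reading transitions from `self.locals` instead of receiving them as arguments.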