This repo empirically investigates whether Weight Agnostic Neural Networks (WANNs) are a promising alternative to traditional RL models in sparse-reward settings.


Exploring Sparse Reward Environments with Weight Agnostic Neural Networks


πŸš€ Overview

Sparse reward environments present a significant challenge in reinforcement learning (RL), as agents receive little to no feedback for extended periods, making effective learning difficult. Traditional RL algorithms struggle in these settings without human-engineered feedback to guide training.

Why Sparse Rewards Matter

Many real-world RL applications provide only sparse rewards, requiring the algorithm to find a "needle in a haystack" solution. Challenges include:

  • Delayed Feedback: Agents receive rewards only upon completing a task, making credit assignment difficult (see the short illustration after this list).
  • Exploration Difficulty: Standard RL approaches struggle to find rare trajectories that lead to rewards without guidance.
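
To make the delayed-feedback point concrete, consider how much of a success-only terminal reward survives discounting back to the first action. The numbers below are illustrative assumptions, not values used in this repo:

```python
# Illustrative: fraction of a sparse terminal reward credited to early actions.
gamma = 0.99       # assumed discount factor (a common RL default)
T = 200            # assumed episode length (MountainCar's usual step limit)
print(gamma ** T)  # ~0.134 -> the first action sees only ~13% of the reward
```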

*Figure: sparse optimization landscape.*

Our Approach: WANNs for Sparse RL

We explore a novel direction using Weight Agnostic Neural Networks (WANNs), which rely on evolutionary search to discover network architectures that solve a task without training the connection weights at all.
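
For intuition, WANN search scores a candidate architecture by rolling it out with a single shared weight swept over a few values (e.g. ±0.5, ±1, ±2) and averaging the returns, so good fitness cannot come from weight tuning. Below is a minimal sketch of that evaluation loop, assuming a Gymnasium-style environment and a hypothetical `policy_fn(obs, w)`; the repo's actual implementation lives in `wann_src/`:

```python
import numpy as np

def weight_agnostic_fitness(policy_fn, env, weights=(-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)):
    """Mean episode return of one architecture across shared weight values."""
    returns = []
    for w in weights:
        obs, _ = env.reset()
        done, total = False, 0.0
        while not done:
            action = policy_fn(obs, w)  # every connection uses the same weight w
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
        returns.append(total)
    return float(np.mean(returns))  # weight-agnostic fitness of the architecture
```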

We evaluate WANNs on modified versions of the MountainCar and LunarLander environments in which a reward is given only upon successful task completion. Our results show that WANNs learn compact, interpretable policies in these settings, whereas conventional RL methods fail without reward shaping.
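
The sparsification itself can be pictured as a thin wrapper that withholds all shaped, per-step feedback. Here is a minimal sketch assuming the Gymnasium API; the repo's actual task definitions live in `domain/`:

```python
import gymnasium as gym

class SparseRewardWrapper(gym.Wrapper):
    """Replace the per-step reward with a success-only signal."""
    def step(self, action):
        obs, _, terminated, truncated, info = self.env.step(action)
        # Reward 1.0 only when the episode ends in success, 0.0 otherwise.
        # (MountainCar-v0 terminates exactly when the car reaches the flag.)
        reward = 1.0 if terminated else 0.0
        return obs, reward, terminated, truncated, info

env = SparseRewardWrapper(gym.make("MountainCar-v0"))
```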


πŸ“Š Results

Discovered WANN Network and Policy

Example solution: the best WANN model for the discrete Sparse Mountain Car (SMC) task learns an effective and interpretable policy:

*Figure: discovered WANN network and policy visualization for the MountainCar task.*

Performance Comparison

| Method     | SMC Discrete | SMC Continuous | Lunar Lander |
| ---------- | ------------ | -------------- | ------------ |
| WANN       | 123.92       | 136.73         | 1135.37      |
| Q-Learning | ∞ (110.53)   | ∞ (∞)          | ∞            |
| PPO        | ∞ (133.27)   | ∞ (224.25)     | ∞            |
| DQN        | ∞ (322.79)   | ✗              | ∞            |

Average time steps to reach the goal. ∞ denotes failure to reach the goal; values in parentheses indicate performance with reward shaping applied.


πŸ“Œ Key Takeaways

βœ… WANNs succeed where standard RL fails in sparse environments.
βœ… No reward shaping required, reducing manual effort.
βœ… Compact, interpretable networks discovered via evolutionary search.


πŸ“‚ Directory Structure

```
.
├── wann_train.py        # Evolutionary search for networks solving the task
├── wann_test.py         # Evaluation and visualization of trained WANNs
├── visualizer.py        # Network structure and policy visualization tool
├── pareto_front.py      # Displays the Pareto front (fitness vs. complexity)
├── Sparse-RL-WANN.pdf   # Project report summary
│
├── wann_src/            # Helper functions for WANN EA process
├── p/                   # JSON config files for experiment parameters
├── domain/              # Task environments
├── champions/           # Best evolved models stored here
├── RL/                  # PPO, DQN, and Q-Learning implementations
│
└── requirements.txt     # Required dependencies
```

πŸ›  Installation

Clone this repository and install dependencies:

```bash
git clone https://github.com/Tobi-Tob/Sparse-RL-Wann.git
cd Sparse-RL-Wann
pip install -r requirements.txt
```

πŸƒ Running Experiments

1️⃣ Train a WANN

Run evolutionary search to discover a weight-agnostic network for a given environment; experiment parameters are configured via the JSON files in `p/`:

```bash
python wann_train.py
```

2️⃣ Test and Visualize WANN Policies

Evaluate a trained WANN model on an environment:

```bash
python wann_test.py
```

Visualize the discovered network structure and policy decisions:

```bash
python visualizer.py
```

This project builds on the original WANN framework from the Google Brain Tokyo Workshop.