This repo empirically investigates whether Weight Agnostic Neural Networks (WANNs) are a promising alternative to traditional RL models in sparse-reward settings.


Exploring Sparse Reward Environments with Weight Agnostic Neural Networks


πŸš€ Overview

Sparse reward environments present a significant challenge in reinforcement learning (RL), as agents receive little to no feedback for extended periods, making effective learning difficult. Traditional RL algorithms struggle in these settings without human-engineered feedback to guide training.

Why Sparse Rewards Matter

Many real-world RL applications provide only sparse rewards, requiring the algorithm to find a "needle in a haystack" solution. Challenges include:

  • Delayed Feedback: Agents receive rewards only upon completing a task, making credit assignment difficult (see the short illustration after this list).
  • Exploration Difficulty: Standard RL approaches struggle to find rare trajectories that lead to rewards without guidance.
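
To make the delayed-feedback point concrete, consider how much of a success-only terminal reward survives discounting back to the first action. The numbers below are illustrative assumptions, not values used in this repo:

```python
# Illustrative: fraction of a sparse terminal reward credited to early actions.
gamma = 0.99       # assumed discount factor (a common RL default)
T = 200            # assumed episode length (MountainCar's usual step limit)
print(gamma ** T)  # ~0.134 -> the first action sees only ~13% of the reward
```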

*Figure: sparse optimization landscape.*

Our Approach: WANNs for Sparse RL

We explore a novel direction using Weight Agnostic Neural Networks (WANNs), which rely on evolutionary search to discover network architectures that solve a task without training the connection weights at all.
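
For intuition, WANN search scores a candidate architecture by rolling it out with a single shared weight swept over a few values (e.g. ±0.5, ±1, ±2) and averaging the returns, so good fitness cannot come from weight tuning. Below is a minimal sketch of that evaluation loop, assuming a Gymnasium-style environment and a hypothetical `policy_fn(obs, w)`; the repo's actual implementation lives in `wann_src/`:

```python
import numpy as np

def weight_agnostic_fitness(policy_fn, env, weights=(-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)):
    """Mean episode return of one architecture across shared weight values."""
    returns = []
    for w in weights:
        obs, _ = env.reset()
        done, total = False, 0.0
        while not done:
            action = policy_fn(obs, w)  # every connection uses the same weight w
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
        returns.append(total)
    return float(np.mean(returns))  # weight-agnostic fitness of the architecture
```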

We evaluate WANNs on modified versions of the MountainCar and LunarLander environments in which a reward is given only upon successful task completion. Our results show that WANNs learn compact, interpretable policies in these settings, whereas conventional RL methods fail without reward shaping.
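
The sparsification itself can be pictured as a thin wrapper that withholds all shaped, per-step feedback. Here is a minimal sketch assuming the Gymnasium API; the repo's actual task definitions live in `domain/`:

```python
import gymnasium as gym

class SparseRewardWrapper(gym.Wrapper):
    """Replace the per-step reward with a success-only signal."""
    def step(self, action):
        obs, _, terminated, truncated, info = self.env.step(action)
        # Reward 1.0 only when the episode ends in success, 0.0 otherwise.
        # (MountainCar-v0 terminates exactly when the car reaches the flag.)
        reward = 1.0 if terminated else 0.0
        return obs, reward, terminated, truncated, info

env = SparseRewardWrapper(gym.make("MountainCar-v0"))
```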


πŸ“Š Results

Discovered WANN Network and Policy

Example solution: the best WANN model for the discrete Sparse Mountain Car (SMC) task learns an effective and interpretable policy:

*Figure: discovered WANN network and policy visualization for the MountainCar task.*

Performance Comparison

| Method     | SMC Discrete | SMC Continuous | Lunar Lander |
| ---------- | ------------ | -------------- | ------------ |
| WANN       | 123.92       | 136.73         | 1135.37      |
| Q-Learning | ∞ (110.53)   | ∞ (∞)          | ∞            |
| PPO        | ∞ (133.27)   | ∞ (224.25)     | ∞            |
| DQN        | ∞ (322.79)   | ✗              | ∞            |

Average time steps to reach the goal. ∞ denotes failure to reach the goal; values in parentheses indicate performance with reward shaping applied.


πŸ“Œ Key Takeaways

βœ… WANNs succeed where standard RL fails in sparse environments.
βœ… No reward shaping required, reducing manual effort.
βœ… Compact, interpretable networks discovered via evolutionary search.


πŸ“‚ Directory Structure

```
.
├── wann_train.py        # Evolutionary search for networks solving the task
├── wann_test.py         # Evaluation and visualization of trained WANNs
├── visualizer.py        # Network structure and policy visualization tool
├── pareto_front.py      # Displays the Pareto front (fitness vs. complexity)
├── Sparse-RL-WANN.pdf   # Project report summary
│
├── wann_src/            # Helper functions for WANN EA process
├── p/                   # JSON config files for experiment parameters
├── domain/              # Task environments
├── champions/           # Best evolved models stored here
├── RL/                  # PPO, DQN, and Q-Learning implementations
│
└── requirements.txt     # Required dependencies
```

πŸ›  Installation

Clone this repository and install dependencies:

```bash
git clone https://github.com/Tobi-Tob/Sparse-RL-Wann.git
cd Sparse-RL-Wann
pip install -r requirements.txt
```

πŸƒ Running Experiments

1️⃣ Train a WANN

Run evolutionary search to discover a weight-agnostic network for a given environment; experiment parameters are configured via the JSON files in `p/`:

```bash
python wann_train.py
```

2️⃣ Test and Visualize WANN Policies

Evaluate a trained WANN model on an environment:

```bash
python wann_test.py
```

Visualize the discovered network structure and policy decisions:

```bash
python visualizer.py
```

This project builds on the original WANN framework from the Google Brain Tokyo Workshop.