A tiny RL playground with minimal, hackable implementations of common RL algorithms.
algos/
: custom implementations of common RL algorithms
envs/
: custom gym environments
python tinygym.py --algo [algo] --task [task] --max_evals [default=1000] --save [True]
Test on sample tasks: python unit_test.py
- reinforce (~35 lines of code)
- vpg (~50 lines)
- cma
- ppo (based on SB3)
- dqn
- sac
Converges on basic control tasks in <1K episodes (CMA takes longer, ~10K).
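
To give a feel for how small a REINFORCE implementation can be, here is a minimal, stdlib-only sketch. It is not the code in algos/: the toy corridor environment, the function names (`softmax`, `run_episode`, `reinforce`) and all hyperparameters are assumptions chosen for illustration. The agent starts at the left end of a short corridor, pays a reward of -1 per step, and the episode ends when it reaches the rightmost state.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(theta, rng, n_states, max_steps):
    """Roll out one episode in a toy corridor; actions: 0 = left, 1 = right."""
    state, trajectory = 0, []
    for _ in range(max_steps):
        probs = softmax(theta[state])
        action = 1 if rng.random() < probs[1] else 0
        trajectory.append((state, action, -1.0))  # hypothetical reward: -1 per step
        state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        if state == n_states - 1:  # rightmost state is the goal
            break
    return trajectory


def reinforce(n_states=5, episodes=5000, lr=0.01, gamma=1.0, max_steps=30, seed=0):
    """Train a tabular softmax policy with the plain REINFORCE update (no baseline)."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(n_states)]  # one logit per (state, action)
    for _ in range(episodes):
        trajectory = run_episode(theta, rng, n_states, max_steps)
        ret = 0.0
        # Walk backwards so the return G_t accumulates in a single pass.
        for state, action, reward in reversed(trajectory):
            ret = reward + gamma * ret
            probs = softmax(theta[state])
            for a in range(2):
                # Gradient of log pi(action | state) with respect to logit a.
                grad = (1.0 if a == action else 0.0) - probs[a]
                theta[state][a] += lr * ret * grad
    return theta
```

Calling `theta = reinforce()` and then `[softmax(row)[1] for row in theta[:-1]]` gives the learned probability of moving right in each non-goal state. Because there is no baseline, the updates are noisy; a small learning rate and a few thousand episodes keep this sketch stable.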