- Replaced the MDN-RNN with an LSTM for Memory
- Replaced CMA-ES with A3C for the Controller
- Trained in two stages
  - Stage 1: V and M were trained on a dataset collected with random rollouts
  - Stage 2: V and M were trained on a dataset collected with A3C rollouts
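
As a rough illustration of the two-stage schedule, the sketch below lists the scripts each stage would run, in order. The script names are taken from the commands in this README; the helper `stage_commands` is hypothetical and not part of the repository, and it assumes `extra = True` in `hparams.py` is what switches training to the A3C rollout dataset.

```python
# Hypothetical helper: the script names come from the commands in this README,
# but the function itself and the stage grouping are illustrative assumptions.
def stage_commands(stage):
    """Return the ordered script invocations for a training stage (1 or 2)."""
    train = ["python train-vae.py", "python train-rnn.py", "python train-a3c.py"]
    if stage == 1:
        # Stage 1: collect data with a random policy, then train V, M and C.
        return ["python rollout.py"] + train
    if stage == 2:
        # Stage 2: collect data with the trained A3C policy, then train again.
        # Assumes `extra = True` has been set in hparams.py beforehand.
        return ["python rollout-a3c.py"] + train
    raise ValueError("stage must be 1 or 2")


# Print the schedule for both stages.
for stage in (1, 2):
    print(f"Stage {stage}:")
    for cmd in stage_commands(stage):
        print("  " + cmd)
```

The only difference between the stages in this sketch is the rollout script that produces the dataset; the three training scripts run in the same order both times.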
Result with the dataset collected using random rollouts
Result with the dataset collected using rollouts from the pretrained model
Play Demo
apt-get update
apt-get install swig
pip install gym[box2d]
python rollout.py
python train-vae.py
python train-rnn.py
python train-a3c.py
python rollout-a3c.py
vi hparams.py
# set extra = True in hparams.py, then save
python train-vae.py
python train-rnn.py
python train-a3c.py
# <# of plays> <seed> <is_record>
python test.py 2 999 False
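
For readers unsure how the three positional arguments map to values, here is a minimal sketch of one way such arguments could be parsed. It is not the actual code in `test.py`; the function names `parse_bool` and `parse_test_args` are hypothetical. The point it illustrates is that a flag passed as the string `False` must be converted explicitly, since `bool("False")` evaluates to `True` in Python.

```python
# Hypothetical parsing sketch; test.py may handle its arguments differently.
def parse_bool(text):
    """Convert 'True'/'False' style strings to a bool."""
    value = text.strip().lower()
    if value in ("true", "1", "yes"):
        return True
    if value in ("false", "0", "no"):
        return False
    raise ValueError(f"not a boolean: {text!r}")


def parse_test_args(argv):
    """Parse [<# of plays>, <seed>, <is_record>] into typed values."""
    if len(argv) != 3:
        raise ValueError("usage: test.py <# of plays> <seed> <is_record>")
    n_plays = int(argv[0])
    seed = int(argv[1])
    is_record = parse_bool(argv[2])
    return n_plays, seed, is_record


# Same arguments as the command above.
print(parse_test_args(["2", "999", "False"]))  # prints (2, 999, False)
```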