World Models A3C

Implementation of a variant of World Models

Note

Replaced MDN-RNN with an LSTM for Memory (M)
Replaced CMA-ES with A3C for Controller (C)
Trained in two stages
- Stage 1: V and M were trained on a dataset collected with random rollouts
- Stage 2: V and M were trained on a dataset collected with A3C rollouts

Training Result

Result with the dataset collected using random rollouts

Result with the dataset collected using rollouts of the pretrained model

Play Demo

Environment Setting

apt-get update
apt-get install swig
pip install 'gym[box2d]'

Training Stage I

Dataset Generation using Rollout with random policy

python rollout.py
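
For orientation, random-policy data collection can be sketched as below. This is a minimal sketch and not the contents of rollout.py: the DummyEnv class, the function names, the older 4-tuple step API, and the three-component action layout (steering in [-1, 1], gas and brake in [0, 1], as in Gym's CarRacing) are all assumptions made for illustration.

```python
import random


class DummyEnv:
    """Hypothetical stand-in for a Gym-style environment (old 4-tuple step API)."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]  # placeholder observation

    def step(self, action):
        self.t += 1
        obs = [float(self.t)]
        reward = 0.0
        done = self.t >= self.episode_length
        return obs, reward, done, {}


def sample_random_action(rng):
    """Sample (steer, gas, brake); the ranges are assumed, not taken from the repo."""
    return [rng.uniform(-1.0, 1.0), rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)]


def collect_random_rollout(env, max_steps, seed=0):
    """Run one episode with a random policy and record (obs, action, reward, done)."""
    rng = random.Random(seed)
    obs = env.reset()
    transitions = []
    for _ in range(max_steps):
        action = sample_random_action(rng)
        next_obs, reward, done, _info = env.step(action)
        transitions.append((obs, action, reward, done))
        obs = next_obs
        if done:
            break
    return transitions


rollout = collect_random_rollout(DummyEnv(episode_length=5), max_steps=100)
print(len(rollout))
```

Any object exposing reset() and step() in this shape could be passed in place of DummyEnv; the recorded transitions are what V and M would later be trained on.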

Vision model with VAE

python train-vae.py
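
A VAE is typically trained on a reconstruction term plus a KL term that pulls the encoder's diagonal Gaussian toward a standard normal. The sketch below computes both in plain Python to make the objective concrete. The function names are hypothetical, squared error is an assumed choice of reconstruction loss, and train-vae.py may weight or compute these terms differently.

```python
import math
import random


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (the reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def vae_loss(x, x_hat, mu, logvar):
    """Squared-error reconstruction term plus the KL term (unweighted sum, assumed)."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + kl_to_standard_normal(mu, logvar)


rng = random.Random(0)
z = reparameterize([0.0, 1.0], [0.0, 0.0], rng)
loss = vae_loss([1.0, 0.0], [1.0, 0.0], [1.0], [0.0])
print(len(z), loss)
```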

Memory model with LSTM-RNN

python train-rnn.py
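
In World Models, M is trained to predict the next latent vector from the current latent and action. With the MDN head replaced by a plain LSTM, one plausible setup is to regress the next latent directly; that is an assumption, not something this README confirms. The sketch below shows only how the input and target sequences could be built from one rollout, using a hypothetical helper name.

```python
def make_sequence_pairs(latents, actions):
    """Build inputs [z_t, a_t] and targets z_{t+1} from one rollout.

    latents: list of T latent vectors (lists of floats)
    actions: list of T action vectors (lists of floats)
    The final step has no successor latent, so it yields no training pair.
    """
    if len(latents) != len(actions):
        raise ValueError("latents and actions must have the same length")
    inputs = [list(z) + list(a) for z, a in zip(latents[:-1], actions[:-1])]
    targets = [list(z) for z in latents[1:]]
    return inputs, targets


latents = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
actions = [[1.0], [0.0], [-1.0]]
inputs, targets = make_sequence_pairs(latents, actions)
print(inputs)
print(targets)
```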

Controller with A3C

python train-a3c.py
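
Each A3C worker turns a short rollout into n-step returns bootstrapped from the critic's value estimate, then uses return minus value as the advantage that weights the policy gradient. The sketch below shows that calculation in isolation; the function names and the default discount factor are illustrative assumptions, not values taken from train-a3c.py.

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """n-step returns R_t = r_t + gamma * R_{t+1}, seeded with the critic's
    value for the state after the last reward (0.0 if the episode ended)."""
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage estimate A_t = R_t - V(s_t)."""
    return [g - v for g, v in zip(returns, values)]


rets = discounted_returns([1.0, 1.0, 1.0], bootstrap_value=0.0, gamma=0.5)
advs = advantages(rets, [1.0, 1.0, 1.0])
print(rets, advs)
```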

Training Stage II

Rollout with the pretrained model

python rollout-a3c.py

Fine-tuning V and M with new dataset

# Open hparams.py and set:
vi hparams.py
    extra = True

python train-vae.py
python train-rnn.py
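
How the extra flag is consumed is not shown here. As a rough sketch, a flag like this could switch the training scripts to include the Stage II rollouts. The HParams class, the directory names, and the choice to add the new data to the random-rollout data rather than replace it are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HParams:
    data_dir: str = "data/random"  # hypothetical path for Stage I rollouts
    extra_dir: str = "data/a3c"    # hypothetical path for Stage II rollouts
    extra: bool = False            # mirrors the flag set in hparams.py


def dataset_dirs(hp):
    """Return the rollout directories to train on for the given settings."""
    dirs = [hp.data_dir]
    if hp.extra:
        dirs.append(hp.extra_dir)
    return dirs


print(dataset_dirs(HParams()))
print(dataset_dirs(HParams(extra=True)))
```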

Train new C with the improved V and M

python train-a3c.py

Test

# Arguments: <number of plays> <seed> <is_record>
python test.py 2 999 False
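
One pitfall with positional arguments like these is that bool("False") is True in Python, so a textual flag needs explicit parsing. The sketch below shows one way to read the three arguments safely; the helper names are hypothetical, and test.py may handle them differently.

```python
def parse_bool(text):
    """Parse a textual boolean; plain bool("False") would wrongly give True."""
    lowered = text.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise ValueError(f"expected a boolean, got {text!r}")


def parse_test_args(argv):
    """Parse [<number of plays>, <seed>, <is_record>] into typed values."""
    if len(argv) != 3:
        raise SystemExit("usage: python test.py <number of plays> <seed> <is_record>")
    n_plays, seed, is_record = argv
    return int(n_plays), int(seed), parse_bool(is_record)


parsed = parse_test_args(["2", "999", "False"])
print(parsed)
```

In a script, the list passed in would be sys.argv[1:].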

Reference

https://arxiv.org/abs/1803.10122
https://github.com/ctallec/world-models
