
Reinforcement Learning for Real-World Autonomous Scenarios

To run our code, first install and activate the provided conda environment with

conda env create -f environment.yaml
conda activate rlautonomous

Note that this environment was created and tested on Ubuntu. If you are on Windows, create a new environment with

conda create -n rlautonomous python=3.6
conda activate rlautonomous

and install the required dependencies listed in environment.yaml. You also need to download the environment and the demonstration data of our field study from this link. There you can also find our Unity3D scenario and the configuration files for running our online RL experiments as mentioned in the paper, as well as a UnityHub binary for installing Unity3D on Linux. For running our online RL variants, we refer to the Unity ML-Agents package. Note that you will need to change the path to the demonstration data within the configuration files; a hypothetical example of such an entry is sketched below.
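For orientation, ML-Agents trainer configurations typically reference demonstration data through a demo_path entry. The excerpt below is only a hypothetical sketch: the exact keys and nesting depend on the ML-Agents release you use, and the behavior name and file path are placeholders, not values from our configuration files.

# Hypothetical ML-Agents trainer config excerpt; adjust demo_path to point
# at your local copy of the downloaded demonstration data.
behaviors:
  AgentBehavior:
    trainer_type: ppo
    behavioral_cloning:
      demo_path: /path/to/demonstrations/field_study.demo
      strength: 0.5

Training is then typically launched with mlagents-learn <CONFIG>.yaml --run-id=<RUN_ID>; see the ML-Agents documentation for the version you install.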

After installing and activating the environment, you can reproduce our results for Behavioral Cloning with the following commands

python BC.py dqn_cloning --n-steps 15000 --n-runs 5
python BC.py drqn_cloning --n-steps 15000 --n-runs 5

The first command runs behavioral cloning and the second its recurrent version. Results are written to a newly created results folder in the current directory, which contains a checkpoint of the model and TensorBoard summaries with all the metrics. Further, you can look at the visualizations of the model we prepared, including a sample trajectory, the model's learned action distributions for this trajectory, and a t-SNE embedding of the representation the model has learned, by running

python visualize_model.py <MODELFILE> --checkpoint <PATH TO CHECKPOINT>

For visualizing our behavioral cloning models, simply use dqn_cloning or drqn_cloning as MODELFILE and supply the path to the checkpoint. This creates a new directory visualizations inside the directory containing the checkpoint. In the visualizations folder you can find a TensorBoard event file in which all visualizations can be observed; the t-SNE projection is additionally stored as the image tsne_proj.png. The hyperparameters used for our work are currently hardcoded, but they can be adjusted manually in the code.
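For reference, a t-SNE projection like tsne_proj.png is conventionally computed with scikit-learn on the model's learned representation. The following is a minimal sketch, not our script: features stands in for the representation (for example, penultimate-layer activations) and is filled with random placeholder data here.

# Minimal t-SNE sketch; `features` and `actions` are placeholders for the
# model's learned representation and the corresponding action labels.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

features = np.random.randn(500, 64)          # placeholder: (n_samples, hidden_dim)
actions = np.random.randint(0, 4, size=500)  # placeholder action labels for coloring

proj = TSNE(n_components=2, perplexity=30).fit_transform(features)

plt.scatter(proj[:, 0], proj[:, 1], c=actions, s=5, cmap="tab10")
plt.title("t-SNE of learned representation")
plt.savefig("tsne_proj.png", dpi=150)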

We also included a recently published method for offline reinforcement learning in our framework, which can be trained with

python CQ_SAC.py dqn --n-steps 15000 --n-runs 5

Note that this model currently only works on single transitions; there is no recurrent variant yet. Also, the hyperparameters for this model are not tuned yet, so it does not yield reasonable results so far. We expect it to outperform behavioral cloning after hyperparameter tuning, though this might also require collecting more human data.
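The script name CQ_SAC.py suggests a conservative Q-learning style objective; since the model works on single transitions with discrete actions, the core regularizer would look roughly like the sketch below. This is an assumption based on the script name only, not this repository's implementation, and all tensor names are placeholders.

# Sketch of a conservative Q-learning penalty for discrete actions
# (an assumption based on the script name, not code from this repo).
import torch

def cql_penalty(q_values, actions):
    # q_values: (batch, n_actions) predicted Q(s, .)
    # actions:  (batch,) actions taken in the demonstration data
    logsumexp_q = torch.logsumexp(q_values, dim=1)                   # push down all actions
    dataset_q = q_values.gather(1, actions.unsqueeze(1)).squeeze(1)  # push up dataset actions
    return (logsumexp_q - dataset_q).mean()

# Added to the usual TD loss with a trade-off weight alpha:
# loss = td_loss + alpha * cql_penalty(q_values, actions)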
