The Reinforcement Learning Environment
Policy type instruction [DEMO]
Software Architecture
During Training
During Testing
How to run the code?
- Clone the project with `git clone https://github.com/tirthankar95/NumberLearningInChildren_RL_NLP.git` and `cd` into the cloned directory. Check that you have permission to clone the repository.
- Download the GloVe embeddings with `wget http://nlp.stanford.edu/data/glove.6B.zip` and unzip them with `unzip glove*.zip`.
- Pull the environment (Docker image) for running the code from Docker Hub with `sudo docker pull tirthankar95/rl-nlp:latest`.
- Start a shell inside the container with `sudo docker run -it -v $(pwd):/NumberLearningInChildren_ML tirthankar95/rl-nlp /bin/bash`.
- To train the model, run `python3 -W ignore RL_Algorithm/PPO/ppo.py &> Results/console.log &` from the main directory.
- To generate results after the model has been trained, run `python3 -W ignore RL_Algorithm/PPO/ppo_post_run.py &` from the main directory.
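For convenience, the host-side steps above can be collected into a single script. This is a minimal sketch of the exact commands listed; it assumes `git`, `wget`, `unzip`, and Docker are available on the host. The last two commands are shown as comments because they are run inside the container's interactive shell.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Clone the project and enter it.
git clone https://github.com/tirthankar95/NumberLearningInChildren_RL_NLP.git
cd NumberLearningInChildren_RL_NLP

# Download and unpack the GloVe embeddings.
wget http://nlp.stanford.edu/data/glove.6B.zip
unzip glove*.zip

# Pull the Docker image and open a shell in the container,
# mounting the repository at /NumberLearningInChildren_ML.
sudo docker pull tirthankar95/rl-nlp:latest
sudo docker run -it -v "$(pwd)":/NumberLearningInChildren_ML tirthankar95/rl-nlp /bin/bash

# Inside the container, from the main directory:
#   python3 -W ignore RL_Algorithm/PPO/ppo.py &> Results/console.log &   # train
#   python3 -W ignore RL_Algorithm/PPO/ppo_post_run.py &                 # generate results
```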
Analysis
1. You can write your own code for custom analysis. To reproduce our results, copy `Results/final_score_*` and `Results/train_model_*` to the `Analysis` folder.
2. Append `{a}` to each `train_model_*.json` file name in the `Analysis` folder, where `{a}` $\in$ {"", "seed_0", "seed_1", ...} depending on whether it is the {1st, 2nd, 3rd, ...} run of the same configuration (see the example after this list).
3. Run `python3 Plot_redx.py` inside the container.
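As a concrete illustration of step 2 (the file name `train_model_0.json` is a hypothetical instance of the `train_model_*.json` pattern, and the suffix placement follows the wording above):

```bash
# First run of a configuration: {a} = "", so the name is unchanged.
cp Results/train_model_0.json Analysis/train_model_0.json
# Second run of the same configuration: append "seed_0" to the file name.
cp Results/train_model_0.json Analysis/train_model_0.jsonseed_0
```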
To create consolidated graphs, modify `train_config.json` and `test_config.json` so that every combination of model $\in$ {0, 1, 2, 3} and instr_type $\in$ {0, 1} is covered. Then do steps 1 and 2 for each combination; after dumping all the necessary files into the `Analysis` folder, do step 3. This produces the consolidated graphs depicted in the paper; a sketch automating the sweep follows.
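The sweep over all eight (model, instr_type) combinations can be scripted inside the container. The sketch below is illustrative rather than the repository's own tooling: it assumes `train_config.json` and `test_config.json` carry top-level keys named `model` and `instr_type` (not verified against the repository), runs each combination in the foreground so they execute sequentially, and copies first-run outputs unchanged ({a} = "").

```bash
#!/usr/bin/env bash
set -euo pipefail
mkdir -p Analysis

for model in 0 1 2 3; do
  for instr_type in 0 1; do
    # Patch both configs; the key names "model" and "instr_type" are assumptions.
    for cfg in train_config.json test_config.json; do
      python3 - "$cfg" "$model" "$instr_type" <<'EOF'
import json, sys
path, model, instr = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])
with open(path) as f:
    conf = json.load(f)
conf["model"] = model        # assumed key name
conf["instr_type"] = instr   # assumed key name
with open(path, "w") as f:
    json.dump(conf, f, indent=2)
EOF
    done

    # Train, then generate results (the two commands from "How to run the code?").
    python3 -W ignore RL_Algorithm/PPO/ppo.py &> Results/console.log
    python3 -W ignore RL_Algorithm/PPO/ppo_post_run.py

    # Steps 1 & 2: copy the outputs into Analysis/. Repeated runs of the same
    # configuration would instead append "seed_0", "seed_1", ... to the names.
    cp Results/final_score_* Results/train_model_* Analysis/
  done
done

# Step 3: plot the consolidated graphs.
python3 Plot_redx.py
```

Each iteration retrains the model from scratch, so the full sweep can take a long time.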
Copyright (c) 2024 Tirthankar Mittra


