XAI Concept Methods in DRL

Codebase for a master's thesis project in computer science, with a specialization in artificial intelligence, at the Norwegian University of Science and Technology (NTNU).

Description

This project investigates the acquisition of static and dynamic concepts in the policy of an agent trained with deep reinforcement learning. It builds on the AlphaGo Zero algorithm by Google DeepMind, and the agent is trained to play the game of Go. Concept Activation Vectors (CAVs) are used to detect static and dynamic concepts in the agent's policy, while the Monte Carlo Tree Search (MCTS) algorithm is used to generate datasets for dynamic concepts in an unsupervised manner. A joint embedding model learns the relationship between state-action pairs and conceptual explanations, and this model is combined with concept functions to improve the agent's reward function. Finally, a concept bottleneck model is trained to learn concepts in the agent's policy.
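
As a rough illustration of the CAV technique, a concept activation vector can be obtained by training a linear probe to separate layer activations of concept examples from random counterexamples, and taking the normal of the decision boundary. This is a minimal sketch with placeholder data, not the code used in this repository:

# Minimal CAV sketch. The activations below are random placeholders;
# in the project they would come from a layer of the policy network.
import numpy as np
from sklearn.linear_model import LogisticRegression

def compute_cav(concept_acts, random_acts):
    """Train a linear probe and return its (unit-norm) normal vector as the CAV."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    cav = clf.coef_[0]
    return cav / np.linalg.norm(cav)

def concept_sensitivity(gradients, cav):
    """TCAV-style score: fraction of gradient vectors with positive alignment to the CAV."""
    return float(np.mean(gradients @ cav > 0))

layer_dim = 128
concept_acts = np.random.randn(200, layer_dim)  # activations for states where the concept is present
random_acts = np.random.randn(200, layer_dim)   # activations for random counterexample states
cav = compute_cav(concept_acts, random_acts)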

The codebase contains:

  • the deep reinforcement learning training loop, similar to the one outlined in the AlphaGo Zero paper by Google DeepMind
  • concept detection using CAVs to find static and dynamic concepts in the agent's policy
  • concept functions for static concepts
  • an algorithm that uses MCTS to generate datasets for dynamic concepts in an unsupervised manner
  • a joint embedding model that learns the relationship between state-action pairs and conceptual explanations (see the sketch after this list)
  • use of the joint embedding model and concept functions to improve the agent's reward function
  • training of a concept bottleneck model to learn concepts in the agent's policy
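
The joint embedding model can be pictured as two encoders that map state-action pairs and vectorised conceptual explanations into a shared space, trained so that matching pairs end up close together. The encoder sizes, input dimensions, and contrastive loss below are illustrative assumptions, not the architecture used in the thesis:

# Illustrative joint embedding sketch (assumes PyTorch; dimensions,
# encoders, and loss are placeholders, not the thesis's actual model).
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbedding(nn.Module):
    def __init__(self, state_dim, explanation_dim, embed_dim=64):
        super().__init__()
        # Encoder for flattened state-action representations
        self.state_encoder = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(), nn.Linear(128, embed_dim))
        # Encoder for already-vectorised conceptual explanations
        self.expl_encoder = nn.Sequential(
            nn.Linear(explanation_dim, 128), nn.ReLU(), nn.Linear(128, embed_dim))

    def forward(self, states, explanations):
        s = F.normalize(self.state_encoder(states), dim=-1)
        e = F.normalize(self.expl_encoder(explanations), dim=-1)
        return s, e

def contrastive_loss(s, e, temperature=0.1):
    """Pull matching (state-action, explanation) pairs together, push mismatched pairs apart."""
    logits = s @ e.t() / temperature
    targets = torch.arange(s.size(0))
    return F.cross_entropy(logits, targets)

# Toy usage with random tensors as placeholders
model = JointEmbedding(state_dim=50, explanation_dim=16)
states = torch.randn(8, 50)
explanations = torch.randn(8, 16)
loss = contrastive_loss(*model(states, explanations))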

Install required packages

python -m pip install -r requirements.txt

Set the environment variables in the config file

nano config.py
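
For orientation, config.py holds project-wide settings read by the training scripts. The variable names and values below are purely hypothetical and will differ from the actual file:

# Hypothetical illustration of the kind of settings found in config.py.
# None of these names are taken from the repository.
BOARD_SIZE = 5                              # size of the Go board
MCTS_SIMULATIONS = 100                      # simulations per move during self-play
EPISODES_PER_ITERATION = 500                # self-play games per training iteration
CHECKPOINT_INTERVAL = 50                    # how often to save model weights
TENSORBOARD_LOG_DIR = "tensorboard_logs/"   # matches the TensorBoard command below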

Train models single-threaded

python train_single_thread.py

Train models multi-threaded (MPI)

mpirun -np 4 python train_hpc.py

Play against the trained models

python play.py

TOPP - Tournament of Progressive Policies

python tournament.py

Run the tests

python test_name.py

Run training on HPC (Idun at NTNU is used in this project)

sbatch hpc.sh

TensorBoard - visualize the training

tensorboard --logdir tensorboard_logs/

Results from the experiments

The results from the experiments are located in the notebooks folder. The notebooks are named according to the experiments they represent.
