Welcome to EpiCare, a benchmarking library for reinforcement learning in healthcare applications. This ReadMe provides a guide to installing, configuring, and utilizing EpiCare.
To get started with EpiCare, please follow the installation steps below:
Create a new conda environment with python version 3.9+ using
conda create --name EpiCare python==python3.10
Ensure you have PyTorch installed on your system. Visit PyTorch's official installation guide for detailed instructions.
Clone the EpiCare repository to your local machine and navigate to the root directory. Then, install EpiCare along with its dependencies by running:
pip install -e .
EpiCare depends on the following Python packages:
- numpy
- h5py
- gym=0.23
- pyrallis
- tqdm
- wandb
- scipy
- pandas
These dependencies will be automatically installed during the EpiCare installation process.
The EpiCare environmet is an Gym environment which simulates longitudinal patient care. Full details about the environment can be found in our paper. To load the environment run:
from epicare.envs import EpiCare
env = EpiCare()
The environment is highly customizable, take a look at the paper for inspiration on what might be worth configuring for your own projects. See figures/figures.ipynb
(especially cell 2) for some examples of running and evaluating custom policies on this environment.
If you would like to use gym
to test any methods on this environment, simply run
import epicare.envs # This registers the environment automatically
import gym
gym.make("EpiCare-v0")
In addition to the EpiCare environment, we have included single-file implementations for five offline RL methods adapted to work for discrete control environments
- AWAC
- EDAC
- TD3+BC
- CQL
- IQL
- DQN
- DRN (Deep Reward Network)
- BC
Below we describe how to find the original training data from the paper as well as how to generate new/additional training data. All data contained or generated as beloew conforms to the D4RL standards and can be loaded for other RL pipelines using our data loader found in epicare.utils.load_custom_datset
.
We strongly recommend reporting results using the original training data for the sake of comparability.
The exact training data we used to generate our results can be found in ./data/
.
To generate new training data (which will not be identical to the original training data), navigate to the data
directory and execute:
python data_gen.py
This script prepares the data required for training the included offline RL models as well as any other models you may want to test.
EpiCare integrates with Weights & Biases (W&B) for hyperparameter optimization. Follow these steps to perform a hyperparameter sweep:
-
W&B Setup: If you haven't already, sign up or log in to your W&B account at wandb.ai. Use the
wandb login
command to authenticate your CLI. -
Creating a Sweep:
- Choose a configuration file corresponding to the RL model of your choice from the
algorithms/sweep_configs/hp_sweeps
directory. - Create a hyperparameter sweep project by running:
wandb sweep --project EpiCare [CONFIG FILE]
- This command initializes the project on W&B and outputs instructions for launching agent instances.
- Choose a configuration file corresponding to the RL model of your choice from the
-
Running the Sweep:
- Execute the provided command in one or multiple instances (across different machines if necessary) to start the hyperparameter optimization process.
- Monitor the progress and results on the W&B web dashboard.
-
Selecting Hyperparameters:
- You may choose the optimal hyperparameters manually from the dashboard or utilize the defaults found in our algorithms, which are based on comprehensive testing as detailed in the paper corresponding to this repository.
With the hyperparameters set, you can proceed to train the models. EpiCare's framework allows you to queue the training of multiple replicates across multiple environment seeds using a single command by way of another sweep process.
- For each configuration in
algorithms/sweep_configs/all_data_sweeps
, initiate a sweep with:wandb sweep --project EpiCare [CONFIG FILE]
- Follow the resulting instructions to launch agents to train your selected model.
- Model checkpoints are saved periodically throughout training runs in the
algorithms/checkpoints
directory
To customize hyperparameters beyond the defaults for a given model, update the configuration files.
EpiCare offers functionality for training under data-restricted scenarios. This feature focuses on the first environment seed and varies the number of training episodes (though in principle the experiment could be run for any environment seed of your choosing). Configuration files for the three best-performing models are located in algorithms/sweep_configs/data_restriction_sweeps
. Checkpoints for these runs are saved in the algorithms/checkpoints
directory.
To evaluate the final checkpoint of each run, simply run python algorithms/awac.py evaluate
, and the evaluation results will be output to results/awac_results.csv
.
These instructions work for all other algorithms provided simply by swapping out the algorithm name in the commands (i.e. subbing edac.py
instead of awac.py
).
In the figures
directory we have included a jupyter notebook with examples for how we analyzed our results and produced most of the figures found in our paper. After running all full-data and data restriction trials as described above, this notebook allows you to reproduce the main results of the paper.
The python packages required to run figures.ipynb
are included in requirements.txt
. To install, run pip install -f requirements.txt
while in the figures
directory.