A method to attribute model performance changes to distribution shifts in causal mechanisms. For more details, please see our ICML 2023 paper.
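At a high level, the method treats each causal mechanism as a "player" and uses Shapley values to attribute the source-to-target performance change to individual mechanism shifts. The sketch below is purely illustrative and is not the package's API: the value function `perf` is a toy additive stand-in for the performance obtained when a subset of mechanisms is shifted to the target distribution (in practice this quantity is estimated from data; see the note on DoWhy's Shapley estimators below).

```python
from itertools import combinations
from math import factorial

def shapley_attributions(mechanisms, perf):
    """Shapley attribution of a performance change to causal mechanisms.

    perf(S) -> model performance when the mechanisms in set S follow the
    target distribution and all other mechanisms follow the source.
    """
    n = len(mechanisms)
    attributions = {}
    for j in mechanisms:
        others = [m for m in mechanisms if m != j]
        total = 0.0
        for k in range(len(others) + 1):
            for subset in combinations(others, k):
                # Standard Shapley weight: |S|! (n - |S| - 1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                marginal = perf(set(subset) | {j}) - perf(set(subset))
                total += weight * marginal
        attributions[j] = total
    return attributions

# Toy, additive value function (an assumption for illustration only):
# shifting each mechanism independently costs a fixed amount of accuracy.
drop = {"P(X1)": 0.02, "P(X2|X1)": 0.05, "P(Y|X1,X2)": 0.01}
perf = lambda S: 0.90 - sum(drop[m] for m in S)

print(shapley_attributions(list(drop), perf))
# Under additivity each attribution recovers that mechanism's own
# contribution to the drop, e.g. P(X2|X1) -> -0.05.
```

By the Shapley efficiency axiom, the attributions always sum to the total performance change between the fully-source and fully-target settings, which is what makes this a complete decomposition of the drop.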
Our package is available on PyPI. Simply run the following with Python >= 3.7:

```bash
pip install expl_perf_drop
```
We provide the following examples as Jupyter Notebooks:
- Spurious Synthetic Example
- More to come!
If you wish to reproduce the experiments in the paper, we recommend creating a separate Conda environment:
```bash
git clone https://github.com/MLforHealth/expl_perf_drop
cd expl_perf_drop/
conda env create -f environment.yml
conda activate expl_perf_drop
```
To reproduce the experiments in the paper, which involve training grids of models and then generating explanations for them, use `sweep.py` as follows:

```bash
python -m expl_perf_drop.sweep launch \
    --experiment {experiment_name} \
    --output_dir {output_root} \
    --command_launcher {launcher}
```
where:
- `experiment_name` corresponds to an experiment defined as a class in `experiments.py`
- `output_root` is a directory where experimental results will be stored
- `launcher` is a string corresponding to a launcher defined in `scripts/launchers.py` (i.e. `slurm` or `local`)
The `train_model` experiment should be run first. The remaining experiments can be run in any order.
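For example, to launch the model-training grid on a local machine, writing results under `./expl_output` (a placeholder path of your choosing):

```bash
python -m expl_perf_drop.sweep launch \
    --experiment train_model \
    --output_dir ./expl_output \
    --command_launcher local
```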
Alternatively, a single explanation can be generated by calling `explain.py` with the appropriate arguments.
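The arguments of `explain.py` are not listed here; assuming it exposes a standard argparse-style interface like `sweep.py` (an assumption on our part), they can be inspected with:

```bash
python -m expl_perf_drop.explain --help
```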
The CausalGAN portion of our CelebA experiment is heavily based on an experiment in the parametric-robustness-evaluation codebase. Our Shapley Value estimation functions are taken from the DoWhy package.
If you use this code or package in your research, please cite the following publication:
```bibtex
@inproceedings{zhang2023did,
  title={"Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts},
  author={Zhang, Haoran and Singh, Harvineet and Ghassemi, Marzyeh and Joshi, Shalmali},
  booktitle={International Conference on Machine Learning},
  year={2023}
}
```