This is a repository containing code and data for the paper:
N. Corvelo Benz and M. Gomez-Rodriguez. Counterfactual Inference of Second Opinions. UAI, 2022.
The paper is available here.
Install required packages with pip install -r requirements.txt
siscm.py
defines the SI-SCM class, which implements:
- a class constructor using marginal distribution functions
- fit, which finds a partition of the experts into groups of mutually similar experts
- predict and cf_predict, functions to (counterfactually) predict experts' labels
PCS_graph.py
graph class encoding (pairwise) counterfactual stability, implements:
- checking for violations of counterfactual stability given data, i.e., checking for dissimilarity between experts
- greedy clique partitioning algorithm
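To illustrate the idea of greedy clique partitioning, here is a minimal, generic sketch. It is not the implementation in PCS_graph.py: the function name `greedy_clique_partition`, the edge-list input, and the highest-degree-first heuristic are all assumptions made for illustration. An edge is assumed to connect two experts for which no violation of counterfactual stability was observed, so that each resulting clique is a group of mutually similar experts.

```python
def greedy_clique_partition(nodes, edges):
    """Partition nodes into cliques by greedily growing one clique at a time.

    nodes: iterable of sortable, hashable node ids (here: experts).
    edges: iterable of (u, v) pairs; an edge is ASSUMED to mean that no
           violation was observed between experts u and v.
    Returns a list of cliques, each a sorted list of node ids.
    """
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)

    unassigned = set(neighbors)
    partition = []
    while unassigned:
        # Seed the clique with the unassigned node of highest remaining
        # degree; ties are broken by the smallest node id.
        seed = max(sorted(unassigned),
                   key=lambda n: len(neighbors[n] & unassigned))
        clique = [seed]
        # Candidates are unassigned nodes adjacent to every clique member.
        candidates = neighbors[seed] & unassigned
        while candidates:
            nxt = max(sorted(candidates),
                      key=lambda n: len(neighbors[n] & candidates))
            clique.append(nxt)
            candidates = candidates & neighbors[nxt]
        unassigned -= set(clique)
        partition.append(sorted(clique))
    return partition
```

For example, with five experts and edges (0,1), (0,2), (1,2), (2,3), (3,4), this heuristic groups experts 0, 1, 2 together and experts 3, 4 together. A greedy procedure like this does not guarantee the minimum number of groups.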
get_features_vgg19.py
generates features for CIFAR-10 test images using VGG19
join_feat_cifar10h_labels.py
generates and saves a single dataframe with image features from VGG19 and corresponding labels from CIFAR-10H
preprocess_data.py
resamples experts and data, and splits the data into train and test sets
synthetic_experiment.py
implements the synthetic experiment
real_experiment.py
implements experiment on real data
./data
contains preprocessed real datasets, train and test data
./features
contains features generated by VGG19 for CIFAR-10 test images
./results_synthetic
contains result files from experiment on synthetic data
./results_real
contains result files from experiment on real data
evaluation_synthetic.py
generates plots for given experiment results on synthetic data
evaluation_real.py
generates plots for given experiment results on real data
helper.py
contains helper functions for plotting
- Run synthetic_experiment.py
- Run evaluation_synthetic.py to generate the plots from the evaluation results
- All experimental results and plots will be stored in directory ./results_synthetic
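The steps above can also be chained from a small Python driver. The following is only a sketch: the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes that both scripts can be started from the repository root without any command-line arguments, which this README does not state.

```python
import subprocess
import sys


def build_commands(scripts, python=sys.executable):
    """Return one command (argument list) per script, in the given order."""
    return [[python, script] for script in scripts]


def run_pipeline(scripts):
    """Run each script in turn; check=True raises on a non-zero exit code,
    so later steps do not run if an earlier one fails."""
    for cmd in build_commands(scripts):
        subprocess.run(cmd, check=True)


# Script names taken from the steps above; ASSUMED to need no arguments.
SYNTHETIC_STEPS = ["synthetic_experiment.py", "evaluation_synthetic.py"]

# Uncomment to execute from the repository root:
# run_pipeline(SYNTHETIC_STEPS)
```

Stopping at the first failure avoids generating plots from stale or missing result files.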
Download the CIFAR-10H dataset from https://github.com/jcpeterson/cifar-10h into the directory ./data
- Run get_features_vgg19.py to generate the features of the data with VGG19
- Run join_feat_cifar10h_labels.py to join the features and human label predictions of the CIFAR-10H dataset in one dataframe
- Run preprocessed_data.py to resample the data and experts to obtain a higher disagreement ratio
- The training and test sets are stored in directory ./data; features and labels are stored in separate matrices
- Run real_experiment.py
- Run evaluation_real.py to generate the plots from the evaluation results
- All experimental results and plots will be stored in directory ./results_real
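The resampling step above aims for a higher disagreement ratio among experts. As a rough illustration of what such a quantity could measure, the sketch below computes the fraction of samples on which the experts do not all give the same label. This definition and the function name `disagreement_ratio` are assumptions made here; the repository may define and compute the ratio differently.

```python
def disagreement_ratio(labels):
    """Fraction of samples on which the experts do not all agree.

    labels: list of rows, one row per sample and one entry per expert.
    NOTE: this definition is an assumption made for illustration.
    """
    if not labels:
        raise ValueError("labels must contain at least one sample")
    # A sample counts as a disagreement if it has more than one distinct label.
    disagreeing = sum(1 for row in labels if len(set(row)) > 1)
    return disagreeing / len(labels)


# Four samples labeled by three experts; two samples show disagreement.
example = [[0, 0, 0], [1, 2, 1], [3, 3, 3], [4, 4, 5]]
ratio = disagreement_ratio(example)
```

Here `ratio` is 0.5, since the second and fourth samples each receive two distinct labels.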
If you use parts of the code in this repository for your own research, please consider citing:
@inproceedings{benz2022counterfactual,
title={Counterfactual Inference of Second Opinions},
author={Corvelo Benz, Nina and Gomez-Rodriguez, Manuel},
booktitle={UAI},
year={2022}
}