Learn to Live Longer: Counterfactual Inference using Balanced Representations for Parametric Deep Survival Analysis
The following datasets are used in the paper:
1. Synthetic - To generate the synthetic dataset, we used the synthetic data-generating process discussed in SurvITE. The ground-truth time-to-event and time-to-censoring outcomes were generated using the Restricted Mean Survival Time (RMST), with the individual-specific event and censoring survival functions computed for both treatment arms; the ITE is computed over the time epochs at which events have been reported, up to the truncation time (see the sketch after this list).
2. ACTG Semi-Synthetic - To generate the ACTG semi-synthetic dataset, we used the ACTG data discussed in CSA; the time-to-event is then generated semi-synthetically from the observed covariates.
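For reference, a standard RMST-based definition of the ITE on a discrete event-time grid can be sketched as follows. This is our notation and an assumption about the exact form used above, with $S_i(t \mid a)$ the individual survival function under treatment $a$ and $\tau$ the truncation time:

```latex
% Hedged sketch (our notation): RMST over the reported event-time grid and the
% resulting individualized treatment effect, truncated at horizon \tau.
\mathrm{RMST}_i(\tau \mid a) \;=\; \sum_{k:\, t_k \le \tau} S_i(t_k \mid a)\,(t_k - t_{k-1}),
\qquad
\mathrm{ITE}_i(\tau) \;=\; \mathrm{RMST}_i(\tau \mid a{=}1) \;-\; \mathrm{RMST}_i(\tau \mid a{=}0)
```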
We split the synthetic dataset into 50% for training and 50% for testing, and further hold out 30% of the training data as a validation set. For ACTG, we split the data into training, validation, and test sets using a 70%/15%/15% partition.
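A minimal sketch of these splits using scikit-learn's `train_test_split`; the array names and sizes are placeholders, not the paper's code:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for the real covariate matrices (sizes are illustrative only).
X_synth = np.random.randn(3000, 10)
X_actg = np.random.randn(2000, 23)

# Synthetic: 50% train / 50% test, then 30% of the training split held out as validation.
idx = np.arange(len(X_synth))
train_idx, test_idx = train_test_split(idx, test_size=0.5, random_state=0)
train_idx, val_idx = train_test_split(train_idx, test_size=0.3, random_state=0)

# ACTG: 70% train / 15% validation / 15% test.
idx_a = np.arange(len(X_actg))
train_a, rest_a = train_test_split(idx_a, test_size=0.30, random_state=0)
val_a, test_a = train_test_split(rest_a, test_size=0.5, random_state=0)
```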
To train the SurvCI model, we performed hyperparameter tuning and used the Adam optimizer in all experiments. In all experiments, the scaling of the ELBO censored loss was kept at the same fixed value.
Hyperparameters used in experiments with the SurvCI model for different datasets:
Datasets | K | Scaling IPM | Scaling SE | Scaling ELBO | Scaling L2 | Batch Size | Learning Rate |
---|---|---|---|---|---|---|---|
Synthetic, S1 | 3 | 0.001 | 0.1 | 1 | 0.5 | 200 | 3e-4 |
Synthetic, S2 | 3 | 0.001 | 0.1 | 1 | 0.5 | 200 | 3e-4 |
Synthetic, S3 | 3 | 0.5 | 2e-4 | 1 | 0.2 | 100 | 3e-4 |
Synthetic, S4 | 3 | 0.5 | 3e-4 | 1 | 0.2 | 100 | 3e-4 |
ACTG, S3 | 3 | 0.5 | 2e-4 | 1 | 0.2 | 1497 | 3e-4 |
ACTG, S4 | 3 | 0.5 | 2e-4 | 1 | 0.2 | 1497 | 3e-4 |
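As an illustration only, the optimizer settings from the table above could be wired up as follows, assuming a PyTorch implementation; the stand-in network and the `loss_weights` dictionary are hypothetical, not the released SurvCI code:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in network; the real SurvCI architecture is not reproduced here.
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 3))

# Adam optimizer with the learning rate shared by most rows of the table above.
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

# Loss-term scaling coefficients taken from the Synthetic S1/S2 rows (IPM, SE, ELBO, L2),
# assumed to multiply the corresponding loss terms before summation.
loss_weights = {"ipm": 0.001, "se": 0.1, "elbo": 1.0, "l2": 0.5}
batch_size = 200
```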
Hyperparameters used in experiments with the SurvCI-Info model for different datasets:
Datasets | K | Scaling IPM | Scaling SE | Scaling ELBO | Scaling L2 | Batch Size | Learning Rate |
---|---|---|---|---|---|---|---|
Synthetic, S2 | 3 | 10 | 3e-5 | 0.6 | 0.2 | 200 | 3e-4 |
Synthetic, S4 | 3 | 10 | 3e-5 | 0.6 | 0.2 | 100 | 3e-4 |
ACTG, S4 | 6 | 1 | 1e-6 | 0.5 | 0.2 | 64 | 3e-5 |
**NOTE:** For experiments under the S1 and S2 settings, we force a few samples in each batch to be treated samples during training in order to avoid a runtime error; a sketch of this batch construction is shown below.
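One way such a constraint can be implemented is sketched here, assuming a binary treatment indicator and a plain index-shuffling loader; the function name, `min_treated` value, and example sizes are our illustration, not the paper's code:

```python
import numpy as np

def make_batches(treatment, batch_size, min_treated=2, seed=0):
    """Yield index batches that each contain at least `min_treated` treated samples.

    `treatment` is a 1-D binary array (1 = treated). `min_treated` is an
    illustrative choice; the note above only states that a few treated
    samples are forced into each batch.
    """
    rng = np.random.default_rng(seed)
    treated = np.flatnonzero(treatment == 1)
    order = rng.permutation(len(treatment))
    for start in range(0, len(order), batch_size):
        batch = order[start:start + batch_size]
        n_treated = int((treatment[batch] == 1).sum())
        if n_treated < min_treated:
            # Swap a few control indices for treated indices drawn from outside the batch.
            controls = batch[treatment[batch] == 0]
            extra = rng.choice(np.setdiff1d(treated, batch),
                               size=min_treated - n_treated, replace=False)
            batch = np.concatenate([batch[treatment[batch] == 1],
                                    controls[:len(controls) - len(extra)],
                                    extra])
        yield batch

# Example: 1000 samples with ~5% treated, batch size 200 (matching the S1/S2 rows).
treatment = (np.random.default_rng(1).random(1000) < 0.05).astype(int)
for batch in make_batches(treatment, batch_size=200):
    pass  # one training step would run here on `batch`
```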