Fusion strategies to build multimodal predictors for the prediction of immunotherapy response in non-small cell lung cancer

sysbio-curie/multipit

multipit

License: MIT | Code style: black

This repository provides a set of Python tools to perform multimodal learning with tabular data. It contains the code used in our study:

Captier, N., Lerousseau, M., Orlhac, F. et al. Integration of clinical, pathological, radiological, and transcriptomic data improves prediction for first-line immunotherapy outcome in metastatic non-small cell lung cancer. Nat Commun 16, 614 (2025).

Installation

Dependencies

  • lifelines (>= 0.27.4)
  • matplotlib (>= 3.5.1)
  • numpy (>= 1.21.5)
  • pandas (= 1.5.3)
  • pyyaml (>= 6.0)
  • scikit-learn (>= 1.2.0)
  • scikit-survival (>= 0.21.0)
  • seaborn (= 0.13.0)
  • shap (>= 0.41.0)
  • xgboost (>= 1.7.5)
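
To check an existing environment against the list above, a small standard-library sketch can compare installed versions with the stated requirements. The `REQUIREMENTS` table and the helper names are illustrative and not part of multipit; the version parser is deliberately simple and assumes plain numeric versions such as `1.5.3`.

```python
from importlib import metadata

# (operator, version) pairs copied from the dependency list above.
REQUIREMENTS = {
    "lifelines": (">=", "0.27.4"),
    "matplotlib": (">=", "3.5.1"),
    "numpy": (">=", "1.21.5"),
    "pandas": ("==", "1.5.3"),
    "pyyaml": (">=", "6.0"),
    "scikit-learn": (">=", "1.2.0"),
    "scikit-survival": (">=", "0.21.0"),
    "seaborn": ("==", "0.13.0"),
    "shap": (">=", "0.41.0"),
    "xgboost": (">=", "1.7.5"),
}


def parse_version(text):
    """Turn '1.21.5' into (1, 21, 5); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def satisfies(installed, op, required):
    """Compare two version strings after padding them to the same length."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a == b if op == "==" else a >= b


def check_requirements(get_version=metadata.version):
    """Return {package: problem} for every requirement that is not met."""
    problems = {}
    for name, (op, required) in REQUIREMENTS.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not satisfies(installed, op, required):
            problems[name] = f"found {installed}, need {op} {required}"
    return problems


if __name__ == "__main__":
    for name, problem in check_requirements().items():
        print(f"{name}: {problem}")
```

An empty result means every listed dependency is present in a compatible version. For pre-release or otherwise non-numeric version strings, a dedicated version parser would be more reliable than this sketch.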

Install from source

Clone the repository:

git clone https://github.com/sysbio-curie/multipit

Key features

Deep-multipit

We also provide another GitHub repository, deep-multipit, with a PyTorch implementation of an end-to-end integration strategy with attention weights, inspired by Vanguri et al., 2022.

Run scripts

Edit the .yaml configuration files (in the config/ subfolder), then run a script from your terminal, for example:

python latefusion.py -c config/config_latefusion.yaml -s path/to/results/folder
python collect_shap_survival.py -c config/config_latefusion_survival.yaml -s path/to/results/folder

Warning: On Windows, paths must be written with '\' or '\\' separators (instead of '/').
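
One way to sidestep separator issues is to build paths with `pathlib` and let Python render them for the current OS. The sketch below assembles the `latefusion.py` command as an argument list. The helper name `build_command` is illustrative, the results folder is the placeholder used in the commands above, and only the `-c` and `-s` options shown there are used.

```python
import sys
from pathlib import Path, PureWindowsPath


def build_command(script, config, results_dir, python=sys.executable):
    """Assemble the command as an argument list suitable for subprocess.run."""
    return [python, str(script), "-c", str(config), "-s", str(results_dir)]


# Path objects are joined with "/" in code but render native separators via str().
config = Path("config") / "config_latefusion.yaml"
results = Path("path") / "to" / "results" / "folder"  # placeholder, replace with a real folder
command = build_command("latefusion.py", config, results)

# PureWindowsPath shows how the same path is written on Windows,
# whatever OS this snippet runs on.
windows_config = PureWindowsPath("config") / "config_latefusion.yaml"
print(str(windows_config))  # config\config_latefusion.yaml

# To launch the script from the folder that contains it (not executed here):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the command as a list avoids shell quoting problems when a path contains spaces. Paths written inside the .yaml files themselves still need separators that are valid for the OS.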

Note: To change how the data are loaded or how the predictive pipelines are built, update the PredictionTask class in the file _init_scripts.py.

Examples

The examples folder contains a brief example showing how to adapt the scripts and code from our original experiments to perform multimodal learning for the prediction of Overall Survival from clinical and RNA-seq data extracted from TCGA (i.e., stage III and IV TCGA-LUAD and TCGA-LUSC samples).

We simply updated the PredictionTask class in a new file, _init_scripts_tcga.py, to load TCGA data and build predictive pipelines.

Note: Clinical and transcriptomic data extracted for 201 stage III/IV TCGA patients (i.e., LUAD or LUSC) are available in the data folder.

Citing multipit

If you use multipit in a scientific publication, we would appreciate a citation of the following paper:

Captier, N., Lerousseau, M., Orlhac, F. et al. Integration of clinical, pathological, radiological, and transcriptomic data improves prediction for first-line immunotherapy outcome in metastatic non-small cell lung cancer. Nat Commun 16, 614 (2025). https://doi.org/10.1038/s41467-025-55847-5

Acknowledgements

This repository was created as part of the PhD project of Nicolas Captier in the Computational Systems Biology of Cancer group and the Laboratory of Translational Imaging in Oncology (LITO) of Institut Curie.
