Predator

Predicting the Impact of Cancer Somatic Mutations on Protein-Protein Interactions


Predator is a computational tool that predicts the effects of mutations on protein-protein interactions by classifying them as disrupting or nondisrupting. It also provides a comprehensive analysis of candidate cancer-associated genes, their most frequently disrupted interaction partners, cancer patients, and several cancer cohorts in the TCGA project.

For more information, please refer to the article listed in the Citation section.


ProtTrans Attention Visualization


Reproducibility

Below are the steps to obtain the results in the paper.

Preparing the Predator environment

  1. Download the repository and move to the reproducible folder.
    cd \Predator\src\reproducible

  2. Update the conda base
    conda update conda -n base -y

  3. Add the conda-forge channel, which is needed to install the packages.
    conda config --append channels conda-forge

  4. Create a new environment named predator with a specified Python version and install required packages.
    conda create python=3.8.13 --name predator --file requirements.txt -y

  5. Activate new environment.
    conda activate predator

  6. Register the new environment as a Jupyter kernel with the display name Predator.
    python -m ipykernel install --user --name predator --display-name "Predator"
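After completing the steps above, a quick check from inside the activated environment can confirm that the interpreter and the kernel package are in place. The following is a minimal sketch, not part of the repository: the function name `check_environment` is hypothetical, and it only checks the Python major/minor version (3.8, as requested in step 4) and that `ipykernel` can be found.

```python
import importlib.util
import sys


def check_environment(required_python=(3, 8), required_modules=("ipykernel",)):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    # Compare only major.minor, since step 4 pins Python 3.8.x.
    if sys.version_info[:2] != tuple(required_python):
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in required_python)
        problems.append(f"Python {wanted} expected, found {found}")
    # find_spec returns None when a top-level module cannot be located.
    for name in required_modules:
        if importlib.util.find_spec(name) is None:
            problems.append(f"module not importable: {name}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("Environment looks OK" if not issues else "\n".join(issues))
```

Other packages listed in requirements.txt could be passed through `required_modules` in the same way.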

Training the Predator

The trained model can be found here. To train from scratch, run the following command:

python reproducable_01_training_predator.py

The newly trained model will be saved to the src\PredatorModels\PredatorModel_<date>\<hash> directory. Additionally, an executed Jupyter notebook, Reproduced_PredatorStudyModel.ipynb, will be created in the reproducible folder.
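Because each training run writes to its own PredatorModel_<date>\<hash> directory, it can be handy to locate the most recent one programmatically. The sketch below assumes only the directory layout described above; the helper name `latest_model_dir` is hypothetical, and "most recent" is judged by file-system modification time rather than by parsing the date in the folder name.

```python
from pathlib import Path


def latest_model_dir(models_root):
    """Return the most recently modified <hash> directory, or None if there is none."""
    root = Path(models_root)
    # Match <root>/PredatorModel_<date>/<hash> and keep directories only.
    candidates = [path for path in root.glob("PredatorModel_*/*") if path.is_dir()]
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

For example, `latest_model_dir("src/PredatorModels")` run from the repository root would return the newest model folder, if any exists.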

Predictions on the TCGA Mutation Datasets

The trained Predator model can be applied to the TCGA mutation datasets with reproducable_02_predicting_tcga.py. The script also lets you select which model to use for the prediction task. Run the following command:

python reproducable_02_predicting_tcga.py

This exports the prediction files to the predictions_datasets folder and creates a Reproduced_PredatorStudy_<TCGA>.ipynb notebook for each TCGA cohort.
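To see which cohorts have already been processed, the cohort name can be read back from the notebook filenames. This is a small sketch that relies only on the Reproduced_PredatorStudy_<TCGA>.ipynb naming pattern mentioned above; the function name `reproduced_cohorts` and the folder argument are hypothetical.

```python
import re
from pathlib import Path

# The underscore before the cohort name keeps Reproduced_PredatorStudyModel.ipynb
# (the training notebook) from matching.
NOTEBOOK_PATTERN = re.compile(r"^Reproduced_PredatorStudy_(?P<cohort>.+)\.ipynb$")


def reproduced_cohorts(folder):
    """Return the sorted cohort names that have an executed prediction notebook."""
    cohorts = []
    for path in Path(folder).iterdir():
        match = NOTEBOOK_PATTERN.match(path.name)
        if match:
            cohorts.append(match.group("cohort"))
    return sorted(cohorts)
```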

Patient Interaction Analysis

If the paths of the prediction files do not need to be updated, generate the patient interaction analysis files with the following command:

python reproducable_03_patient_interaction_analysis.py

The paths of newer prediction datasets can be updated in the script before running it. Upon completion, Excel files containing the interactions and patients for each TCGA cohort should appear in the data/patient_interaction_datasets folder. In addition, an executed copy of Reproduced_Disruptive_patients_per_patient.ipynb will be created.

Analysis

Lastly, if necessary, update the path in reproducible_04_analysis.py and run it with the command below:

python reproducible_04_analysis.py

Running this command creates a counts file for each TCGA cohort.

Run the first part of the notebook tables/preliminary_tables_counts.ipynb as indicated.

Run the notebook analyses/PatientInteractionAnalysis/PatientInteractionAnalysis.ipynb.

Continue with the second part of tables/preliminary_tables_counts.ipynb, in which the Gene Level Statistics table is generated.
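The four scripts from the sections above can also be chained from a single driver when none of the paths inside them need editing. The following is a sketch under that assumption, not a script shipped with the repository: `build_commands` and `run_pipeline` are hypothetical names, the script filenames are copied as they appear in this README, and the notebook steps still have to be run by hand.

```python
import subprocess
import sys

# Script names exactly as they appear in the sections above, in execution order.
SCRIPTS = [
    "reproducable_01_training_predator.py",
    "reproducable_02_predicting_tcga.py",
    "reproducable_03_patient_interaction_analysis.py",
    "reproducible_04_analysis.py",
]


def build_commands(python=sys.executable, scripts=SCRIPTS):
    """Return one [interpreter, script] command per pipeline step."""
    return [[python, script] for script in scripts]


def run_pipeline(cwd=".", dry_run=False):
    """Run each step in order; check=True stops at the first failing script."""
    for command in build_commands():
        print("Running:", " ".join(command))
        if not dry_run:
            subprocess.run(command, cwd=cwd, check=True)
```

Calling `run_pipeline(dry_run=True)` prints the commands without executing them, which is a cheap way to confirm the order before starting a long training run. For a real run, `cwd` should point at the reproducible folder.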

Citation

If you find Predator useful for your research, please consider citing the following paper:

Berber, Ibrahim, Cesim Erten, and Hilal Kazan. "Predator: Predicting the impact of cancer somatic mutations on protein-protein interactions." IEEE/ACM Transactions on Computational Biology and Bioinformatics (2023).

@article{berber2023predator,
  title={Predator: Predicting the impact of cancer somatic mutations on protein-protein interactions},
  author={Berber, Ibrahim and Erten, Cesim and Kazan, Hilal},
  journal={IEEE/ACM Transactions on Computational Biology and Bioinformatics},
  year={2023},
  volume={20},
  number={5},
  pages={3163-3172},
  publisher={IEEE},
  doi={10.1109/TCBB.2023.3262119},
}

About

The official repository of PREDATOR: Predicting the impact of cancer somatic mutations on protein-protein interactions.
