Predator

Predicting the Impact of Cancer Somatic Mutations on Protein-Protein Interactions

Predator is a computational tool that predicts the effects of mutations on protein-protein interactions by classifying them as disrupting or nondisrupting. It also provides a comprehensive analysis of candidate cancer-associated genes, their most frequently disrupted interaction partners, cancer patients, and several cancer cohorts in the TCGA project.

For more information, please refer to the article, which can be found here.

Reproducibility

Below are the steps to obtain the results in the paper.

Preparing the Predator environment

Download the repository and move to the reproducible folder.
cd \Predator\src\reproducible
Update conda in the base environment.
conda update conda -n base -y
Add the conda-forge channel, which is needed for package installation.
conda config --append channels conda-forge
Create a new environment named predator with the specified Python version and install the required packages.
conda create python=3.8.13 --name predator --file requirements.txt -y
Activate the new environment.
conda activate predator
Register the new environment as a Jupyter kernel named Predator.
python -m ipykernel install --user --name predator --display-name "Predator"
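
As a quick sanity check after these steps, the following minimal sketch can be run inside the activated environment. It only compares the interpreter version with the one pinned in the conda create command and checks that ipykernel can be found. The helper names version_matches and missing_modules are illustrative and are not part of the repository.

```python
import importlib.util
import sys

# Assumption: the version pinned in the `conda create` command above.
EXPECTED_PYTHON = (3, 8, 13)


def version_matches(actual, expected=EXPECTED_PYTHON):
    """Return True when the leading components of `actual` equal `expected`."""
    return tuple(actual[: len(expected)]) == tuple(expected)


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version OK:", version_matches(sys.version_info))
    # ipykernel is the only package named in the steps above; the contents of
    # requirements.txt are not reproduced here.
    print("Missing modules:", missing_modules(["ipykernel"]))
```

If the version check prints False, the predator environment is probably not the active one.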

Training the Predator

The trained model can be found here. To train from scratch, run the following command:

python reproducable_01_training_predator.py

The newly trained model will be saved in the src\PredatorModels\PredatorModel_<date>\<hash> directory. Additionally, a newly executed Jupyter notebook, Reproduced_PredatorStudyModel.ipynb, will be created in the reproducible folder.
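
Because each training run writes to a new PredatorModel_<date>\<hash> directory, it can be handy to locate the most recent one programmatically. The sketch below assumes only the directory layout described above and picks the directory with the latest modification time. The function name latest_model_dir is hypothetical, and the repository may offer its own way to select a model.

```python
from pathlib import Path


def latest_model_dir(models_root):
    """Return the most recently modified <hash> directory, or None if none exist."""
    root = Path(models_root)
    # Matches <models_root>/PredatorModel_<date>/<hash>, the layout described above.
    candidates = [path for path in root.glob("PredatorModel_*/*") if path.is_dir()]
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

For example, latest_model_dir("src/PredatorModels") returns a Path to the newest model directory, or None when no model has been trained yet.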

Predictions on the TCGA Mutation Datasets

The trained Predator model can be applied to the TCGA mutation datasets with reproducable_02_predicting_tcga.py. The script also allows you to select the model used for the prediction task. Run the following command:

python reproducable_02_predicting_tcga.py

This exports the prediction files to the predictions_datasets folder and creates Reproduced_PredatorStudy_<TCGA>.ipynb for each TCGA cohort.
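
To confirm that a notebook was produced for every cohort, a small check along the following lines may help. It is a sketch under assumptions: the folder that holds the notebooks and the exact text substituted for <TCGA> in the file name are not stated here, so both are passed in by the caller, and the function name missing_cohort_notebooks is hypothetical.

```python
from pathlib import Path


def missing_cohort_notebooks(folder, cohorts):
    """Return the cohorts whose Reproduced_PredatorStudy_<TCGA>.ipynb is absent."""
    folder = Path(folder)
    return [
        cohort
        for cohort in cohorts
        # Assumption: <TCGA> is replaced by the cohort name exactly as given.
        if not (folder / f"Reproduced_PredatorStudy_{cohort}.ipynb").is_file()
    ]
```

A call such as missing_cohort_notebooks(".", ["BRCA", "OV"]) uses illustrative cohort codes; substitute the cohorts you actually ran. An empty list means every expected notebook is present.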

Patient Interaction Analysis

If the paths of the prediction files do not need to be updated, generate the patient interaction analysis files with the following command:

python reproducable_03_patient_interaction_analysis.py

To use newer prediction datasets, update their paths in the script before running it. Upon completion, Excel files containing interactions and patients for each TCGA cohort should appear in the data/patient_interaction_datasets folder. In addition, an executed copy of Reproduced_Disruptive_patients_per_patient.ipynb will be created.

Analysis

Lastly, update the path in reproducible_04_analysis.py if necessary and run it with the command below:

python reproducible_04_analysis.py

Running this command creates a counts file for each TCGA cohort.
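
When no paths need to be edited between steps, the four scripts can be chained so that a failure stops the run early. The following is a minimal sketch, not part of the repository. It assumes it is started from the reproducible folder, uses the script names exactly as written in this README, and introduces a hypothetical helper named run_in_order.

```python
import subprocess
import sys

# Script names as they appear in this README, in the order they are described.
SCRIPTS = [
    "reproducable_01_training_predator.py",
    "reproducable_02_predicting_tcga.py",
    "reproducable_03_patient_interaction_analysis.py",
    "reproducible_04_analysis.py",
]


def run_in_order(scripts, cwd=".", runner=subprocess.run):
    """Run each script with the current interpreter; stop at the first failure.

    Returns the list of scripts that completed with exit code 0.
    """
    completed = []
    for script in scripts:
        result = runner([sys.executable, script], cwd=cwd)
        if result.returncode != 0:
            break
        completed.append(script)
    return completed
```

Calling run_in_order(SCRIPTS) returns the scripts that finished successfully, so a list shorter than SCRIPTS shows where the run stopped. The runner parameter exists only so the function can be exercised without launching real processes.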

Run the first part of the notebook tables/preliminary_tables_counts.ipynb as indicated.

Run the notebook analyses/PatientInteractionAnalysis/PatientInteractionAnalysis.ipynb.

Continue with the second part of tables/preliminary_tables_counts.ipynb, which generates the Gene Level Statistics table.

Citation

If you find Predator useful for your research, please consider citing the following paper:

Berber, Ibrahim, Cesim Erten, and Hilal Kazan. "Predator: Predicting the impact of cancer somatic mutations on protein-protein interactions." IEEE/ACM Transactions on Computational Biology and Bioinformatics (2023).

@article{berber2023predator,
  title={Predator: Predicting the impact of cancer somatic mutations on protein-protein interactions},
  author={Berber, Ibrahim and Erten, Cesim and Kazan, Hilal},
  journal={IEEE/ACM Transactions on Computational Biology and Bioinformatics},
  year={2023},
  volume={20},
  number={5},
  pages={3163-3172},
  publisher={IEEE},
  doi={10.1109/TCBB.2023.3262119},
}
