MrPhil/RareDDIE


Predicting rare drug-drug interaction events with dual-granular structure-adaptive and pair variational representation

RareDDIE is a meta-learning-based predictor of drug-drug interaction events (DDIEs). Until the article is published, this project contains only the data; the reproduction code is provided in the submission document.

System requirements

Tested on Ubuntu 16.04, CentOS 7, and Windows 10 with Python 3.7 on one NVIDIA RTX 4080Ti GPU.

Installation

After downloading the code and data, execute the following command to install all dependencies. This may take some time.

pip install -r requirements.txt

Quick Demo & Instructions for use

The Independent, RareDDIE, and ZetaDDIE directories provide code for reproducing our results.

Result Reproduction

  • Run tester_struc_drugbank.py and tester_struc_mdf.py to reproduce the reported results.

Model Training

  • Run trainer_structure_acc_fp_neigh_VAE_GAN_struc.py to train the model on the standard dataset1.
  • Run trainer_structure_acc_fp_neigh(dataset2)_VAE_GAN_struc.py to train the model on the standard dataset2.

Preprocessing custom datasets

Users can preprocess their own datasets for use with our models. A step-by-step example is provided in the toy example directory.

Data Preparation

Users should first prepare their dataset, including interaction event data, drug data, and SMILES representations of drugs. The expected formats follow those in toy.data, druglist.csv, and drug_smiles.csv.
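Before running the pipeline, it can help to confirm that every drug referenced in the interaction data has a SMILES entry. The sketch below is illustrative only: it assumes events are (drug, drug, event) triples and that the SMILES table maps drug IDs to SMILES strings. The real column layouts are defined by toy.data, druglist.csv, and drug_smiles.csv, and the function name find_drugs_without_smiles and the drug IDs are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed in-memory layout; the actual file formats are those of
# toy.data, druglist.csv, and drug_smiles.csv in the toy example.
Event = Tuple[str, str, str]  # (drug_a, drug_b, event_label)


def find_drugs_without_smiles(events: Iterable[Event],
                              smiles: Dict[str, str]) -> List[str]:
    """Return, sorted, the drugs used in events that lack a non-empty SMILES."""
    used = set()
    for drug_a, drug_b, _event in events:
        used.add(drug_a)
        used.add(drug_b)
    return sorted(d for d in used if not smiles.get(d, "").strip())


# Made-up identifiers, for illustration only.
events = [("DB0001", "DB0002", "event_1"), ("DB0002", "DB0003", "event_2")]
smiles = {"DB0001": "CCO", "DB0002": "CC(=O)O"}

missing = find_drugs_without_smiles(events, smiles)
print(missing)  # drugs that still need a SMILES string
```

Any drug reported here would have no structural features in the later feature-generation step, so it is presumably worth resolving before proceeding.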

Pipeline Execution

  1. Generate interaction event input files

    • Run 1construct_task.py to generate the event input files:

      • train_tasks.json: Common events for training
      • dev_tasks.json: Common events for validation
      • test_tasks.json: Fewer events for testing
      • test2_tasks.json: Rare events for testing
  2. Generate DDIE relationship input files

    • Run 2data_(get_e1rel_e2_and_rel2candidates).py to produce:

      • e1rel_e2.json: Drug-drug interaction event relationships
      • rel2candidates.json: Relationship of candidates
  3. Integrate the background graph

    • Replace the default path_graph with a prebuilt background graph.
    • Add dti_entity.csv and dti_rel.csv to define entities and relationships in the graph.
    • Run 3add_entity_and_rel.py to incorporate these into the dataset.
  4. Generate drug feature representations

    • Copy the prepared SMILES file to fp/data/.
    • Run save_features.py to generate the feature file morgan_toy_dataset.npz in the features directory.
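To make steps 1 and 2 more concrete, here is a minimal sketch of what such preprocessing might look like. It is not the repository's code. It assumes triples of the form (drug_1, event, drug_2), hypothetical frequency thresholds common_min and rare_max, and the key convention "drug_1 + event" that is common in few-shot knowledge-graph code; the actual scripts may differ on all three points.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (drug_1, event, drug_2) -- assumed layout


def split_by_frequency(triples: Iterable[Triple], common_min: int,
                       rare_max: int) -> Dict[str, Dict[str, List[List[str]]]]:
    """Group triples by event, then bucket events by their triple count.

    common_min and rare_max are hypothetical thresholds: events with at
    least common_min triples are "common", events with at most rare_max
    triples are "rare", and everything in between is "fewer".
    """
    by_event: Dict[str, List[List[str]]] = defaultdict(list)
    for d1, event, d2 in triples:
        by_event[event].append([d1, event, d2])

    buckets: Dict[str, Dict[str, List[List[str]]]] = {
        "common": {}, "fewer": {}, "rare": {}}
    for event, items in by_event.items():
        if len(items) >= common_min:
            buckets["common"][event] = items
        elif len(items) <= rare_max:
            buckets["rare"][event] = items
        else:
            buckets["fewer"][event] = items
    return buckets


def build_e1rel_e2(triples: Iterable[Triple]) -> Dict[str, List[str]]:
    """Map the key drug_1 + event to every drug_2 observed with that pair."""
    index: Dict[str, List[str]] = defaultdict(list)
    for d1, event, d2 in triples:
        index[d1 + event].append(d2)
    return dict(index)


def build_rel2candidates(triples: Iterable[Triple]) -> Dict[str, List[str]]:
    """Map each event to the sorted set of drugs seen as its second drug."""
    index = defaultdict(set)
    for _d1, event, d2 in triples:
        index[event].add(d2)
    return {event: sorted(drugs) for event, drugs in index.items()}


# Made-up triples, for illustration only.
triples = [("A", "e1", "B"), ("A", "e1", "C"), ("B", "e1", "D"),
           ("A", "e2", "B"), ("C", "e2", "D"), ("A", "e3", "D")]

buckets = split_by_frequency(triples, common_min=3, rare_max=1)
e1rel_e2 = build_e1rel_e2(triples)
rel2candidates = build_rel2candidates(triples)
print(json.dumps(buckets["rare"]))  # JSON-serializable, like the task files
```

In this sketch the "common" bucket would still need to be divided into training and validation events to mirror train_tasks.json and dev_tasks.json; how the repository performs that split is not shown here.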

Training

A training example for RareDDIE is provided in the toy example directory.

Run

python trainer_structure_acc_fp_neigh_VAE_struc.py --dataset toy_dataset --few 10 --train_few 10 --batch_size 256

  • If pre-trained background graph embeddings are available (e.g., DRKG_TransE_entity.npy and DRKG_TransE_relation.npy), construct ent2embids and relation2embids files that map every dataset and background entity/relation to its feature index.
  • For entities or relations without pre-trained features, set the corresponding index to -1.
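The index mapping described above can be sketched as follows. This is an assumption-laden illustration: the on-disk format of ent2embids and relation2embids is not specified here, the entity names are invented, and build_embedding_index is a hypothetical helper. It only demonstrates the stated rule that names without pre-trained features receive -1.

```python
from typing import Dict, Iterable


def build_embedding_index(names: Iterable[str],
                          pretrained_rows: Dict[str, int]) -> Dict[str, int]:
    """Map each name to its row in the pre-trained matrix, or -1 if absent."""
    return {name: pretrained_rows.get(name, -1) for name in names}


# Hypothetical row order of a pre-trained entity embedding matrix.
pretrained_entities = ["Compound::DB0001", "Compound::DB0002", "Gene::1234"]
pretrained_rows = {name: row for row, name in enumerate(pretrained_entities)}

# Hypothetical entities appearing in the dataset and background graph.
dataset_entities = ["Compound::DB0001", "Compound::DB9999", "Gene::1234"]

ent2embids = build_embedding_index(dataset_entities, pretrained_rows)
uncovered = [name for name, row in ent2embids.items() if row == -1]
print(ent2embids)
print(uncovered)  # entities that have no pre-trained embedding
```

The same helper would apply to relations by passing relation names and the row order of the relation embedding matrix.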

Run

python trainer_structure_acc_fp_neigh_VAE_struc.py --dataset toy_dataset --few 10 --train_few 10 --batch_size 256 --random_embed False

To run ZetaDDIE, simply replace the preprocessed dataset directory with the appropriate data.

Test

A test example for RareDDIE is provided in the toy example directory.

Run

python tester_struc_dataset.py

Independent dataset testing

Users can also apply a model trained on their own standard dataset directly to an independent dataset, enabling cross-domain prediction. We provide code both for reproducing our results and for running on a user's own dataset.

Independent Dataset Prediction

  1. Prepare the independent dataset following the same preprocessing steps described earlier. Copy the processed dataset into the Independent directory (e.g., twoside).
  2. Copy the trained standard dataset and trained model into the Independent directory (e.g., dataset1 and models).
  3. Execute the prediction script to evaluate the model’s cross-domain performance.

Run

python tester_cross_domain.py
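Since the prediction script depends on several items being copied into place first, a quick pre-flight check can catch a missing one. The sketch below assumes the example names mentioned in the steps above (twoside, dataset1, models) sit directly under the Independent directory; the helper missing_inputs is hypothetical, and the demonstration builds a throwaway directory rather than touching a real checkout.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def missing_inputs(root: Path, required: Iterable[str]) -> List[str]:
    """Return the required entries that do not exist under root."""
    return [name for name in required if not (root / name).exists()]


# Demonstration on a temporary layout in which "models" was forgotten.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "Independent"
    (root / "twoside").mkdir(parents=True)
    (root / "dataset1").mkdir()
    missing = missing_inputs(root, ["twoside", "dataset1", "models"])

print(missing)  # entries to copy before running the prediction script
```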

Reference

All datasets are derived from the following works [1-5] and databases [6-7].

  1. Lin S, et al. MDF-SA-DDI: predicting drug–drug interaction events based on multi-source drug fusion, multi-source feature fusion and transformer self-attention mechanism. Brief Bioinform 23, bbab421 (2022).
  2. Nyamabo AK, Yu H, Liu Z, Shi J-Y. Drug–drug interaction prediction with learnable size-adaptive molecular substructures. Brief Bioinform 23, bbab441 (2022).
  3. Preuer K, Lewis RP, Hochreiter S, Bender A, Bulusu KC, Klambauer G. DeepSynergy: predicting anti-cancer drug synergy with Deep Learning. Bioinformatics 34, 1538-1546 (2018).
  4. Nair NU, et al. A landscape of response to drug combinations in non-small cell lung cancer. Nature Communications 14, 3830 (2023).
  5. Ma T, Lin X, Song B, Yu PS, Zeng X. KG-MTL: knowledge graph enhanced multi-task learning for molecular interaction. IEEE Transactions on Knowledge and Data Engineering 35, 7068-7081 (2022).
  6. Wishart DS, et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res 46, D1074-D1082 (2018).
  7. Tatonetti NP, Ye PP, Daneshjou R, Altman RB. Data-driven prediction of drug effects and interactions. Sci Transl Med 4, 125ra131-125ra131 (2012).
