CMDM-Lab/CYP450

Comprehensively-Curated Dataset of CYP450 Interactions: Enhancing Predictive Models for Drug Metabolism

Description

This repository contains source code demonstrating machine learning and deep learning on this dataset.

  • DT.py is for Decision Tree.
  • GCN_hyperopt_CV.py is for Graph Convolutional Network (GCN).
  • RF.py is for Random Forest.
  • SVC.py is for Support Vector Machine.
  • SVR.py is for Support Vector Regression.
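
The scripts themselves are not reproduced here. As a rough illustration of the kind of stratified cross-validation split that substrate/non-substrate classifiers are commonly evaluated with, here is a minimal standard-library sketch. It is not taken from the scripts above: the column names `smiles` and `label`, the 1/0 label encoding, the helper names, and every row of the toy table are assumptions made up for this example, and the real files may be laid out differently.

```python
import csv
import io
import random
from collections import defaultdict

# Toy stand-in for one isozyme's table. The column names ("smiles", "label")
# and every row below are illustrative placeholders, NOT taken from the dataset.
TOY_CSV = """smiles,label
CCO,1
CCN,0
CCC,1
CCCl,0
CCBr,1
CCF,0
CCCO,1
CCCN,0
"""


def load_rows(handle):
    """Read (smiles, label) pairs; 1 = substrate, 0 = non-substrate (assumed encoding)."""
    return [(row["smiles"], int(row["label"])) for row in csv.DictReader(handle)]


def stratified_folds(rows, k, seed=0):
    """Split rows into k folds, keeping the class ratio roughly equal in each fold."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    folds = [[] for _ in range(k)]
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        # Deal each class round-robin so every fold gets a similar share of it.
        for i, row in enumerate(group):
            folds[i % k].append(row)
    return folds


rows = load_rows(io.StringIO(TOY_CSV))
folds = stratified_folds(rows, k=4)
for i, fold in enumerate(folds):
    # Everything outside the held-out fold forms the training set.
    train = [r for j, f in enumerate(folds) if j != i for r in f]
    print(f"fold {i}: {len(train)} train / {len(fold)} test")
```

To try this on a real file, replace `io.StringIO(TOY_CSV)` with an opened file handle and adjust the column names to match. Each training and test pair could then be passed to any of the model types listed above.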

Authors

Yu-Hao Ni¹, Yu-Wen Su¹, Shaung-Chen Yang², Jia-Cheng Hong¹, Tien-Chueh Kuo¹,⁵, and Yufeng Jane Tseng¹,³,⁴,⁵,*

¹ Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei 10617, Taiwan

² School of Medicine, National Taiwan University, Taipei 10051, Taiwan

³ Department of Computer Science and Information Engineering, National Taiwan University, Taipei 10617, Taiwan

⁴ School of Pharmacy, College of Medicine, National Taiwan University, Taipei 10002, Taiwan

⁵ The Metabolomics Core Laboratory, Centers of Genomic and Precision Medicine, National Taiwan University, Taipei 10617, Taiwan

* Corresponding author: Yufeng Jane Tseng (yjtseng@csie.ntu.edu.tw)

Abstract

We collected and organized a detailed dataset of substrates and non-substrates for six principal cytochrome P450 (CYP450) isozymes, which are responsible for 90% of Phase I drug metabolism in humans. These isozymes (CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4) play critical roles in the detoxification and metabolic processing of therapeutic compounds. The meticulously assembled dataset includes interactions with approximately 2,000 compounds per enzyme, ensuring comprehensive coverage and high accuracy. Using conventional machine learning techniques alongside Graph Convolutional Networks (GCN), we developed robust models to elucidate these drug-enzyme interactions. The dataset is intended to support fields that require pharmacokinetic modeling, including drug development and toxicological studies. It provides a resource for accurate prediction of metabolic pathways, which in turn supports drug safety and efficacy assessments.

Dataset repository

DOI: 10.6084/m9.figshare.26630515