LigEGFR: Spatial graph embedding and molecular descriptors assisted bioactivity prediction of ligand molecules for epidermal growth factor receptor on a cell line-based dataset

LigEGFR

Puri Virakarin1,#, Natthakan Saengnil1,#, Bundit Boonyarit2,#, Jiramet Kinchagawat2, Rattasat Laotaew2, Treephop Saeteng2, Thanasan Nilsu1, Naravut Suvannang2,*, Thanyada Rungrotmongkol3,4,*, and Sarana Nutanong2,*

1 Kamnoetvidya Science Academy (KVIS), Rayong 21210, Thailand
2 School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong 21210, Thailand
3 Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok 10330, Thailand
4 Biocatalyst and Environmental Biotechnology Research Unit, Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand

# These authors contributed equally to this work.
* Corresponding authors

About

LigEGFR is a novel deep learning architecture for predicting the pIC50 of small molecules against the human epidermal growth factor receptor (EGFR) tyrosine kinase. The architecture is inspired by and adapted from the convolution spatial graph embedding layer (C-SGEL), and is built from graph convolutional networks that incorporate special molecular descriptors. The model outperformed baseline machine learning models in pIC50 prediction and achieved higher performance in hit compound classification than molecular docking and machine learning approaches. Moreover, it is the first model to employ a large-scale, non-redundant dataset to enhance the diversity of the small molecules.

We developed a user-friendly online platform, compatible with all devices, and an executable Python script to predict pIC50 and classify hit compounds. This approach opens a new route for hit and lead compound discovery in targeted lung cancer therapy, offering a strategy that may help researchers overcome major challenges in drug discovery and development and avoid the pitfalls of conventional computational methods.
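For readers unfamiliar with the predicted quantity, pIC50 is the negative base-10 logarithm of the IC50 expressed in mol/L. The minimal sketch below converts an IC50 given in nanomolar to pIC50 and applies a hit cutoff. The function names and the cutoff of 6.0 (an IC50 of 1 µM) are assumptions chosen for illustration; they are not taken from the LigEGFR code or paper.

```python
import math

# Hypothetical cutoff for illustration only: pIC50 >= 6.0 means IC50 <= 1 uM.
HIT_THRESHOLD = 6.0


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 mol/L, so -log10(x * 1e-9) = 9 - log10(x).
    return 9.0 - math.log10(ic50_nm)


def is_hit(pic50: float, threshold: float = HIT_THRESHOLD) -> bool:
    """Label a compound as a hit when its pIC50 reaches the cutoff."""
    return pic50 >= threshold


if __name__ == "__main__":
    for ic50_nm in (10.0, 500.0, 50000.0):
        pic50 = ic50_nm_to_pic50(ic50_nm)
        print(f"IC50 = {ic50_nm:g} nM, pIC50 = {pic50:.2f}, hit = {is_hit(pic50)}")
```

Because the scale is logarithmic, a tenfold drop in IC50 raises pIC50 by exactly one unit, which is why a single cutoff on pIC50 is a convenient way to separate potent compounds from weak ones.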

We provide the LigEGFR web service at https://ligegfr.vistec.ist/ and an executable Python script that can be installed via Anaconda (recommended for Linux and macOS) or Docker (recommended for Windows).

C-SGEL is a layer of the convolution spatial graph embedding network (C-SGEN) architecture. For more information, please visit https://doi.org/10.1021/acs.jcim.9b00410
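To give a rough intuition for combining a graph embedding with molecular descriptors, the toy sketch below runs one parameter-free neighbourhood-averaging step over a molecular graph, sum-pools the atom features into a molecule vector, and concatenates a descriptor vector. It is not the C-SGEL or LigEGFR implementation: a trained model would add learned weights, nonlinearities, and spatial information. All function names, the two-element atom encoding, and the choice of descriptors are assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def mean_aggregate(adjacency: Matrix, features: Matrix) -> Matrix:
    """Replace each atom's features with the mean over itself and its neighbours."""
    n = len(adjacency)
    dim = len(features[0])
    updated = []
    for i in range(n):
        # Include the atom itself (self-loop) alongside its bonded neighbours.
        group = [j for j in range(n) if j == i or adjacency[i][j]]
        updated.append(
            [sum(features[j][k] for j in group) / len(group) for k in range(dim)]
        )
    return updated


def readout(features: Matrix) -> List[float]:
    """Sum-pool atom features into a single molecule-level vector."""
    return [sum(column) for column in zip(*features)]


def embed_molecule(adjacency: Matrix, features: Matrix,
                   descriptors: List[float]) -> List[float]:
    """Concatenate the pooled graph vector with molecule-level descriptors."""
    return readout(mean_aggregate(adjacency, features)) + list(descriptors)


# Ethanol heavy-atom graph: C - C - O
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
# Toy atom encoding: [is_carbon, is_oxygen]
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Illustrative descriptors: molecular weight and topological polar surface area.
descriptors = [46.07, 20.23]

embedding = embed_molecule(adjacency, features, descriptors)

if __name__ == "__main__":
    print(embedding)
```

In a full model, the concatenated vector would be passed to further layers that regress pIC50; here it only shows how graph-derived and descriptor-derived information can end up in one feature vector.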

Citation

Please cite our paper as:

@article {Virakarin2020.12.24.423424,
	author = {Virakarin, Puri and Saengnil, Natthakan and Boonyarit, Bundit and Kinchagawat, Jiramet and Laotaew, Rattasat and Saeteng, Treephop and Nilsu, Thanasan and Suvannang, Naravut and Rungrotmongkol, Thanyada and Nutanong, Sarana},
	title = {LigEGFR: Spatial graph embedding and molecular descriptors assisted bioactivity prediction of ligand molecules for epidermal growth factor receptor on a cell line-based dataset},
	elocation-id = {2020.12.24.423424},
	year = {2020},
	doi = {10.1101/2020.12.24.423424},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2020/12/24/2020.12.24.423424.1},
	eprint = {https://www.biorxiv.org/content/early/2020/12/24/2020.12.24.423424.1.full.pdf},
	journal = {bioRxiv}
}

Contact

Bundit Boonyarit

Scalable Data Systems Lab (SCADS)
School of Information Science and Technology

Vidyasirimedhi Institute of Science and Technology (VISTEC)
Wangchan Valley, 555 Moo 1, Payupnai, Wangchan, Rayong 21210, Thailand

Email: bundit.b_s18@vistec.ac.th
