Hearing the stories of people suffering from alzheimer is always heartbreaking. Our goal is to make machine learning beneficial for everybody.
a drug is biological or chemical entity that can modulate the course of a desease state by interacting with its target protein
- biological entity is composed of proteins or peptide. (e.g. antibodies)
- chemical entity is composed of compounds aka molecules. (e.g. small molecules)
Bioinformatics is a discipline entailing the useof computational approaches to analyze biological data.
- Analyze and compare genes, proteins and genomes.
- Exploring structures and functions of biomolecules (DNA, protein, lipid and carbohydrate).
- explore network biology and letabolic pathways.
streamlit-app-2022-03-31-21-03-69.veed.ioX2.mp4
in this subsection we will emphasise the role each directory plays in thos repository.
notebooks
contains jupyter notebooks for all the hooks a professional model should pass through, from data gathering, EDA, approving or denying hypothesis, model training and models comparison.data
contains data processed at each step.figures
contains saved figures that could be useful for scientists writing their report.toolkits
contains required files to run PaDEL descriptorsbioactivity-prediction-app
here resides the webapp, it summarizes all efforts done in other directories.
-
Prerequisites
Make sure you have:
-
Serializing the model
The size of the serialized model was relatively too large so you have to generate it by yourself.
Under
bioactivity-prediction-app
directory, open the following Jupyter Notebookbioactivity_prediction_model.ipynb
and run it with yourconda environment
.You will remark that a new file,
bioactivity_prediction_model.ipynb
, appears. -
Launching application
Under
bioactivity-prediction-app
directory, open thebash terminal
and execute the following commands.- create a virtual environment
python3 -m venv alzheimer-drug-discovery
- activate the virtual environment
source alzheimer-drug-discovery/bin/activate
- install the projects' dependencies
pip install -r requirements.txt
- start the application
streamlit run app.py
Enjoy interacting with your application on your favorite web browser 🎉
building domain knowledge is the most fragile part of every data science project. It took me hours of research to get the fundamentals of bioinformatics. I would rather write a wiki instead but you can deep dive into the references below 👇.
ChEMBL is a manually curated database of bioactive molecules with drug-like properties. It brings together chemical, bioactivity and genomic data to aid the translation of genomic information into effective new drugs.
🙋 why are we calculating pubchem fingerprints even after calculating thelipinski descriptors ? what is the difference ?
👉 the lipinski descriptors will provide us with a set of a simple molecular descriptors that will provide us with the drug-like (oral drug) properties of the molecule (rule of five). in the other hand, thanks to the pubchem fingerprints, each molecule will be described by the unique building blocks, this will allow us to create a unique properties of the drug and that is the essence of drug discovery and drug design. the connectivity of the blocks are giving rise to the unique structure of the molecule and also the unique molecular properties. we have to find a way to rearrange the blocks in such a way that the molecule provides the most potency toward the target protein while also being safe and not so toxic.
- Alzheimer's disease - plaques, tangles, causes, symptoms & pathology
- What is Alzheimer's disease? - TED-Ed
- Efficacy of acetylcholinesterase inhibitors in Alzheimer's disease
- Treatment of dementia and Alzheimer's disease l Acetylcholinesterase inhibitors
- An Introduction to Computational Drug Discovery
- Computational Drug Discovery: Machine Learning for Making Sense of Big Data in Drug Discovery
- Probing the origins of human acetylcholinesterase inhibition via QSAR modeling and molecular docking
- Machine Learning for Drug Discovery (Explained in 2 minutes)
Huge thanks to Mr. Chanin Nantasenamat for making this project possible through his clear explanation whether at the algorithmic level or at the domain-knowledge level 🙏