Project in Machine Learning (CS-433)
EPFL, 2019
We employ unsupervised machine learning techniques to cluster subtypes of estrogen receptor-positive breast cancer, the most common variant worldwide. The clustering is based on hormone responses obtained from in vivo models of patient-derived xenografts. Our results facilitate more targeted treatment of patients, responding to the urgent need for personalized medicine in breast cancer.
- `results/` directory is for output and plots
- `scripts/` directory contains all the code
- `constants.py` includes constants used throughout the project
- `helpers.py` includes various helper functions
- `load.py` includes functions for loading and manipulating the data
- `plots.py` includes plotting functions
- `clustering_helpers.py` includes helper functions used for the cluster analysis
- `data_analysis.ipynb` is the Jupyter Notebook file that includes the exploratory data analysis and visualizations
- `cluster_analysis.ipynb` is the Jupyter Notebook file that includes the cluster analysis
- Data handling and ML libraries:
- Plotting libraries:
- Clone or fork the repository
- Download the data and add the `data/` folder to the root of the project
- Install Jupyter Notebook
- Install the abovementioned libraries
- Run `data_analysis.ipynb` to reproduce the data analysis results
- Run `cluster_analysis.ipynb` to reproduce the cluster analysis results
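
The last two steps can also be run without opening the notebook interface. The following is a minimal sketch, not part of the repository: it checks that the `data/` folder is in place and then executes both notebooks in order through the `jupyter nbconvert` command-line tool. The function names `build_commands` and `run_all`, and the `notebook_dir` parameter, are hypothetical. The location of the notebooks relative to the project root is an assumption, so adjust `notebook_dir` to match the actual layout.

```python
import shutil
import subprocess
from pathlib import Path

# Notebook names as given in this README, in the order they should be run.
NOTEBOOKS = ["data_analysis.ipynb", "cluster_analysis.ipynb"]


def build_commands(notebook_dir="."):
    """Return one `jupyter nbconvert` command per notebook, in run order.

    `notebook_dir` is an assumption: the folder, relative to the project
    root, that holds the notebooks.
    """
    return [
        [
            "jupyter", "nbconvert",
            "--to", "notebook",
            "--execute",   # run every cell
            "--inplace",   # write the outputs back into the same file
            str(Path(notebook_dir) / name),
        ]
        for name in NOTEBOOKS
    ]


def run_all(project_root=".", notebook_dir="."):
    """Check the prerequisites, then execute both notebooks."""
    root = Path(project_root)
    if not (root / "data").is_dir():
        raise FileNotFoundError("data/ folder not found in the project root")
    if shutil.which("jupyter") is None:
        raise RuntimeError("Jupyter is not installed or not on PATH")
    for cmd in build_commands(notebook_dir):
        # check=True stops at the first notebook that fails to execute.
        subprocess.run(cmd, cwd=root, check=True)
```

Calling `run_all()` from the project root executes both notebooks. If the notebooks are kept in a subfolder, pass it explicitly, for example `run_all(notebook_dir="scripts")`.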
- Lisa Dratva, lisa.dratva@epfl.ch
- Michal Pleskowicz, michal.pleskowicz@epfl.ch
- Valentin Oliver Loftsson, valentin.loftsson@epfl.ch
This project is licensed under the MIT License - see the LICENSE file for details.
We thank Fabio De Martino, our supervisor at the BRISKEN lab, for his constant guidance and support throughout the learning process.