AFExNet: An Adversarial Autoencoder for Differentiating Breast Cancer Sub-types and Extracting Biologically Relevant Genes [Paper]

In this project, we demonstrate how an adversarial autoencoder (AAE) model can be used to extract features from high-dimensional genetic (omics) data. We evaluated the model with twelve different supervised classifiers to verify the usefulness of the extracted features for predicting breast cancer subtypes.

For biological insight please visit: https://github.com/NeuroSyd/latent-space-discovery

Getting Started

The following instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

The following libraries are required to reproduce this project:

Keras (2.0.6 recommended; works up to 2.1.2)
Keras-adversarial (0.0.3), download link: https://github.com/bstriner/keras-adversarial
Tensorflow (1.13.1)
Scikit-Learn (0.20.3)
Numpy (1.16.3)
Imbalanced-Learn (0.4.3)

Both Python 2.7.0 and Python 3.5.6 are supported.

Classifier hyperparameters are optimized using TPOT: https://github.com/EpistasisLab/tpot
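
To confirm that an environment matches the versions listed above, a short check such as the following may help. It is a sketch only: it assumes Python 3.8 or newer for `importlib.metadata` (newer than the interpreters listed above), the distribution names in `EXPECTED` are assumed PyPI-style names, and `installed_version` and `check_versions` are hypothetical helpers that are not part of this repository.

```python
# Sketch: compare installed package versions with the versions listed above.
# Assumes Python 3.8+; the distribution names below are assumptions.
from importlib import metadata

# Versions copied from the prerequisites list.
EXPECTED = {
    "Keras": "2.0.6",
    "keras-adversarial": "0.0.3",
    "tensorflow": "1.13.1",
    "scikit-learn": "0.20.3",
    "numpy": "1.16.3",
    "imbalanced-learn": "0.4.3",
}


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected=EXPECTED):
    """Map each package name to (expected, installed, exact_match)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_versions().items():
        status = "ok" if ok else "differs"
        print("{:<18} expected {:<8} installed {!s:<10} {}".format(
            name, wanted, found, status))
```

The comparison is an exact string match, so a Keras version between 2.0.6 and 2.1.2 is reported as "differs" even though it is stated to work.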

Directory Layout

├── results
│   ├── figures
│   │   ├── Here figures will be stored
│   ├── result.tsv
├── data
│   ├── data will be stored here
├── feature_extraction
│   ├── AAE
│   │    ├── 1
│   │    ├── 2
│   │    ├── 3
│   │    ├── 4
│   │    ├── 5          # five folders for five-fold cross-validation
│   │    ├── fine_tuned
│   │    │    ├── 1
│   │    │    ├── 2
│   │    │    ├── 3
│   │    │    ├── 4
│   │    │    ├── 5
│   ├── deepAE
│   ├── denoisingAE
│   ├── shallowAE
│   ├── VAE
├── README.md
├── notes.txt
└── .gitignore

Keep the same directory structure for the other feature extraction methods:

.
├── ...
├── feature_extraction                   
│   ├── denoisingAE       # same for deepAE, shallowAE and VAE
│   │    ├── 1
│   │    ├── 2
│   │    ├── 3
│   │    ├── 4
│   │    ├── 5            
└── ...
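
The layout above can be created up front with a few lines of Python. This is a minimal sketch, assuming the folder names shown in the trees; `create_layout` is a hypothetical helper, and the repository scripts may create or expect these folders differently.

```python
# Sketch: create the directory skeleton shown above.
from pathlib import Path

# Folder names taken from the directory layout.
METHODS = ["AAE", "deepAE", "denoisingAE", "shallowAE", "VAE"]
N_FOLDS = 5  # one sub-folder per cross-validation fold


def create_layout(root):
    """Create the results, data and per-fold folders under root.

    Returns the list of per-fold directories that now exist.
    """
    root = Path(root)
    (root / "results" / "figures").mkdir(parents=True, exist_ok=True)
    (root / "data").mkdir(parents=True, exist_ok=True)

    fold_dirs = []
    for method in METHODS:
        for fold in range(1, N_FOLDS + 1):
            fold_dirs.append(root / "feature_extraction" / method / str(fold))
    # The AAE folder additionally holds fine_tuned/1 ... fine_tuned/5.
    for fold in range(1, N_FOLDS + 1):
        fold_dirs.append(
            root / "feature_extraction" / "AAE" / "fine_tuned" / str(fold))

    for fold_dir in fold_dirs:
        fold_dir.mkdir(parents=True, exist_ok=True)
    return fold_dirs


if __name__ == "__main__":
    for path in create_layout("."):
        print(path)
```

Because `exist_ok=True` is used, the function can be run repeatedly without touching folders that already exist.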

Usage

Run the following to train and fine-tune the autoencoder:

python main.py

Run the following when the model has already been fine-tuned:

python without_fine_tuning.py

Benchmarking Code

import os
import timeit

import psutil

start_time = timeit.default_timer()

# ... your code goes here ...

end_time = timeit.default_timer()

###### COMPUTATION TIME ########
elapsed = end_time - start_time
print('Wall Clock Time')
print(elapsed, 'Sec')
# Split the elapsed time into whole minutes and remaining seconds.
minutes, seconds = divmod(elapsed, 60)
print(int(minutes), 'Minutes', seconds, 'Seconds')

########  CPU USAGE #######
# System-wide CPU utilisation as reported by psutil.
print('CPU Usage')
print(psutil.cpu_percent(), '%')
print('THE END')
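
When several parts of a script need timing, the same wall-clock measurement can be wrapped in a context manager so that the start and end calls cannot get out of step. The following is a sketch using only the standard library; `wall_clock` and `split_minutes_seconds` are hypothetical names and are not part of the repository.

```python
# Sketch: reusable wall-clock timer built on timeit.default_timer.
import timeit
from contextlib import contextmanager


def split_minutes_seconds(elapsed):
    """Split a duration in seconds into (whole minutes, remaining seconds)."""
    minutes, seconds = divmod(elapsed, 60)
    return int(minutes), seconds


@contextmanager
def wall_clock(label="block"):
    """Time the enclosed block; the result dict is filled in on exit."""
    result = {"label": label, "seconds": None}
    start = timeit.default_timer()
    try:
        yield result
    finally:
        # Recorded even if the timed block raises an exception.
        result["seconds"] = timeit.default_timer() - start


if __name__ == "__main__":
    with wall_clock("demo") as timing:
        sum(i * i for i in range(100000))  # stand-in for the real workload
    minutes, seconds = split_minutes_seconds(timing["seconds"])
    print(timing["label"], minutes, "Minutes", seconds, "Seconds")
```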

To measure memory usage, follow the instructions below.

Install the memory profiler library (https://pypi.org/project/memory-profiler/), then run the following command:

mprof run main.py

Finally, plot the memory usage by running:

mprof plot
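
If installing an extra package is not an option, the standard-library `tracemalloc` module gives a rough peak figure for a single function call. This is a sketch under a notable limitation: `tracemalloc` only sees memory allocated through Python's allocator, so memory held by native code (for example inside TensorFlow) is not counted, and the result is not directly comparable to what `mprof` reports. `peak_python_memory` is a hypothetical helper.

```python
# Sketch: peak Python-level memory of one call, measured with tracemalloc.
import tracemalloc


def peak_python_memory(func, *args, **kwargs):
    """Run func and return (its result, peak traced memory in bytes)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    # Stand-in workload: build a list of one million integers.
    _, peak = peak_python_memory(lambda: list(range(1000000)))
    print("Peak traced memory: {:.1f} MiB".format(peak / (1024.0 * 1024.0)))
```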

Proposed Architecture

Datasets

cBioPortal - Cancer Genomics Datasets
Breast Invasive Carcinoma (TCGA, Cell 2015) - Clinical information is used to label various molecular subtypes

Breast Invasive Carcinoma (BRCA)

Molecular Subtypes	Number of Patients	Label
Luminal A	304	0
Luminal B	121	1
Basal & Triple Negative	137	2
HER2 Positive	43	3
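
The table shows that the classes are imbalanced: Luminal A has about seven times as many patients as HER2 positive. The sketch below encodes the table and computes the common "balanced" class weights, `n_samples / (n_classes * class_count)`. Whether the repository uses class weights, resampling with Imbalanced-Learn, or another strategy is not stated here, so treat this purely as an illustration; the function names are hypothetical.

```python
# Sketch: subtype-to-label mapping and "balanced" class weights.
# Rows copied from the table above: (subtype, number of patients, label).
SUBTYPES = [
    ("Luminal A", 304, 0),
    ("Luminal B", 121, 1),
    ("Basal & Triple Negative", 137, 2),
    ("HER2 Positive", 43, 3),
]


def label_map():
    """Return {subtype name: integer label}."""
    return {name: label for name, _count, label in SUBTYPES}


def class_weights():
    """Return {label: n_samples / (n_classes * class_count)}."""
    n_samples = sum(count for _name, count, _label in SUBTYPES)
    n_classes = len(SUBTYPES)
    return {
        label: n_samples / float(n_classes * count)
        for _name, count, label in SUBTYPES
    }


if __name__ == "__main__":
    weights = class_weights()
    for name, count, label in SUBTYPES:
        print("{:<24} n={:<4} label={} weight={:.3f}".format(
            name, count, label, weights[label]))
```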

Total Number of Samples (Patients)	Total Number of Features (Genes)
605	20439
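
With 605 samples and unequal class sizes, a five-fold split is usually stratified so that every fold keeps roughly the same subtype proportions. The following pure-Python sketch illustrates the idea by shuffling the indices of each class and dealing them out to the folds in turn. It is an assumption that stratification is wanted here; the repository's own splitting code may differ, and in practice scikit-learn's `StratifiedKFold` would normally be used. `example_labels` and `stratified_folds` are hypothetical names.

```python
# Sketch: stratified five-fold assignment using only the standard library.
import random
from collections import defaultdict


def example_labels():
    """Labels matching the class counts in the subtype table (605 total)."""
    return [0] * 304 + [1] * 121 + [2] * 137 + [3] * 43


def stratified_folds(labels, n_folds=5, seed=0):
    """Return n_folds lists of sample indices with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class out to the folds in turn.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


if __name__ == "__main__":
    labels = example_labels()
    for number, fold in enumerate(stratified_folds(labels), start=1):
        counts = [sum(1 for i in fold if labels[i] == c) for c in range(4)]
        print("fold", number, "size", len(fold), "per-class counts", counts)
```

Each fold serves once as the held-out set while the remaining four are used for training, which corresponds to the five numbered folders in the directory layout.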

Details about Molecular Subtypes of Breast Cancer

Contribution

Contributions that improve this project are very welcome. When contributing to this repository, please submit a clean pull request.

Acknowledgments

The proposed architecture is inspired by https://github.com/bstriner/keras-adversarial

Tech Stack

Cite Us:

If you find this code useful in your research, please consider citing:

@ARTICLE{9378938,
  author={R. K. {Mondol} and N. D. {Truong} and M. {Reza} and S. {Ippolito} and E. {Ebrahimie} and O. {Kavehei}},
  journal={IEEE/ACM Transactions on Computational Biology and Bioinformatics}, 
  title={AFExNet: An Adversarial Autoencoder for Differentiating Breast Cancer Sub-types and Extracting Biologically Relevant Genes}, 
  year={2021},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TCBB.2021.3066086}}
