
A new feature extraction architecture for omics data that improves classification performance for most of the classifiers tested.


raktim-mondol/breast-cancer-sub-types

 
 


AFExNet: An Adversarial Autoencoder for Differentiating Breast Cancer Sub-types and Extracting Biologically Relevant Genes [Paper]

License: CC BY 4.0

In this project, we demonstrate how an adversarial autoencoder (AAE) model can be used to extract features from high-dimensional genetic (omics) data. We evaluated the model with twelve different supervised classifiers to verify the usefulness of the new features for predicting breast cancer subtypes.

For biological insight please visit: https://github.com/NeuroSyd/latent-space-discovery


Getting Started

The following instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

The following libraries are required to reproduce this project:

  1. Keras (2.0.6 is recommended; versions up to 2.1.2 also work)

  2. Keras-adversarial (0.0.3) Download Link: https://github.com/bstriner/keras-adversarial

  3. TensorFlow (1.13.1)

  4. Scikit-Learn (0.20.3)

  5. NumPy (1.16.3)

  6. Imbalanced-Learn (0.4.3)

Both Python 2.7.0 and Python 3.5.6 are supported.

Hyperparameters of classifiers are optimized using TPOT https://github.com/EpistasisLab/tpot
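The version pins above can be checked before running anything. The following is a minimal sketch, not part of the repository: `PINNED`, `parse_version`, `keras_version_ok` and `find_mismatches` are hypothetical names, and treating 2.0.6 as the lower bound for Keras is an assumption (the list only states the recommended version and the upper limit).

```python
import re

# Pins copied from the Prerequisites list; the dictionary itself is illustrative.
PINNED = {
    "tensorflow": "1.13.1",
    "scikit-learn": "0.20.3",
    "numpy": "1.16.3",
    "imbalanced-learn": "0.4.3",
}
# Assumption: 2.0.6 is used as the lower bound, 2.1.2 as the upper bound.
KERAS_MIN, KERAS_MAX = "2.0.6", "2.1.2"


def parse_version(text):
    """Turn '1.13.1' into (1, 13, 1); suffixes such as 'rc0' are dropped."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def keras_version_ok(installed):
    """True if the Keras version lies inside the assumed supported range."""
    version = parse_version(installed)
    return parse_version(KERAS_MIN) <= version <= parse_version(KERAS_MAX)


def find_mismatches(installed):
    """Compare a {name: version} dict to the pins.

    Returns a list of (name, expected, found) tuples; an empty list means
    every pinned library matches.
    """
    problems = []
    for name, expected in sorted(PINNED.items()):
        found = installed.get(name)
        if found != expected:
            problems.append((name, expected, found))
    keras = installed.get("keras")
    if keras is None or not keras_version_ok(keras):
        problems.append(("keras", KERAS_MIN + " to " + KERAS_MAX, keras))
    return problems
```

The `installed` dictionary can be filled by hand or from each module's `__version__` attribute (for example `numpy.__version__`).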

Directory Layout

├── results
│   ├── figures
│   │   ├── Here figures will be stored
│   ├── result.tsv
├── data
│   ├── data will be stored here
├── feature_extraction
│   ├── AAE
│   │    ├── 1
│   │    ├── 2
│   │    ├── 3
│   │    ├── 4
│   │    ├── 5          # five folders for five-fold cross-validation
│   │    ├── fine_tuned
│   │    │    ├── 1
│   │    │    ├── 2
│   │    │    ├── 3
│   │    │    ├── 4
│   │    │    ├── 5
│   ├──deepAE
│   ├──denoisingAE
│   ├──shallowAE
│   ├──VAE
├── README.md
├── notes.txt
└── .gitignore
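The numbered folders 1 to 5 hold one cross-validation fold each. As an illustration of how samples could be assigned to five folds, here is a minimal pure-Python sketch. It is an assumption that the split is stratified by subtype; this README does not say how the folds were built, and `stratified_folds` is a hypothetical helper, not a function from this repository. The label counts are the ones listed in the Datasets section.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign every sample index to one of n_folds folds, class by class.

    Each class is shuffled and then dealt out round-robin, so the per-class
    counts in any two folds differ by at most one.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Labels 0..3 with the patient counts from the Datasets section.
labels = [0] * 304 + [1] * 121 + [2] * 137 + [3] * 43
folds = stratified_folds(labels)

# Fold k would be the held-out part for the folder named str(k + 1).
for k, fold in enumerate(folds):
    print("folder", k + 1, "test samples:", len(fold))
```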

Make sure to keep the same directory structure for the other feature extraction methods:

.
├── ...
├── feature_extraction                   
│   ├── denoisingAE       # same for deepAE, shallowAE and VAE
│   │    ├── 1
│   │    ├── 2
│   │    ├── 3
│   │    ├── 4
│   │    ├── 5            
└── ...
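The folder tree can also be created programmatically. The sketch below is one way to do it with the standard library; `create_layout`, `METHODS` and `N_FOLDS` are hypothetical names, and the script is not part of this repository. It relies on `pathlib` with `exist_ok`, so it needs Python 3.5 or later.

```python
from pathlib import Path

# Feature extraction methods named in the directory layout.
METHODS = ["AAE", "deepAE", "denoisingAE", "shallowAE", "VAE"]
N_FOLDS = 5  # one numbered folder per cross-validation fold


def create_layout(root):
    """Create the folder tree under root and return the fold directories."""
    root = Path(root)
    (root / "results" / "figures").mkdir(parents=True, exist_ok=True)
    (root / "data").mkdir(parents=True, exist_ok=True)

    fold_dirs = []
    for method in METHODS:
        for fold in range(1, N_FOLDS + 1):
            fold_dirs.append(root / "feature_extraction" / method / str(fold))
    # Only the AAE folder has an extra fine_tuned subtree.
    for fold in range(1, N_FOLDS + 1):
        fold_dirs.append(
            root / "feature_extraction" / "AAE" / "fine_tuned" / str(fold)
        )

    for directory in fold_dirs:
        directory.mkdir(parents=True, exist_ok=True)
    return fold_dirs
```

Calling `create_layout(".")` from the project root leaves existing folders untouched and only adds the missing ones.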

Usage

Run the following to train and fine-tune the autoencoder:

main.py

Run the following when the model has already been fine-tuned:

without_fine_tuning.py

Benchmarking Code

import timeit

import psutil

start_time = timeit.default_timer()

# ....................
# .... your code .....
# ....................

# Stop the clock once the code under test has finished
end_time = timeit.default_timer()

###### COMPUTATION TIME ########
elapsed = end_time - start_time
print('Wall Clock Time')
print(elapsed, 'Sec')
# Split the elapsed time into whole minutes and remaining seconds
minutes, seconds = divmod(elapsed, 60)
print(minutes, 'Minutes', seconds, 'Seconds')

########  CPU USAGE #######
# System-wide CPU utilisation since the previous cpu_percent() call
# (or since psutil was imported if this is the first call)
print('CPU Usage')
print(psutil.cpu_percent(), '%')
print('THE END')
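If several parts of a script need timing, the same measurement can be wrapped in a context manager. This is a sketch using only the standard library; `wall_clock` and `split_minutes` are hypothetical helpers that are not part of this repository.

```python
import timeit
from contextlib import contextmanager


def split_minutes(elapsed):
    """Split a duration in seconds into (whole minutes, remaining seconds)."""
    return divmod(elapsed, 60)


@contextmanager
def wall_clock(label="block"):
    """Time the enclosed block and print the result when it exits.

    The yielded dict receives the elapsed time under the key 'seconds'.
    """
    result = {}
    start = timeit.default_timer()
    try:
        yield result
    finally:
        elapsed = timeit.default_timer() - start
        result["seconds"] = elapsed
        minutes, seconds = split_minutes(elapsed)
        print("%s: %d Minutes %.2f Seconds" % (label, minutes, seconds))


# Usage: wrap any piece of code that should be timed.
with wall_clock("toy workload") as timing:
    total = sum(i * i for i in range(100000))
```

After the `with` block, `timing["seconds"]` holds the wall clock time for that block, so several blocks can be timed independently in one run.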

To measure memory usage, follow the instructions below:

Install the memory-profiler library (https://pypi.org/project/memory-profiler/), then run the following command:

mprof run main.py

Finally, plot the memory usage by running:

mprof plot
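When installing memory-profiler is not an option, a rough figure can be obtained with the standard-library `tracemalloc` module (Python 3.4 or later). This is a swapped-in alternative, not what `mprof` does: `tracemalloc` only counts memory allocated through Python's allocator, not the full process footprint, so the numbers will differ. `peak_memory_mib` is a hypothetical helper.

```python
import tracemalloc


def peak_memory_mib(func, *args, **kwargs):
    """Run func and return (its result, peak traced memory in MiB).

    Only allocations made through Python's memory allocator are counted.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak_bytes / (1024.0 * 1024.0)


def build_list(n):
    """Toy workload: allocate a list with n entries."""
    return [0] * n


values, peak = peak_memory_mib(build_list, 1000000)
print("Peak traced memory: %.2f MiB" % peak)
```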

Proposed Architecture

[Figure: AFExNet architecture]

Datasets

Breast Invasive Carcinoma (BRCA)

| Molecular Subtype | Number of Patients | Label |
|---|---|---|
| Luminal A | 304 | 0 |
| Luminal B | 121 | 1 |
| Basal & Triple Negative | 137 | 2 |
| HER2 Positive | 43 | 3 |

| Total Number of Samples (Patients) | Total Number of Features (Genes) |
|---|---|
| 605 | 20439 |
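The subtype counts are clearly imbalanced. The sketch below computes each class's share of the data and the inverse-frequency weight n_samples / (n_classes * count), which is the heuristic scikit-learn applies for `class_weight="balanced"`. This README does not state how imbalance was handled in the experiments, so the weights are illustrative only; `class_shares` and `balanced_weights` are hypothetical helpers.

```python
# Patient counts per label, copied from the table above:
# 0 = Luminal A, 1 = Luminal B, 2 = Basal & Triple Negative, 3 = HER2 Positive
COUNTS = {0: 304, 1: 121, 2: 137, 3: 43}


def class_shares(counts):
    """Fraction of all samples that belongs to each class."""
    total = float(sum(counts.values()))
    return {label: n / total for label, n in counts.items()}


def balanced_weights(counts):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    total = float(sum(counts.values()))
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


shares = class_shares(COUNTS)
weights = balanced_weights(COUNTS)
for label in sorted(COUNTS):
    print("label %d: share %.3f, weight %.3f"
          % (label, shares[label], weights[label]))
```

With these weights, every class contributes the same total weight, so the rarest subtype (label 3) is not drowned out by the most common one (label 0).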

Contribution

If you want to contribute to this project and make it better, your help is very welcome. When contributing to this repository please make a clean pull request.

Acknowledgments

Tech Stack

[Image: tech stack banner]

Cite Us:

If you find this code useful in your research, please consider citing:

@ARTICLE{9378938,
  author={R. K. {Mondol} and N. D. {Truong} and M. {Reza} and S. {Ippolito} and E. {Ebrahimie} and O. {Kavehei}},
  journal={IEEE/ACM Transactions on Computational Biology and Bioinformatics}, 
  title={AFExNet: An Adversarial Autoencoder for Differentiating Breast Cancer Sub-types and Extracting Biologically Relevant Genes}, 
  year={2021},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TCBB.2021.3066086}}
