All the code for my research paper: Machine Learning Models and Neural Networks for Hepatocellular Carcinoma Microarray Diagnosis
-
Download the dataset and save it as ~/microarray-ml/db/Liver_GSE14520_U133A.csv
-
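To confirm the file landed in the expected place before opening the notebook, a quick check along these lines may help. It is only a sketch: it assumes the CSV has a single header row and one row per sample, and `describe_dataset` is a hypothetical helper that is not part of this repository.

```python
import csv
from pathlib import Path

# Location given in the instructions above.
DATASET_PATH = Path.home() / "microarray-ml" / "db" / "Liver_GSE14520_U133A.csv"


def describe_dataset(path):
    """Return (n_data_rows, n_columns) for a CSV file.

    Assumes one header row followed by one row per sample.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)              # column names
        n_rows = sum(1 for _ in reader)    # remaining rows are samples
    return n_rows, len(header)


if __name__ == "__main__":
    if DATASET_PATH.exists():
        rows, cols = describe_dataset(DATASET_PATH)
        print(f"{rows} samples, {cols} columns")
    else:
        print(f"Dataset not found at {DATASET_PATH}")
```

If the reported sample count differs from the figure quoted in the abstract, the download is probably incomplete or a different file.
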
Install dependencies
-
Run main.ipynb
- gene_optimizer.py - program that determines the genes with the best accuracy and exports the results to genes.txt
- genes.txt - file containing genes in order of accuracy
- main.ipynb - main file to run (contains comments for ease of experimentation and explanations)
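
For readers who want a feel for what "genes in order of accuracy" can mean, here is a minimal sketch of one way to rank genes individually. It is not the code in gene_optimizer.py: it swaps in a simple one-gene threshold rule, scores it on the same data it was fitted to (no cross-validation), and uses hypothetical names (`threshold_accuracy`, `rank_genes`, `GENE_A`, `GENE_B`) and toy values.

```python
def threshold_accuracy(values, labels):
    """Best in-sample accuracy of a single cut-off on one gene.

    values: expression levels, one per sample
    labels: 0/1 class labels, one per sample
    """
    n = len(values)
    pairs = sorted(zip(values, labels))
    total_pos = sum(labels)
    # Baseline: always predict the majority class.
    best = max(total_pos, n - total_pos) / n
    left_pos = 0
    for i in range(1, n):
        left_pos += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no cut-off fits between equal values
        right_pos = total_pos - left_pos
        # Rule A: samples above the cut-off are class 1.
        correct_a = (i - left_pos) + right_pos
        # Rule B is the mirror image of rule A.
        correct_b = n - correct_a
        best = max(best, correct_a / n, correct_b / n)
    return best


def rank_genes(expression, labels):
    """Return (gene, accuracy) pairs sorted from best to worst."""
    scores = [(g, threshold_accuracy(v, labels)) for g, v in expression.items()]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Toy data with hypothetical gene names; 1 = tumor, 0 = non-tumor.
toy_expression = {
    "GENE_A": [1.0, 2.0, 8.0, 9.0],
    "GENE_B": [5.0, 1.0, 4.0, 2.0],
}
toy_labels = [0, 0, 1, 1]

for gene, acc in rank_genes(toy_expression, toy_labels):
    print(gene, acc)
```

Because the accuracy here is measured in-sample, it will be optimistic; a held-out split or cross-validation would give a fairer ranking.
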
This paper examines the effectiveness of machine learning (ML) algorithms, including a neural network, on a hepatocellular carcinoma (HCC) microarray dataset. The dataset contains 358 samples and over 22,000 features (genes) and was pre-processed for ML applications. It was used extensively to compare the accuracy of various ML classification models. The paper proposes a new method: using ML models and neural networks could contribute to more affordable HCC diagnosis, and the same approach could be applied to other types of cancer. The discovery of a direct relationship between the best-performing genes and HCC indicates that machine learning could be a powerful tool for HCC diagnosis. The exceptional results produced by the neural network suggest that neural networks could be the future of HCC diagnosis.