Recreating scientific paper "TDA-Net: Fusion of Persistent Homology and Deep Learning Features for COVID-19 Detection From Chest X-Ray Images".

Ilnicki010/tda-net-covid-classification

This repository is my attempt to recreate the results of the paper "TDA-Net: Fusion of Persistent Homology and Deep Learning Features for COVID-19 Detection From Chest X-Ray Images" by Mustafa Hajij, Ghada Zamzmi, and Fawwaz Batayneh, published on 3 Aug 2021.

Goal

The task is supervised binary classification of chest X-ray images. There are two classes:

  • "Covid" - patient is affected by COVID-19
  • "Normal" - patient is healthy

Model's input: a black-and-white chest X-ray image
Model's output: one of two classes, "Covid" or "Normal"

Dataset

I wasn't able to exactly recreate the dataset used in the paper. The authors built it from two publicly available databases:

  1. positive cases were taken from: https://github.com/ieee8023/covid-chestxray-dataset
  2. normal cases: https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia

I couldn't reproduce the original dataset for two reasons: the first source is dynamic and has changed over time, and the second is a large collection from which the authors drew a random sample.

Dataset used for recreation

I decided to go with this dataset from Kaggle: https://www.kaggle.com/datasets/fusicfenta/chest-xray-for-covid19-detection

It's based on the same two data sources as the original work. It is balanced and contains 288 training images and 60 validation images.

TDA

TDA (Topological Data Analysis) is a way of analyzing (usually high-dimensional) data using its topological features.

[GIF] Source: https://towardsdatascience.com/persistent-homology-with-examples-1974d4b9c3d0
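To make the idea of a topological feature more concrete, here is a minimal sketch of 0-dimensional persistent homology (tracking when connected components merge) on a small point cloud. It is an illustration only and not the pipeline used in the paper or in this repository: the function name `h0_persistence` and the toy points are hypothetical, and the paper works with images rather than point clouds.

```python
import numpy as np


def h0_persistence(points):
    """Return (birth, death) pairs for connected components.

    Hypothetical helper for illustration: every point is born at scale 0,
    and a component dies at the edge length that merges it into another one
    (Vietoris-Rips filtration, computed with union-find).
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)

    # Pairwise Euclidean distances between all points.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Process edges from shortest to longest.
    edges = sorted((dist[i, j], i, j) for i in range(n) for j in range(i + 1, n))

    parent = list(range(n))

    def find(x):
        # Find the component representative, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge at scale d, so one of them dies here.
            parent[ri] = rj
            pairs.append((0.0, float(d)))

    # The last surviving component never dies.
    pairs.append((0.0, float("inf")))
    return pairs


# Toy example: two close points and one far away.
pairs = h0_persistence([[0, 0], [1, 0], [5, 0]])
```

Long-lived pairs (large death values) correspond to clusters that stay separate across many scales, which is the kind of information persistent homology summarizes.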

Proposed networks

The authors proposed three TDA-based neural network architectures and one baseline CNN.

Base CNN

original:
Screenshot 2023-05-04 at 21 58 30

implemented:
baseline_model

$TDA-Net_{1}$

original:
Screenshot 2023-05-04 at 20 46 07

implemented:
first_tda_net

$TDA-Net_{1,2}$

original:
Screenshot 2023-05-05 at 23 26 13

implemented:
second_tda_net_model

$TDA-Net_{1,2,3}$

original:
Screenshot 2023-05-11 at 17 22 22

implemented:
third_tda_net_model

Results and conclusions

The end results in the original paper look like this:

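| Metric | Base model | $TDA-Net_{1}$ | $TDA-Net_{1,2}$ | $TDA-Net_{1,2,3}$ |
| --- | --- | --- | --- | --- |
| Accuracy | 0.87 | 0.89 | 0.92 | 0.93 |
| Precision | 0.84 | 0.84 | 0.95 | 0.88 |
| Recall | 0.87 | 0.87 | 0.85 | 0.95 |
| F1 score | 0.86 | 0.86 | 0.90 | 0.92 |
| TNR | 0.89 | 0.88 | 0.97 | 0.91 |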

However, in my implementation I got the following results:

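| Metric | Base model | $TDA-Net_{1}$ | $TDA-Net_{1,2}$ | $TDA-Net_{1,2,3}$ |
| --- | --- | --- | --- | --- |
| Accuracy | 0.97 | 0.85 | 0.90 | 0.97 |
| Precision | 0.97 | 0.82 | 0.88 | 1.0 |
| Recall | 0.97 | 0.90 | 0.93 | 0.93 |
| F1 score | 0.97 | 0.86 | 0.90 | 0.97 |
| TNR | 0.97 | 0.8 | 0.87 | 1.0 |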
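For reference, the five metrics reported above can all be derived from a binary confusion matrix. The sketch below assumes "Covid" is the positive class and that every denominator is non-zero. The function name `binary_metrics` and the example counts are hypothetical: they are chosen to be consistent with a balanced 60-image validation set (30 per class) and are not taken from the notebook's actual output.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fn: positive ("Covid") images classified correctly/incorrectly.
    tn/fp: negative ("Normal") images classified correctly/incorrectly.
    Assumes all denominators are non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity or true positive rate
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "tnr": tn / (tn + fp),  # true negative rate, also called specificity
    }


# Hypothetical counts: 30 "Covid" and 30 "Normal" validation images,
# with two missed positives and no false alarms.
metrics = binary_metrics(tp=28, fp=0, tn=30, fn=2)
rounded = {name: round(value, 2) for name, value in metrics.items()}
```

With these illustrative counts the rounded values are accuracy 0.97, precision 1.0, recall 0.93, F1 0.97, and TNR 1.0. With only 60 validation images, a single misclassified image shifts accuracy by about 0.017, so small differences between models should be read with caution.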

Setting up

  1. git clone https://github.com/Ilnicki010/tda-net-covid-classification.git
  2. cd tda-net-covid-classification
  3. Create a data folder containing the dataset from: https://www.kaggle.com/datasets/fusicfenta/chest-xray-for-covid19-detection
  4. pip install -r requirements.txt
  5. Open main.ipynb, then run and analyze all cells
