Breast_Cancer_Detection

Breast cancer detection using mammography images, utilizing deep learning models

Prerequisites

  • Nvidia CUDA drivers
    • Install a PyTorch-compatible version of CUDA, e.g. from your Linux distribution's repository:
      apt install nvidia-cuda-toolkit
      
  • PyTorch with CUDA support

These two must be installed manually first; otherwise, installation of the remaining requirements will fail.

Datasets

Supported datasets:

  • InBreast
  • CBIS-DDSM (Curated Breast Imaging Subset of DDSM)
  • MIAS (Mammography Image Analysis Society)

Supported models:

  • Generally supported models
    • Faster R-CNN (Detectron)
    • YOLO
    • Any model that supports YOLO / COCO style dataset
  • Customized UaNet for 2D mammography images

Download

Google Colab

  • Use the download_datasets_colab.ipynb Jupyter notebook in Google Colab to download all datasets.
  • Upload your 'kaggle.json' when the notebook shows the upload dialog.
  • After logging in to Kaggle, you can generate this file in the API section of https://www.kaggle.com/settings.
  • The notebook will clone this repository and download all datasets.
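Outside Colab, the Kaggle command-line tools look for kaggle.json under ~/.kaggle with owner-only permissions. A minimal sketch of placing the token (the helper name and the placeholder token are ours, not part of this repository):

```python
import json, stat, tempfile
from pathlib import Path

def install_kaggle_token(token: dict, kaggle_dir: Path) -> Path:
    """Write kaggle.json where the Kaggle CLI looks for it (~/.kaggle by default)."""
    kaggle_dir.mkdir(parents=True, exist_ok=True)
    path = kaggle_dir / "kaggle.json"
    path.write_text(json.dumps(token))
    # The Kaggle CLI warns on world-readable tokens; restrict to owner read/write.
    path.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return path

# Demo with a placeholder token and a temporary directory; for real use, pass
# Path.home() / ".kaggle" and the token downloaded from kaggle.com/settings.
demo_dir = Path(tempfile.mkdtemp()) / ".kaggle"
p = install_kaggle_token({"username": "me", "key": "placeholder"}, demo_dir)
print(p.name)  # kaggle.json
```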

Manual

Dataset links are listed under Usage below.

Download the datasets, clone this repository, and create the following directory structure:

  • breast_cancer_detection/
    • datasets/
      • all-mias/
        • mdb001.pgm
        • ...
      • CBIS-DDSM/
        • csv/
        • jpeg/
      • INbreast Release 1.0/
        • AllDICOMs/
        • ...

Copy datasets to directories accordingly.

Visualizer

After converting the datasets to COCO / YOLO style in the next section (Usage), you may visualize the standardized dataset using the following methods.

COCO Style dataset

python visualizer.py -m coco -d train/images -l train.json 

YOLO Style dataset

python visualizer.py -m yolo -d train/images -l train/labels 
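The two label formats the visualizer reads encode boxes differently: COCO stores absolute [x, y, width, height] in pixels, while each line of a YOLO label file holds a class id followed by center coordinates and size normalized to [0, 1]. A minimal conversion sketch (the function is illustrative, not part of this repository):

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO [x, y, w, h] box (absolute pixels, top-left origin)
    to a YOLO (cx, cy, w, h) tuple normalized by the image size."""
    x, y, w, h = bbox
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)

# A 100x50 box with its top-left corner at (50, 100) in a 400x400 image:
print(coco_to_yolo([50, 100, 100, 50], 400, 400))  # (0.25, 0.3125, 0.25, 0.125)
```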

Usage

1. Clone this repository

git clone https://github.com/monajemi-arman/breast_cancer_detection

2. Install prerequisites

cd breast_cancer_detection
pip install -r requirements.txt

3. Download the following datasets
https://www.kaggle.com/datasets/ramanathansp20/inbreast-dataset
https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=22516629
https://www.kaggle.com/datasets/kmader/mias-mammography

4. Move dataset files
First, create the 'datasets' directory:

mkdir datasets/

Then extract and move the downloaded files so that datasets/ contains:

  • INbreast Release 1.0/
  • CBIS-DDSM/
  • all-mias/

5. Convert datasets to YOLO (and COCO) format

python convert_dataset.py

After completion, images/, labels/, dataset.yaml, and annotations.json will be present in the working directory.

6. (optional) Apply additional filters to images
If necessary, you may apply one of the following filters to the images using our script: canny, clahe, gamma, histogram, unsharp.
Pass the filter name on the command line with -f:

python filters.py -i PATH_TO_IMAGE_DIRECTORY -o OUTPUT_IMAGE_DIRECTORY -f FILTER_NAME
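As an illustration of one of these filters, gamma correction maps each 8-bit pixel value v to 255·(v/255)^γ; γ < 1 brightens dark mammogram regions. A pure-Python sketch (filters.py itself may implement this differently, e.g. with OpenCV):

```python
def gamma_correct(pixels, gamma=0.5):
    """Apply gamma correction to a sequence of 8-bit grayscale pixel values."""
    # Precompute a 256-entry lookup table, then map every pixel through it.
    lut = [round(255 * (v / 255) ** gamma) for v in range(256)]
    return [lut[v] for v in pixels]

print(gamma_correct([0, 64, 255], gamma=0.5))  # [0, 128, 255]
```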

YOLO

Training

  • Install Ultralytics
pip install ultralytics
  • Train your desired YOLO model
yolo train data=dataset.yaml model=yolov8n

Prediction

Example of prediction using YOLO ultralytics framework:

yolo predict model=runs/detect/train/weights/best.pt source=images/cb_1.jpg conf=0.1 

Detectron (Faster R-CNN)

Train

detectron.py trains, evaluates, and runs prediction with a Faster R-CNN model on the detectron2 platform.

python detectron.py -c train

Predict

  • Visualize model prediction
  • Show ground truth and labels
  • Filter predictions by confidence score
# After training is complete
python detectron.py -c predict -w output/model_final.pth -i <image path>
# -w: path to model weights

[Screenshot: detectron prediction visualizer]

Web Application

Usage

  1. Run the train step as explained above
  2. Copy 'detectron.cfg.pkl' and the last model checkpoint to the webapp/ directory.
    * The last checkpoint's file name is written in output/last_checkpoint
  3. Run the following:
cd webapp/
python web.py
  4. Then visit http://127.0.0.1:33517

  5. (optional) Use the API
    If you wish, an API is also available, for example:
# Run server
cd webapp/
python web.py

# Get predictions
curl -X POST \
  -F "file=@input.jpg" \
  http://localhost:33517/api/v1/predict \
  | jq -r '.data.inferred_image' | base64 --decode > prediction.jpg
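The same request can be made from Python with only the standard library; the endpoint URL and the data.inferred_image response field are taken from the curl example above, and the helper names are ours:

```python
import base64, json, urllib.request, uuid

def decode_prediction(payload):
    """The API returns the annotated image base64-encoded under data.inferred_image."""
    return base64.b64decode(payload["data"]["inferred_image"])

def predict(image_path, url="http://localhost:33517/api/v1/predict"):
    """POST an image as multipart/form-data and return the decoded prediction image."""
    boundary = uuid.uuid4().hex
    with open(image_path, "rb") as f:
        data = f.read()
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="file"; filename="input.jpg"\r\n'
        "Content-Type: image/jpeg\r\n\r\n"
    ).encode() + data + f"\r\n--{boundary}--\r\n".encode()
    req = urllib.request.Request(
        url, body, {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return decode_prediction(payload)

# predict("input.jpg") writes nothing itself; save its return value, e.g.:
# open("prediction.jpg", "wb").write(predict("input.jpg"))
```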

Evaluate

Evaluation using COCOEvaluator

  • Calculate mAP
  • Uses test dataset by default
python detectron.py -c evaluate -w output/model_final.pth

Save predictions in COCO style JSON (optional)

  • Suitable for later offline metrics calculation
  • All predictions of the test dataset will be written to predictions.json
  • Follows COCO format
python detectron.py -c evaluate_test_to_coco -w output/model_final.pth
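For offline metrics on the saved COCO-style predictions, the core quantity is the intersection-over-union between a predicted and a ground-truth box, both in COCO [x, y, w, h] form. A minimal sketch (pycocotools computes this, and full mAP, for you):

```python
def iou(a, b):
    """IoU of two COCO-style [x, y, w, h] boxes."""
    # Convert to corner coordinates (x1, y1, x2, y2).
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))  # intersection width
    ih = max(0, min(ay2, by2) - max(ay1, by1))  # intersection height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

print(iou([0, 0, 10, 10], [5, 5, 10, 10]))  # 25 / 175 ≈ 0.1429
```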

UaNet (Deprecated)

Training

  • Clone UaNet repository (patched)
# Make sure you cd to breast_cancer_detection first
# cd breast_cancer_detection
git clone https://github.com/monajemi-arman/UaNet_2D
  • Prepare dataset
# Convert datasets to images/ masks/
python convert_dataset.py -m mask
# Convert to 3D NRRD files
python to_3d_nrrd.py
  • Move dataset to model directory
# While in breast_cancer_detection directory
mv UaNet-dataset/* UaNet_2D/data/preprocessed/
# Replace UaNet's default split configs with ours
mv split/* UaNet_2D/src/split/
  • Start training
cd UaNet_2D/src
python train.py