Breast cancer detection in mammography images using deep learning models
- NVIDIA CUDA drivers
  - Install a PyTorch-compatible version of CUDA from either:
    - Your Linux repository:
      apt install nvidia-cuda-toolkit
    - The NVIDIA website (Windows and Linux)
- PyTorch with CUDA support
  - Visit the PyTorch website for more information
These two must be installed manually; otherwise, installation of the other requirements will fail later on.
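To confirm that PyTorch was installed with working CUDA support before continuing, a quick check from a Python shell (a minimal sketch):
import torch
# A CUDA-enabled build reports its CUDA version and detects the GPU
print(torch.__version__, torch.version.cuda)
print(torch.cuda.is_available())  # should print True on a correctly configured system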
Supported datasets:
- INbreast
- CBIS-DDSM (Curated Breast Imaging Subset of DDSM)
- MIAS (Mammography Image Analysis Society)
Supported models:
- Generally supported models
  - Faster R-CNN (Detectron)
  - YOLO
  - Any model that supports YOLO / COCO style datasets
- Customized UaNet for 2D mammography images
- Use the download_datasets_colab.ipynb Jupyter notebook in Google Colab to download all datasets.
- You will need to upload your 'kaggle.json' when the notebook shows an upload dialog.
- After logging in to Kaggle, you can get your kaggle.json from the API section of https://www.kaggle.com/settings.
- The notebook will clone this repository and download all datasets.
Dataset links:
- https://www.kaggle.com/datasets/ramanathansp20/inbreast-dataset
- https://www.kaggle.com/datasets/awsaf49/cbis-ddsm-breast-cancer-image-dataset
- https://www.kaggle.com/datasets/kmader/mias-mammography
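If you prefer to download the Kaggle-hosted datasets outside of Colab, a minimal sketch using the Kaggle API Python package (assumes your kaggle.json is already in ~/.kaggle/; you still need to arrange the extracted folders as shown below):
import kaggle

# Download and extract the three datasets listed above into datasets/
kaggle.api.authenticate()  # reads ~/.kaggle/kaggle.json
for slug in ("ramanathansp20/inbreast-dataset",
             "awsaf49/cbis-ddsm-breast-cancer-image-dataset",
             "kmader/mias-mammography"):
    kaggle.api.dataset_download_files(slug, path="datasets/", unzip=True)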
Download the above datasets and, after cloning this repository, create the following directory structure:
- breast_cancer_detection/
  - datasets/
    - all-mias/
      - mdb001.pgm
      - ...
    - CBIS-DDSM/
      - csv/
      - jpeg/
    - INbreast Release 1.0/
      - AllDICOMs/
      - ...
Copy the datasets into these directories accordingly.
After converting the datasets to COCO / YOLO style in the next section (Usage), you may visualize the standardized dataset using the following commands.
python visualizer.py -m coco -d train/images -l train.json
python visualizer.py -m yolo -d train/images -l train/labels
1. Clone this repository
git clone https://github.com/monajemi-arman/breast_cancer_detection
2. Install prerequisites
cd breast_cancer_detection
pip install -r requirements.txt
3. Download the following datasets
https://www.kaggle.com/datasets/ramanathansp20/inbreast-dataset
https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=22516629
https://www.kaggle.com/datasets/kmader/mias-mammography
4. Move dataset files
First create 'datasets' directory:
mkdir datasets/
Then, extract and move the files to this directory so as to have the following inside datasets/:
- INbreast Release 1.0/
- CBIS-DDSM/
- all-mias/
5. Convert datasets to YOLO (and COCO) format
python convert_dataset.py
After completion, images/, labels/, dataset.yaml, and annotations.json will be present in the working directory.
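To sanity-check the generated COCO annotations, you can load annotations.json with pycocotools (a minimal sketch, assuming pycocotools is installed):
from pycocotools.coco import COCO

# Load the annotations produced by convert_dataset.py and print basic statistics
coco = COCO("annotations.json")
print(len(coco.getImgIds()), "images,", len(coco.getAnnIds()), "annotations")
print("categories:", [c["name"] for c in coco.loadCats(coco.getCatIds())])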
6. (optional) Apply additional filters to images
If necessary, you may apply these filters to the images using our script: canny, clahe, gamma, histogram, unsharp
Pass one of the above filter names on the command line (-f).
python filters.py -i PATH_TO_IMAGE_DIRECTORY -o OUTPUT_IMAGE_DIRECTORY -f FILTER_NAME
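For reference, the clahe filter is contrast-limited adaptive histogram equalization; a standalone OpenCV sketch of that operation (this is only an illustration, not the project's filters.py, and the file name is just an example):
import cv2

# Apply CLAHE to a single grayscale mammogram
img = cv2.imread("datasets/all-mias/mdb001.pgm", cv2.IMREAD_GRAYSCALE)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
cv2.imwrite("mdb001_clahe.png", clahe.apply(img))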
- Install Ultralytics
pip install ultralytics
- Train your desired YOLO model
yolo train data=dataset.yaml model=yolov8n
Example of prediction using the Ultralytics YOLO framework:
yolo predict model=runs/detect/train/weights/best.pt source=images/cb_1.jpg conf=0.1
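The same prediction can also be run from Python through the Ultralytics API (a minimal sketch; save() may vary slightly between Ultralytics versions):
from ultralytics import YOLO

# Load the trained weights and predict on one image, as in the CLI example above
model = YOLO("runs/detect/train/weights/best.pt")
results = model.predict("images/cb_1.jpg", conf=0.1)
results[0].save(filename="prediction.jpg")  # write the annotated image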
detectron.py trains and evaluates a Faster R-CNN model and runs prediction using the Detectron2 platform.
python detectron.py -c train
- Visualize model prediction
- Show ground truth and labels
- Filter predictions by confidence score
# After training is complete
python detectron.py -c predict -w output/model_final.pth -i <image path>
# -w: path to model weights
- Run the train step as explained above
- Copy 'detectron.cfg.pkl' and the last model checkpoint to the webapp/ directory.
  * The last model checkpoint file name is written in output/last_checkpoint
- Run the following:
cd webapp/
python web.py
- Then visit http://127.0.0.1:33517
- (optional) Use API
If you wish, an API is also available. For example:
# Run server
cd webapp/
python web.py
# Get predictions
curl -X POST \
-F "file=@input.jpg" \
http://localhost:33517/api/v1/predict \
| jq -r '.data.inferred_image' | base64 --decode > prediction.jpg
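The same request from Python, mirroring the curl example above (a minimal sketch using the requests package):
import base64
import requests

# Send an image to the prediction endpoint and decode the returned base64 image
with open("input.jpg", "rb") as f:
    resp = requests.post("http://localhost:33517/api/v1/predict", files={"file": f})
with open("prediction.jpg", "wb") as out:
    out.write(base64.b64decode(resp.json()["data"]["inferred_image"]))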
- Calculate mAP
- Uses test dataset by default
python detectron.py -c evaluate -w output/model_final.pth
- Suitable for later offline metrics calculation
- All predictions on the test dataset will be written to predictions.json
- Follows COCO format
python detectron.py -c evaluate_test_to_coco -w output/model_final.pth
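The resulting predictions.json can later be scored offline with pycocotools, for example (a sketch; it assumes annotations.json contains the ground truth for the test images):
import json
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

gt = COCO("annotations.json")        # ground truth written by convert_dataset.py
dt = gt.loadRes("predictions.json")  # predictions written by evaluate_test_to_coco
ev = COCOeval(gt, dt, iouType="bbox")
# Restrict scoring to the images that appear in the predictions file (the test split)
with open("predictions.json") as f:
    ev.params.imgIds = sorted({p["image_id"] for p in json.load(f)})
ev.evaluate()
ev.accumulate()
ev.summarize()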
- Clone UaNet repository (patched)
# Make sure you cd to breast_cancer_detection first
# cd breast_cancer_detection
git clone https://github.com/monajemi-arman/UaNet_2D
- Prepare dataset
# Convert datasets to images/ masks/
python convert_dataset.py -m mask
# Convert to 3D NRRD files
python to_3d_nrrd.py
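To verify the generated volumes, you can read one back with the pynrrd package (a minimal sketch; the output directory name is taken from the move step below):
import glob
import nrrd

# Read the first generated NRRD volume and print its shape
path = glob.glob("UaNet-dataset/**/*.nrrd", recursive=True)[0]
data, header = nrrd.read(path)
print(path, data.shape)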
- Move dataset to model directory
# While in breast_cancer_detection directory
mv UaNet-dataset/* UaNet_2D/data/preprocessed/
# Replace UaNet's default split configs with ours
mv split/* UaNet_2D/src/split/
- Start training
cd UaNet_2D/src
python train.py