Computer vision covers tasks such as classification, semantic segmentation, object detection and instance segmentation. Instance segmentation identifies each individual instance of every known object class within an image. It labels the pixels that belong to each instance, so two objects of the same class receive separate masks.
The Mask R-CNN algorithm was introduced by He et al. in their 2017 paper, Mask R-CNN. It is an instance segmentation technique that predicts a pixel-level mask for every object in the image in addition to its bounding box. It has three main stages:
- Backbone network: a standard CNN such as ResNet50 or ResNet101 that generates the feature maps.
- Region proposal network (RPN): proposes candidate object bounding boxes. It runs a lightweight binary classifier over the feature maps to generate multiple Regions of Interest (RoIs).
- RoI Align: takes the proposed regions and warps their features into a fixed dimension. The warped features are fed into fully connected layers for classification (using softmax) and bounding box prediction (using regression). They are also fed into the mask classifier, a small convolutional network that outputs a binary mask for each RoI. The mask classifier generates a mask for every class without competition among classes.
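The phrase "without competition among classes" means that each class gets its own mask with an independent per-pixel sigmoid, and the mask kept for an RoI is the one belonging to the class chosen by the classification branch. The following is a minimal pure-Python sketch of that selection step on toy numbers; the function names, the 2x2 mask size and the 0.5 threshold are illustrative assumptions, not code from the Matterport repository.

```python
import math

def sigmoid(x):
    """Per-pixel sigmoid: each class mask is scored independently."""
    return 1.0 / (1.0 + math.exp(-x))

def select_mask(mask_logits, class_id, threshold=0.5):
    """Pick the mask of the predicted class and binarize it.

    mask_logits: one H x W grid of raw scores per class (hypothetical layout).
    class_id:    class chosen by the classification branch.
    """
    logits = mask_logits[class_id]
    return [[1 if sigmoid(v) >= threshold else 0 for v in row] for row in logits]

# Toy example: 2 classes, 2x2 mask logits for a single RoI.
mask_logits = [
    [[2.0, -1.0], [0.5, -3.0]],   # class 0
    [[-2.0, 4.0], [1.5, -0.5]],   # class 1
]

binary_mask = select_mask(mask_logits, class_id=1)
print(binary_mask)  # [[0, 1], [1, 0]]
```

Because no softmax is taken across the class axis, the scores of class 0 have no influence on the mask returned for class 1.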
The images are taken from the Leukocyte Images for Segmentation and Classification (LISC) database. The images have to be segmented into these 5 types of white blood cells (WBCs):
- Basophil
- Eosinophil
- Neutrophil
- Lymphocyte
- Monocyte
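For training, each of these names has to be mapped to an integer class ID. Matterport's implementation reserves ID 0 for the background, so a model for this dataset predicts 1 + 5 classes. The sketch below shows one possible mapping; the ordering of the IDs and the helper `class_name` are assumptions for illustration and may differ from what WBC.py actually does.

```python
# Class names as listed above; the ID order is an illustrative assumption.
WBC_CLASSES = ["Basophil", "Eosinophil", "Neutrophil", "Lymphocyte", "Monocyte"]

# ID 0 is reserved for the background, so the WBC types start at 1.
CLASS_IDS = {name: index for index, name in enumerate(WBC_CLASSES, start=1)}

# Number of classes the model predicts: background + 5 WBC types.
NUM_CLASSES = 1 + len(WBC_CLASSES)

def class_name(class_id):
    """Map a predicted integer ID back to a readable name (hypothetical helper)."""
    if class_id == 0:
        return "BG"
    return WBC_CLASSES[class_id - 1]

print(CLASS_IDS)
print(NUM_CLASSES)  # 6
```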
- Download/fork Matterport's Mask R-CNN.
- Download the training images and divide them into training and validation sets.
- In the root directory of Mask R-CNN, create a folder named WBC containing the images and their corresponding masks. Its structure should be as follows:
WBC
├──train(same for val)
│ ├──image
│ │ ├──Basophil
│ │ │ ├──Basophil_01.png
│ │ │ └── ...
│ │ ├──Eosinophil
│ │ │ ├──Eosinophil_01.png
│ │ │ └── ...
│ │ .
│ │ .
│ │ .
│ ├──mask
│ │ ├──Basophil
│ │ │ ├──Basophil_01.png
│ │ │ └── ...
│ │ ├──Eosinophil
│ │ │ ├──Eosinophil_01.png
│ │ │ └── ...
│ │ .
│ │ .
└── └── .
- Download the pre-trained COCO weights (mask_rcnn_coco.h5) and save them in the root directory of Mask R-CNN.
- Also save the WBC.py file from this repository into the Mask R-CNN folder.
- To start training, open a terminal in that folder and run:
python3 WBC.py train --dataset=WBC --weights=coco
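Before launching a long training run, it can help to check the dataset preparation steps described above. The sketch below is one way to do that, not part of WBC.py: `split_train_val` divides a list of file names into training and validation sets, and `pair_images_with_masks` walks one subset folder and matches every image to its mask. It assumes that a mask has the same file name as its image (as in the folder tree above) and that all files are PNGs; the 80/20 split ratio and the fixed seed are arbitrary choices.

```python
import random
import tempfile
from pathlib import Path

CLASSES = ["Basophil", "Eosinophil", "Neutrophil", "Lymphocyte", "Monocyte"]

def split_train_val(filenames, val_fraction=0.2, seed=0):
    """Shuffle reproducibly and return (train, val) lists of file names."""
    files = sorted(filenames)
    random.Random(seed).shuffle(files)
    n_val = max(1, round(len(files) * val_fraction))
    return files[n_val:], files[:n_val]

def pair_images_with_masks(subset_dir):
    """Return (pairs, missing) for a subset folder such as WBC/train.

    Each entry is (class_name, image_path, mask_path). An image lands in
    `missing` when no mask with the same file name exists.
    """
    subset_dir = Path(subset_dir)
    pairs, missing = [], []
    for cls in CLASSES:
        image_dir = subset_dir / "image" / cls
        if not image_dir.is_dir():
            continue  # class folder not present in this subset
        for image_path in sorted(image_dir.glob("*.png")):
            mask_path = subset_dir / "mask" / cls / image_path.name
            target = pairs if mask_path.exists() else missing
            target.append((cls, image_path, mask_path))
    return pairs, missing

# Demo 1: split ten hypothetical file names 80/20.
names = [f"Basophil_{i:02d}.png" for i in range(1, 11)]
train_names, val_names = split_train_val(names)

# Demo 2: build a tiny throwaway WBC/train folder and check the pairing.
with tempfile.TemporaryDirectory() as root:
    train_dir = Path(root) / "WBC" / "train"
    for kind in ("image", "mask"):
        (train_dir / kind / "Basophil").mkdir(parents=True)
    (train_dir / "image" / "Basophil" / "Basophil_01.png").touch()
    (train_dir / "mask" / "Basophil" / "Basophil_01.png").touch()
    (train_dir / "image" / "Basophil" / "Basophil_02.png").touch()  # no mask
    pairs, missing = pair_images_with_masks(train_dir)
    demo_paired = [(cls, img.name) for cls, img, _ in pairs]
    demo_missing = [(cls, img.name) for cls, img, _ in missing]

print(len(train_names), len(val_names))  # 8 2
print(demo_paired)
print(demo_missing)
```

Running `pair_images_with_masks("WBC/train")` and `pair_images_with_masks("WBC/val")` on the real folders and confirming that `missing` is empty catches misnamed masks before training starts.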
The model was trained for 75 epochs with 60 steps per epoch. The following are some predictions of the model on images in the validation set:
The LISC data has some images without any ground truth mask, which were used as the test set. Here are the model's predictions on the test set:
Now some examples in which the model failed to correctly predict the WBC and its type:
The model is able to locate the WBCs correctly but labels them incorrectly as Monocyte and Basophil. Both WBCs are Neutrophils.
In this case, the model is unable to detect the second Neutrophil in the image.
This was an interesting case. As depicted, both models are unable to detect all the WBCs in the image. Training the model for more epochs might have resulted in better predictions.
(Training the model for 75 epochs took 27 hours on my CPU)
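To get a feel for what longer training would cost, the reported figures (75 epochs, 60 steps per epoch, 27 hours) can be turned into a rough per-step time. The sketch below does that arithmetic; it assumes training time scales linearly with the number of steps on the same CPU, which is only an approximation, and the 100-epoch figure is a hypothetical example.

```python
EPOCHS = 75
STEPS_PER_EPOCH = 60
TRAINING_HOURS = 27  # reported CPU training time

total_steps = EPOCHS * STEPS_PER_EPOCH
seconds_per_step = TRAINING_HOURS * 3600 / total_steps

def estimate_hours(epochs, steps_per_epoch=STEPS_PER_EPOCH):
    """Rough training time in hours, assuming a constant time per step."""
    return epochs * steps_per_epoch * seconds_per_step / 3600

print(total_steps)                    # 4500
print(round(seconds_per_step, 1))     # 21.6
print(round(estimate_hours(100), 1))  # 36.0
```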
- https://github.com/matterport/Mask_RCNN
- https://engineering.matterport.com/splash-of-color-instance-segmentation-with-mask-r-cnn-and-tensorflow-7c761e238b46
- https://www.pyimagesearch.com/2019/06/10/keras-mask-r-cnn/
- https://towardsdatascience.com/computer-vision-instance-segmentation-with-mask-r-cnn-7983502fcad1
- https://towardsdatascience.com/instance-segmentation-using-mask-r-cnn-7f77bdd46abd