KvasirSegmentation

This is an implementation of the U-Net model for polyp segmentation on the Kvasir-SEG dataset of gastrointestinal endoscopy images.

Execution

The requirements are as follows:

  matplotlib==3.8.2
  numpy==1.26.4
  scikit-image==0.22.0
  scikit-learn==1.3.2
  tensorflow==2.15.0
  tqdm==4.64.1
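Assuming a standard Python setup (the virtual-environment name is arbitrary), the pinned dependencies above can be installed with:

```shell
# Create an isolated environment and install the pinned dependencies
python -m venv .venv
source .venv/bin/activate
pip install matplotlib==3.8.2 numpy==1.26.4 scikit-image==0.22.0 \
            scikit-learn==1.3.2 tensorflow==2.15.0 tqdm==4.64.1
```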

Description

Dataset

This model is trained on the Kvasir-SEG dataset, freely available for educational use here.

Model Design

This model uses the U-Net architecture introduced in this wonderful paper by Ronneberger et al. My implementation consists of 3 differently sized models to test the effect of model size on segmentation quality and overall accuracy:

  • The first implementation of the U-Net model is an encoder-decoder with these layers:

    • Encoder:
      • 2 × ((3, 3), 16) Convolution layer + (2, 2) Maxpooling layer
      • 2 × ((3, 3), 32) Convolution layer + (2, 2) Maxpooling layer
      • 2 × ((3, 3), 64) Convolution layer + (2, 2) Maxpooling layer
      • 2 × ((3, 3), 128) Convolution layer + (2, 2) Maxpooling layer
    • Bottleneck:
      • 2 × ((3, 3), 256) Convolution layer
    • Decoder:
      • ((3, 3), 128) Convolution Transpose layer + 2 × ((3, 3), 128) Convolution layer
      • ((3, 3), 64) Convolution Transpose layer + 2 × ((3, 3), 64) Convolution layer
      • ((3, 3), 32) Convolution Transpose layer + 2 × ((3, 3), 32) Convolution layer
      • ((3, 3), 16) Convolution Transpose layer + 2 × ((3, 3), 16) Convolution layer
  • The second implementation is a lopsided encoder-decoder with these layers:

    • Encoder:
      • 2 × ((3, 3), 32) Convolution layer + (2, 2) Maxpooling layer
      • 2 × ((3, 3), 64) Convolution layer + (2, 2) Maxpooling layer
      • 2 × ((3, 3), 128) Convolution layer + (2, 2) Maxpooling layer
      • 2 × ((3, 3), 256) Convolution layer + (2, 2) Maxpooling layer
    • Bottleneck:
      • 2 × ((3, 3), 512) Convolution layer
    • Decoder:
      • ((3, 3), 128) Convolution Transpose layer + 2 × ((3, 3), 256) Convolution layer
      • ((3, 3), 64) Convolution Transpose layer + 2 × ((3, 3), 128) Convolution layer
      • ((3, 3), 32) Convolution Transpose layer + 2 × ((3, 3), 64) Convolution layer
      • ((3, 3), 16) Convolution Transpose layer + 2 × ((3, 3), 32) Convolution layer
  • The third and final implementation is an encoder-decoder with these layers:

    • Encoder:
      • 2 × ((3, 3), 16) Convolution layer + (2, 2) Maxpooling layer
      • 2 × ((3, 3), 32) Convolution layer + (2, 2) Maxpooling layer
      • 2 × ((3, 3), 64) Convolution layer + (2, 2) Maxpooling layer
      • 2 × ((3, 3), 128) Convolution layer + (2, 2) Maxpooling layer
      • 2 × ((3, 3), 256) Convolution layer + (2, 2) Maxpooling layer
    • Bottleneck:
      • 2 × ((3, 3), 512) Convolution layer
    • Decoder:
      • ((3, 3), 256) Convolution Transpose layer + 2 × ((3, 3), 256) Convolution layer
      • ((3, 3), 128) Convolution Transpose layer + 2 × ((3, 3), 128) Convolution layer
      • ((3, 3), 64) Convolution Transpose layer + 2 × ((3, 3), 64) Convolution layer
      • ((3, 3), 32) Convolution Transpose layer + 2 × ((3, 3), 32) Convolution layer
      • ((3, 3), 16) Convolution Transpose layer + 2 × ((3, 3), 16) Convolution layer
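The first model above can be sketched with the Keras functional API. This is a minimal sketch, not the repository's exact code; the 256 × 256 × 3 input size, ReLU activations, skip connections via concatenation, and the sigmoid output head are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # Two (3, 3) convolutions with 'same' padding, as in each encoder/decoder stage
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(input_shape=(256, 256, 3)):
    inputs = tf.keras.Input(shape=input_shape)
    x, skips = inputs, []
    # Encoder: 16 -> 32 -> 64 -> 128 filters; each maxpool halves the resolution
    for filters in (16, 32, 64, 128):
        x = conv_block(x, filters)
        skips.append(x)                      # saved for the skip connection
        x = layers.MaxPooling2D(2)(x)
    x = conv_block(x, 256)                   # bottleneck
    # Decoder: strided transpose convolutions double the resolution back up
    for filters in (128, 64, 32, 16):
        x = layers.Conv2DTranspose(filters, 3, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skips.pop()])
        x = conv_block(x, filters)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)  # binary polyp mask
    return tf.keras.Model(inputs, outputs)
```

The other two models follow the same pattern with different filter counts per stage.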

Results

Using these models, we achieve the following accuracies on the test data:

|          | Model 1  | Model 2  | Model 3  |
|----------|----------|----------|----------|
| Accuracy | 92.5312% | 91.5110% | 92.8794% |

Calculating the DICE scores on the test data, we get:

|      | Model 1 | Model 2 | Model 3 |
|------|---------|---------|---------|
| DICE | 72.56   | 71.31   | 76.60   |
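For reference, the DICE score reported above can be computed from binary masks as follows. This is a sketch with NumPy, not the repository's evaluation code; the 0.5 threshold for soft predictions is an assumption:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """DICE = 2|A ∩ B| / (|A| + |B|) for prediction mask A and ground-truth mask B."""
    pred = (np.asarray(pred) > 0.5).astype(np.float64)    # binarize soft predictions
    target = (np.asarray(target) > 0.5).astype(np.float64)
    intersection = np.sum(pred * target)
    # eps guards against division by zero when both masks are empty
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```

For example, a prediction that overlaps the ground truth on one pixel while marking two pixels in total, against a ground truth of one pixel, scores 2·1 / (2 + 1) ≈ 0.67.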

Conclusion

As the accuracy values show, model size makes no distinguishable difference in pixel-wise accuracy: all three results are within the margin of error of each other. However, as the DICE metric clearly shows, the larger models locate the polyp more accurately.

Contact

If there are any problems or questions about the model, feel free to reach out at aaron@bateni.org.
