Semantic Segmentation for FloodNet Dataset


Overview

FloodNet is a semantic segmentation project designed to identify flooded and non-flooded regions in aerial imagery. It uses a DeepLabV3+-style architecture, implemented in TensorFlow and Keras, to classify each pixel of an image into predefined categories (e.g., Water, Tree, Vehicle).

The model combines:

  • An EfficientNetV2-S backbone pretrained on ImageNet for feature extraction.
  • Dilated Spatial Pyramid Pooling (ASPP) for capturing multiscale context.
  • A custom decoder to upsample features to the input image resolution.

The notebook implements the end-to-end pipeline for:

  • Preparing and preprocessing the data.
  • Training and validating the segmentation model.
  • Evaluating performance metrics (e.g., Dice Coefficient, IoU).
  • Performing inference on test images with visualized outputs.

Classes

The dataset includes 10 classes, each representing a different type of feature in flood-affected areas:

Class ID   Class Name
0          Background
1          Building Flooded
2          Building Non-Flooded
3          Road Flooded
4          Road Non-Flooded
5          Water
6          Tree
7          Vehicle
8          Pool
9          Grass

Code Breakdown

Input Specification

The model accepts images of a fixed shape (shape), defined when the model is built. The input layer:

from tensorflow.keras.layers import Input

model_input = Input(shape=shape)

ensures compatibility with the EfficientNetV2 backbone.

Backbone Network

EfficientNetV2-S is employed for extracting hierarchical feature maps:

import tensorflow as tf

backbone = tf.keras.applications.EfficientNetV2S(include_top=False, weights="imagenet", input_tensor=model_input)
backbone.trainable = True  # fine-tune the entire backbone during training

The backbone's key feature maps are accessed for multiscale processing:

  • block6b_expand_activation (smallest spatial resolution, richest features)
  • block4b_expand_activation
  • block3b_expand_activation
  • block2b_expand_activation (largest spatial resolution)
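A minimal sketch of how these intermediate outputs can be pulled out of the backbone (the layer names are from the list above; the variable names are illustrative):

input_a = backbone.get_layer("block6b_expand_activation").output  # deepest, richest features
input_b = backbone.get_layer("block4b_expand_activation").output
input_c = backbone.get_layer("block3b_expand_activation").output
input_d = backbone.get_layer("block2b_expand_activation").output  # shallowest, highest resolution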

Dilated Spatial Pyramid Pooling (ASPP)

For multiscale context, the ASPP block processes outputs from deeper layers:

input_a = DilatedSpatialPyramidPooling(input_a, num_filters=256)

Each ASPP block includes parallel dilated convolutions with varying dilation rates.
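The repository's exact implementation is not reproduced here, but a typical DeepLabV3+-style ASPP block looks roughly like the following sketch (the dilation rates 1/6/12/18 and the image-level pooling branch are standard choices, assumed rather than taken from this repo):

from tensorflow.keras import layers

def DilatedSpatialPyramidPooling(x, num_filters=256):
    h, w = x.shape[1], x.shape[2]

    # Image-level pooling branch: global context, broadcast back to h x w.
    pool = layers.AveragePooling2D(pool_size=(h, w))(x)
    pool = layers.Conv2D(num_filters, 1, padding="same", activation="relu")(pool)
    pool = layers.UpSampling2D(size=(h, w), interpolation="bilinear")(pool)

    # Parallel convolutions with increasing dilation rates.
    branches = [pool]
    for rate in (1, 6, 12, 18):
        kernel = 1 if rate == 1 else 3
        branches.append(layers.Conv2D(num_filters, kernel, dilation_rate=rate,
                                      padding="same", activation="relu")(x))

    # Fuse all branches with a 1x1 convolution.
    merged = layers.Concatenate()(branches)
    return layers.Conv2D(num_filters, 1, padding="same", activation="relu")(merged)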

Upsampling and Decoder

Outputs from the ASPP and other backbone layers are upsampled to match the desired resolution:

input_a = UpSampling2D(size=(16, 16), interpolation="bilinear")(input_a)
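From here, a DeepLabV3+-style decoder typically concatenates the upsampled ASPP output with one of the shallower, higher-resolution backbone features and refines the result before the final upsampling. A hedged sketch (the fusion details and the upsampling factor are assumptions, not the repository's exact code):

from tensorflow.keras import layers

# Fuse deep context (input_a) with high-resolution detail (input_d from the sketch above),
# assuming both tensors have been brought to the same spatial size.
x = layers.Concatenate()([input_a, input_d])
x = layers.Conv2D(256, 3, padding="same", activation="relu")(x)
x = layers.Conv2D(256, 3, padding="same", activation="relu")(x)
x = layers.UpSampling2D(size=(4, 4), interpolation="bilinear")(x)  # illustrative factor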

Final Layer

The last layer applies a 1x1 convolution and softmax activation:

outputs = Conv2D(num_classes, kernel_size=(1, 1), padding="valid", activation="softmax")(x)

This generates a segmentation map with probabilities across all classes.
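The graph is then wrapped into a Keras Model in the usual way (a minimal sketch; x above is the decoder output feeding the final convolution):

from tensorflow.keras import Model

model = Model(inputs=model_input, outputs=outputs)
model.summary()  # final output shape: (batch, H, W, num_classes)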

Training Hyperparameters

  • Epochs: 40
  • Batch Size: 2
  • Optimizer: AdamW with an initial learning rate of 1e-3 and a weight decay of 1e-5
  • Learning Rate Scheduler: ReduceLROnPlateau
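A minimal sketch of wiring this configuration up in Keras (the loss, the scheduler arguments, and the train_ds/val_ds dataset objects are assumptions; only the hyperparameters listed above come from the repository):

import tensorflow as tf

optimizer = tf.keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-5)
model.compile(optimizer=optimizer,
              loss="sparse_categorical_crossentropy",  # assumed; Dice-based losses are also common
              metrics=["accuracy"])

reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss",
                                                 factor=0.5, patience=3)  # factor/patience assumed

# batch_size=2 would be applied when building the tf.data pipelines.
history = model.fit(train_ds, validation_data=val_ds, epochs=40, callbacks=[reduce_lr])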

Metrics include:

  • Dice Coefficient: Measures overlap between predicted and true masks.
  • Intersection over Union (IoU): Computes the ratio of pixel-wise intersection to union for each class.
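For reference, a common Dice coefficient implementation for softmax outputs against one-hot masks looks like this (a generic sketch, not necessarily the repository's exact metric):

import tensorflow as tf

def dice_coefficient(y_true, y_pred, smooth=1e-6):
    # Dice = 2*|A ∩ B| / (|A| + |B|), computed per image and averaged over the batch.
    y_true = tf.cast(y_true, tf.float32)
    intersection = tf.reduce_sum(y_true * y_pred, axis=(1, 2, 3))
    totals = tf.reduce_sum(y_true, axis=(1, 2, 3)) + tf.reduce_sum(y_pred, axis=(1, 2, 3))
    return tf.reduce_mean((2.0 * intersection + smooth) / (totals + smooth))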

Loss and Metric Plots

[Figure: training and validation loss and metric curves]

Inference and Visualization
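Inference reduces the per-pixel class probabilities to a label map with an argmax; a minimal sketch (image loading and color-coded visualization omitted; variable names are illustrative):

import numpy as np

probs = model.predict(image[np.newaxis, ...])  # (1, H, W, num_classes)
pred_mask = np.argmax(probs[0], axis=-1)       # (H, W) array of class IDs 0-9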

[Figure: example predictions on test images]

Submission

[Figure: submission results]
