
kristjansoln/unet-segmentation


Semantic Segmentation With U-Net

This is an implementation of the U-Net architecture for semantic segmentation of water bodies. Semantic segmentation is a computer vision task that assigns every pixel of an image to a class. For more information, refer to report.pdf.

Dataset preparation and required structure

The expected directory structure is shown below. The images directory contains all images of your dataset in jpg format, and the masks directory contains the corresponding binary masks in png format. An example dataset is provided in dataset_example.

.
└── dataset
    ├── images
    │   ├── 007100.jpg
    │   ├── 007110.jpg
    │   ├── 007120.jpg
    │   └── ...
    ├── masks
    │   ├── 007100.png
    │   ├── 007110.png
    │   ├── 007120.png
    │   └── ...
    ├── test.csv
    ├── train.csv
    └── val.csv

Masks are expected to be in the format generated by the WaSR algorithm (example). If you intend to use a different type of mask, modify the DatasetFolder.get_item method in the dataset loader to load it properly.

If you intend to use a different image file format, modify the DatasetFolder.make_dataset method in the dataset loader so that it generates the filenames correctly.
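Before training, it can help to confirm that every image has a mask with the same base name, as the structure above implies. The following is a minimal sketch and not part of this repository: the function name find_unpaired is hypothetical, and it assumes the default jpg/png extensions and the images/masks directory names shown above.

```python
from pathlib import Path


def find_unpaired(dataset_root):
    """Return (image stems without a mask, mask stems without an image)."""
    root = Path(dataset_root)
    # Compare base names only, e.g. "007100" for 007100.jpg and 007100.png.
    image_stems = {p.stem for p in (root / "images").glob("*.jpg")}
    mask_stems = {p.stem for p in (root / "masks").glob("*.png")}
    return sorted(image_stems - mask_stems), sorted(mask_stems - image_stems)
```

Calling find_unpaired("./dataset") returns two sorted lists; both are empty when the images and masks match one to one.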

The train-val-test split is defined in the corresponding csv files. An example of such a file is shown below. Each entry is the name of a file in the images directory, hence the jpg extension.

007550.jpg
007560.jpg
007570.jpg
007580.jpg
007590.jpg
007600.jpg
...
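If your dataset does not come with split files, they can be generated with a short script. This is a sketch under assumptions: the function names, the 10 % validation and test fractions, and the fixed seed are illustrative choices and not taken from this repository. It writes one filename per line with no header, matching the example above.

```python
import random
from pathlib import Path


def split_filenames(filenames, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle filenames reproducibly and split them into train/val/test lists."""
    names = sorted(filenames)
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_fraction)
    n_test = int(len(names) * test_fraction)
    return {
        "val": names[:n_val],
        "test": names[n_val:n_val + n_test],
        "train": names[n_val + n_test:],
    }


def write_splits(dataset_root, **kwargs):
    """Write train.csv, val.csv and test.csv next to the images directory."""
    root = Path(dataset_root)
    filenames = [p.name for p in (root / "images").glob("*.jpg")]
    splits = split_filenames(filenames, **kwargs)
    for name, entries in splits.items():
        (root / f"{name}.csv").write_text("".join(f"{e}\n" for e in entries))
    return splits
```

For example, write_splits("./dataset") creates the three csv files in ./dataset. Note that a random split of consecutive video frames can place near-identical images in different subsets, so a split by sequence may be preferable for such data.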

Usage

Clone the project, create a virtual environment and install required dependencies:

git clone https://github.com/kristjansoln/unet-segmentation.git
cd unet-segmentation
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Install torch and torchvision as described in the PyTorch documentation.

Run python3 train.py --help for help on all available arguments. A few usage examples are shown below.

python3 train.py --train --test # Run both training and testing with default arguments
python3 train.py --train --batchsize 4 --imagesize 288 512 --epochs 150 # Run training only, with modified batch size, image size and number of epochs
python3 train.py --test --testcsv ./dataset2/test.csv --imagesize 288 512 # Run test only. Weights are loaded from ./output/weights.pth

Logs are stored in ./output/training.log. When training, some graphs and model weights are saved to ./output. Training can be safely interrupted with Ctrl+C.

During testing, weights are loaded from the default location (./output/weights.pth). Generated output masks are stored in ./output/generated_masks.

Dataset augmentations can be modified. By default, images in the train dataset are randomly flipped horizontally, rotated about the center by up to 5 degrees, randomly changed in brightness and contrast, and normalized. See Albumentations for more.
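As a rough illustration of the default augmentations described above, a comparable Albumentations pipeline could look like the sketch below. It is not copied from this repository: the function name build_train_transform, the probabilities, the order of the transforms and the use of the library's default normalization statistics are all assumptions.

```python
MAX_ROTATION_DEG = 5  # rotation limit taken from the description above


def build_train_transform(flip_p=0.5, rotate_p=0.5, color_p=0.5):
    """Build a training-time augmentation pipeline (probabilities are assumed)."""
    # Imported lazily so this module can be loaded without Albumentations installed.
    import albumentations as A

    return A.Compose([
        A.HorizontalFlip(p=flip_p),
        A.Rotate(limit=MAX_ROTATION_DEG, p=rotate_p),
        A.RandomBrightnessContrast(p=color_p),
        # Library defaults (ImageNet mean and std); the repository may use other values.
        A.Normalize(),
    ])
```

The pipeline is applied to an image and its mask together, for example augmented = build_train_transform()(image=image, mask=mask), so that geometric transforms stay aligned; the results are available as augmented["image"] and augmented["mask"].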

The initial learning rate is set to 10E-4. A ReduceLROnPlateau scheduler reduces the learning rate to 10E-5 once the validation loss plateaus. When the validation loss plateaus a second time, an early stopping mechanism ends the training. The patience in both cases is set to 5 epochs.
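The interaction between the scheduler and early stopping can be illustrated with a small pure-Python simulation. This is a simplified sketch, not the training code: the function name simulate_schedule is hypothetical, a single best-loss tracker is shared by both mechanisms, and a plateau is assumed to mean more than patience consecutive epochs without improvement.

```python
def simulate_schedule(val_losses, initial_lr=10e-4, reduced_lr=10e-5, patience=5):
    """Return (learning rate used in each epoch, epoch at which training stops or None)."""
    lr = initial_lr
    best = float("inf")
    bad_epochs = 0      # consecutive epochs without improvement
    reduced = False     # whether the learning rate has already been lowered
    lrs = []
    for epoch, loss in enumerate(val_losses):
        lrs.append(lr)
        if loss < best:
            best = loss
            bad_epochs = 0
            continue
        bad_epochs += 1
        if bad_epochs > patience:
            if reduced:
                return lrs, epoch   # second plateau: early stop
            lr = reduced_lr         # first plateau: lower the learning rate
            reduced = True
            bad_epochs = 0
    return lrs, None
```

With a validation loss that improves for two epochs and then stays flat, the learning rate drops after six epochs without improvement and training stops after six more.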
