Computer Vision (Façade Segmentation Optimization)

Introduction
Research Questions
Underlying Data
Methodology & Proceeding
Results
Authors

Background

In 2021, Switzerland launched a comprehensive climate strategy aimed at achieving net-zero emissions by 2050, with specific reduction targets in sectors such as buildings, transport, and agriculture.
The building sector is a focal point as it consumes about 40% of the nation's total energy and emits one-third of its domestic CO2. To meet the 2050 targets, the country needs to gather improved data on the condition of buildings and their owners, focusing especially on thermal insulation quality.

Traditional assessment methods like thermal imaging face challenges due to the large number of buildings and limited equipment. However, infrared thermography has emerged as a promising technique for large-scale assessments, helping to identify buildings with poor insulation and high energy loss.

To conduct these large-scale assessments effectively, a critical preliminary step is to segment the building façade into its parts, such as windows and balconies. This segmentation allows thermal image analysis to target only the relevant areas of the façade.

Introduction

In initial, heuristic tests, two deep learning methods showed promising results for segmenting building façades:

Mask R-CNN (Detectron2 implementation from Meta/FAIR)
YOLOv8 (from Ultralytics)

The primary objective of this research project is to identify the most effective deep learning method to segment the buildings into:
Façade
Window
Balcony
...

This will involve optimizing both approaches through additional image preparation steps and then comparing and validating these methods using appropriate metrics and techniques.

Research Questions

The research questions are as follows:

Do CLAHE, greyscale, and built-in augmentation techniques influence the mAP50, segmentation, and class loss of deep learning models Detectron2 (Mask R-CNN) and Ultralytics (YOLOv8) in the task of building façade segmentation?
To what extent do optimized hyperparameters enhance the performance of Detectron2 (Mask R-CNN) and Ultralytics (YOLOv8) in segmenting building façades in terms of mAP50, segmentation, and class loss?
In the context of building façade segmentation, what is the comparative effectiveness of Detectron2 (Mask R-CNN) and Ultralytics (YOLOv8) in terms of mAP50, segmentation and class loss?

The following metrics were used to evaluate model performance:

Loss: A measure of the model's optimization performance, with lower values indicating better convergence during training.
Segmentation Loss: Similar to the loss, focusing on how well the model segments objects in the validation dataset.
mAP50 (segmentation mask): mAP (mean Average Precision) at a 50% Intersection over Union (IoU) threshold. Measures precision and recall for segmentation tasks, with higher values indicating better segmentation performance.
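
To make the 50% IoU threshold concrete, the following minimal sketch computes the IoU of two binary masks and checks whether a prediction would count as a match at that threshold. The masks are plain nested lists and the function names are illustrative assumptions; neither framework exposes these helpers under these names.

```python
def mask_iou(pred, target):
    """Intersection over Union of two equally sized binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks have nothing to overlap; treating that as 0.0 is an assumption.
    return intersection / union if union else 0.0


def is_match_at_50(pred, target):
    """A predicted mask passes the mAP50 overlap test if its IoU is at least 0.5."""
    return mask_iou(pred, target) >= 0.5


# Toy example: the masks share 2 pixels and together cover 6.
pred = [[1, 1, 0],
        [1, 1, 0]]
target = [[0, 1, 1],
          [0, 1, 1]]
iou = mask_iou(pred, target)
```

This only shows the overlap test. A full mAP50 computation additionally ranks predictions by confidence, matches each ground-truth instance at most once, and averages the resulting AP over all classes.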

These metrics are selected based on:

Quantitative Assessment: mAP50 and loss metrics offer a numerical way to evaluate the performance of segmentation models. mAP50 assesses object segmentation, while loss measures optimization progress.
Objective Comparison: Loss metrics provide an objective, consistent basis for comparing segmentation models, aiding in selecting the best one.
Informative Evaluation: Multiple metrics offer a comprehensive view of segmentation model performance, helping understand trade-offs and make informed decisions.

Underlying Data

The dataset is hosted on Roboflow - Project: building-facade-segmentation-instance.
It contains 598 annotated images of building façades.

Classes:

balcony-fence
car
facade
fence
non-building-infrastructure
shop
street
traffic-infrastructure
vegetation
window

No additional datasets were considered or used for testing/validation.

Detectron2

Part 1 to 5:

YOLOv8

Model: https://docs.ultralytics.com/models/yolov8/
Segmentation: https://docs.ultralytics.com/tasks/segment/
Train: https://docs.ultralytics.com/modes/train/
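
As a rough orientation for the linked documentation, a YOLOv8 segmentation run with the Ultralytics Python API might look like the sketch below. The dataset file `data.yaml`, the checkpoint `yolov8n-seg.pt`, the epoch count, and the image size are placeholder assumptions, not the settings used in this project, and the helper names are hypothetical.

```python
def build_train_args(data_yaml, epochs=100, imgsz=640):
    """Collect keyword arguments for YOLO.train(); all values are placeholders."""
    return {"data": data_yaml, "epochs": epochs, "imgsz": imgsz}


def train_and_validate(data_yaml, weights="yolov8n-seg.pt"):
    # Imported lazily so the helper above stays usable without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # pretrained segmentation checkpoint
    model.train(**build_train_args(data_yaml))
    metrics = model.val()  # evaluates on the validation split named in data_yaml
    return metrics.seg.map50  # mask mAP at an IoU threshold of 0.5


train_args = build_train_args("data.yaml")
```

Calling `train_and_validate("data.yaml")` would start an actual training run, so it is left out of the example.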

Methodology & Proceeding

Visual and exploratory dataset analysis was initially performed through a Jupyter Notebook.
Subsequently, the following steps were executed for both YOLO and Mask-R-CNN frameworks:

Training the "base" model.
Training with CLAHE (Contrast Limited Adaptive Histogram Equalization) enhancements.
Training in greyscale.
Selection of the best "base" model based on AP50 (Average Precision at 50% Intersection over Union).
Incorporating framework augmentations, such as flip, crop, and rotation.
Comparing the performance of the best base model with the framework-augmented model, again using AP50 as the metric.
Applying hyperparameter tuning to further optimize the best model's AP50.
Training a YOLO model with the best-performing hyperparameters.
Comparing the performance of YOLOv8 with the Mask-R-CNN model, once again using AP50 as the evaluation metric.
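
The three input variants from the first steps (base, CLAHE, greyscale) could be produced with OpenCV roughly as follows. This is a sketch under assumptions: the clip limit and tile size are common OpenCV defaults rather than the project's values, applying CLAHE to the lightness channel of LAB is one of several possible choices, and `preprocess` is a hypothetical helper name.

```python
VARIANTS = ("base", "clahe", "greyscale")


def preprocess(image_bgr, variant):
    """Return the image prepared for one of the three training variants."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    if variant == "base":
        return image_bgr  # unchanged input

    import cv2  # only needed for the two transformed variants

    if variant == "greyscale":
        grey = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        # Back to three channels so pretrained colour backbones still accept it.
        return cv2.cvtColor(grey, cv2.COLOR_GRAY2BGR)

    # CLAHE on the lightness channel only, so colours are not shifted.
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2LAB)
    lightness, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    lab = cv2.merge((clahe.apply(lightness), a, b))
    return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
```

Keeping the variant name as an explicit argument makes it straightforward to log which preprocessing a given training run used.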

A consistent train/valid/test split was used across all training runs. k-fold cross-validation is planned for a later stage and has not been implemented yet.
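
Since k-fold cross-validation is not part of the repository yet, the following is only a sketch of how fold indices for the 598 images could be generated. The choice of k=5, the fixed seed, and the function name are assumptions made for illustration.

```python
import random


def k_fold_splits(n_items, k=5, seed=0):
    """Yield (train_indices, valid_indices) for each of k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, valid in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, valid


splits = list(k_fold_splits(598, k=5))
```

Each image index appears in exactly one validation fold, so every image is used for validation once and for training in the remaining folds.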

Results

Results were visualized in CometML and can be found under: https://www.comet.com/syx/hslu-computer-vision/reports/hslu-computer-vision-facade-segmentation

Overall, the YOLOv8 Augmented Model performed better than the Mask R-CNN Model in the observed metrics.

Authors

Lukas Zurbriggen & Tim Giger
Hochschule Luzern, M.Sc. in Applied Information & Data Science
Module: Computer Vision

