🚁 Semantic Segmentation of Aerial Drone Imagery

A deep learning project that performs semantic segmentation of aerial drone imagery with a custom U-Net architecture, segmenting urban scenes into roads, buildings, vegetation, trees, and cars.


📊 Project Results

My model achieves strong performance on real-world aerial data and is tuned specifically to detect rare classes such as trees and cars.

| Metric | Score | Description |
|----------------|-------|------------------------------------------------|
| Mean IoU | 47.9% | Intersection over Union, averaged over classes |
| F1 Score | 62.2% | Harmonic mean of precision and recall |
| Pixel Accuracy | 76.5% | Global pixel-wise accuracy |
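All three metrics can be derived from a per-class confusion matrix. A minimal sketch (this is not the repository's evaluation code, just an illustration of the definitions):

```python
import numpy as np

def segmentation_metrics(conf):
    """Compute mean IoU, mean F1, and pixel accuracy from a
    (num_classes x num_classes) confusion matrix where conf[i, j]
    counts pixels of true class i predicted as class j."""
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp   # predicted as class c, but wrong
    fn = conf.sum(axis=1) - tp   # true class c, but missed
    iou = tp / np.maximum(tp + fp + fn, 1)      # per-class IoU
    f1 = 2 * tp / np.maximum(2 * tp + fp + fn, 1)  # per-class F1 (Dice)
    pixel_acc = tp.sum() / conf.sum()           # global accuracy
    return iou.mean(), f1.mean(), pixel_acc
```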

Class-wise Performance

| Class | IoU Score | Status |
|---------------|-----------|-------------------------|
| 🛣️ Roads | 76.9% | ✅ Excellent |
| 🌿 Vegetation | 67.1% | ✅ Excellent |
| 🏢 Buildings | 56.0% | ✅ Good |
| 🚗 Cars | 28.2% | 🚀 Great (up from 0%) |
| 🌳 Trees | 22.5% | 🚀 Great (up from 0.3%) |

📂 Dataset

This project uses the Semantic Drone Dataset from Kaggle.

  • Source: Aerial drone photography of urban environments.
  • Content: 400 high-resolution RGB images.
  • Labels: Pixel-precise semantic masks (24 classes originally, simplified to 6 for this project).

Class Mapping

I simplified the original 24 classes into 6 core categories to improve training stability:

  1. Roads (paved areas, dirt, gravel)
  2. Buildings (roofs, walls, fences)
  3. Vegetation (grass, low vegetation)
  4. Trees (trees, bushes)
  5. Cars (cars, bicycles)
  6. Background (water, people, obstacles, unlabeled)
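A common way to apply such a grouping is a lookup table indexed by the raw label IDs. A sketch with illustrative raw IDs; the actual 24-class assignments live in prepare_kaggle_data.py:

```python
import numpy as np

# Hypothetical grouping: the raw label IDs below are illustrative only,
# not the dataset's real 24-class numbering.
CLASS_GROUPS = {
    0: [1, 2, 3],    # Roads: paved area, dirt, gravel
    1: [8, 9, 10],   # Buildings: roof, wall, fence
    2: [4, 5],       # Vegetation: grass, low vegetation
    3: [6, 7],       # Trees: tree, bush
    4: [17, 18],     # Cars: car, bicycle
}

def build_lut(num_raw=24, background=5):
    """Map each raw label ID to a core class; everything not listed
    falls into Background."""
    lut = np.full(num_raw, background, dtype=np.uint8)
    for core, raw_ids in CLASS_GROUPS.items():
        lut[raw_ids] = core
    return lut

def remap_mask(mask, lut):
    # Vectorized remap: index the LUT by the raw mask values.
    return lut[mask]
```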

🧠 Model Architecture

I implemented a U-Net architecture from scratch in PyTorch.

  • Encoder: 4 downsampling blocks (Conv2d + BatchNorm + ReLU + MaxPool).
  • Decoder: 4 upsampling blocks with skip connections to preserve spatial detail.
  • Output: 1x1 convolution mapping features to 6 per-class scores (logits).
  • Input Size: 128x128 (optimized for CPU training).
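The architecture above can be sketched as follows. Layer widths (base=64) and the exact block layout are assumptions for illustration, not a copy of model.py:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions, each followed by BatchNorm and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    def __init__(self, in_ch=3, num_classes=6, base=64):
        super().__init__()
        chs = [base, base * 2, base * 4, base * 8]
        # Encoder: 4 downsampling stages.
        self.enc = nn.ModuleList()
        prev = in_ch
        for c in chs:
            self.enc.append(conv_block(prev, c))
            prev = c
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(chs[-1], chs[-1] * 2)
        # Decoder: 4 upsampling stages with skip connections.
        self.up = nn.ModuleList()
        self.dec = nn.ModuleList()
        prev = chs[-1] * 2
        for c in reversed(chs):
            self.up.append(nn.ConvTranspose2d(prev, c, 2, stride=2))
            self.dec.append(conv_block(c * 2, c))  # concat doubles channels
            prev = c
        self.head = nn.Conv2d(chs[0], num_classes, 1)  # 1x1 conv to logits

    def forward(self, x):
        skips = []
        for block in self.enc:
            x = block(x)
            skips.append(x)
            x = self.pool(x)
        x = self.bottleneck(x)
        for up, dec, skip in zip(self.up, self.dec, reversed(skips)):
            x = up(x)
            x = dec(torch.cat([x, skip], dim=1))
        return self.head(x)
```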

Key Improvements

To handle class imbalance (e.g., Roads are 67% of pixels, Cars are 0.2%), I implemented:

  1. Weighted Cross Entropy Loss: Heavily penalized missing rare classes (Cars: 50x weight, Trees: 15x weight).
  2. Data Augmentation: Random horizontal/vertical flips, rotations (±15°), and brightness/contrast adjustments.
  3. Early Stopping: Monitored validation loss with patience of 7 epochs.
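Weighted cross-entropy in PyTorch takes a per-class weight tensor. Only the Cars (50x) and Trees (15x) multipliers come from the description above; the remaining weights are illustrative:

```python
import torch
import torch.nn as nn

# Class order: Roads, Buildings, Vegetation, Trees, Cars, Background.
# Weights other than Trees (15x) and Cars (50x) are assumed.
class_weights = torch.tensor([1.0, 1.0, 1.0, 15.0, 50.0, 1.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)

# logits: (N, 6, H, W) raw class scores; target: (N, H, W) integer labels.
logits = torch.randn(2, 6, 128, 128)
target = torch.randint(0, 6, (2, 128, 128))
loss = criterion(logits, target)  # scalar, averaged with per-class weights
```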

🛠️ Installation

  1. Clone the repository

     ```shell
     git clone https://github.com/yourusername/drone-segmentation.git
     cd drone-segmentation
     ```

  2. Install dependencies

     ```shell
     pip install -r requirements.txt
     ```

  3. Download the dataset (requires a Kaggle API key, kaggle.json)

     ```shell
     python download_kaggle_dataset.py
     ```

     Alternatively, download manually from Kaggle and place the files in data/kaggle_raw.

  4. Prepare the data (converts the raw Kaggle masks to the 6-class format)

     ```shell
     python prepare_kaggle_data.py
     ```

🚀 Usage

Training

Train the model on your local machine (CPU/GPU).

```shell
# Train for 50 epochs with batch size 8
python train_fast.py --data data/real --epochs 50 --batch_size 8
```

The best model checkpoint is saved to outputs/best_model_fast.pth.

Evaluation

Evaluate the trained model on the test set and generate visualization overlays.

```shell
python evaluate_real.py --data data/real
```

Results are saved to outputs/real_drone_report.md and outputs/overlays_real/.
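An overlay can be generated by indexing a color palette with the predicted mask and alpha-blending it over the input image. The palette below is illustrative; the repository's actual colors (in drone_segmentation/utils.py) may differ:

```python
import numpy as np

# Hypothetical RGB palette for the 6 classes.
PALETTE = np.array([
    [128, 64, 128],  # Roads
    [70, 70, 70],    # Buildings
    [107, 142, 35],  # Vegetation
    [0, 100, 0],     # Trees
    [0, 0, 142],     # Cars
    [0, 0, 0],       # Background
], dtype=np.uint8)

def overlay(image, mask, alpha=0.5):
    """Blend a color-coded class mask over an RGB uint8 image (H, W, 3)."""
    color = PALETTE[mask]  # (H, W, 3): per-pixel class color
    blended = (1 - alpha) * image + alpha * color
    return blended.astype(np.uint8)
```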


📁 Project Structure

```
.
├── drone_segmentation/      # 📦 Core Python package
│   ├── data.py             # Dataset loading & augmentation
│   ├── model.py            # U-Net architecture definition
│   └── utils.py            # Visualization helpers
├── data/                   # 💾 Data storage
│   ├── kaggle_raw/         # Original downloaded data
│   └── real/               # Processed ready-to-train data
├── outputs/                # 📤 Results
│   ├── best_model_fast.pth # Trained model weights
│   └── overlays_real/      # Segmentation visualization images
├── train_fast.py           # 🚂 Training script
├── evaluate_real.py        # 📊 Evaluation script
├── prepare_kaggle_data.py  # 🔄 Data preprocessing script
└── requirements.txt        # 📋 Dependencies
```

🔮 Future Improvements

  • GPU Training: Scale up to 512x512 image resolution.
  • Advanced Models: Implement DeepLabV3+ or SegFormer.
  • More Classes: Separate 'Water' and 'Structures' into their own classes.

Created by Muhammad Faheem Arshad
