
LUPI-OD - A novel methodology to improve object detection accuracy without increasing model size or complexity. | Master of Science Dissertation | University of Malta


Learning Using Privileged Information for Object Detection

This repository contains LUPI-OD, the first method to apply Learning Using Privileged Information (LUPI) to Object Detection (OD). It improves performance without increasing model size, making it ideal for applications that demand lightweight, efficient solutions.

Python 3.9+ MIT License

Datasets Used for Evaluation:

Bottles in the Wild | UAVVaste | AVCLE | SODA | PASCAL VOC 2012

Popular Object Detection Models Used:

Faster R-CNN | SSD | RetinaNet | SSDLite | FCOS


📄 Abstract

Object detection is widely recognised as a foundational task within computer vision, with applications spanning automation, medical imaging, and surveillance. Although numerous models and methods have been developed, attaining high detection accuracy often requires the utilisation of complex model architectures, especially those based on transformers. These models typically demand extensive computational resources for inference and large-scale annotated datasets for training, both of which contribute to the overall difficulty of the task.

To address these challenges, this work introduces a novel methodology incorporating the Learning Using Privileged Information (LUPI) paradigm within the object detection domain. The proposed approach is compatible with any object detection architecture and operates by introducing privileged information to a teacher model during training. This information is then distilled into a student model, resulting in more robust learning and improved generalisation without increasing the number of model parameters and complexity.

The methodology is evaluated on general-purpose object detection tasks and a focused case study involving litter detection in visually complex, highly variable outdoor environments. These scenarios are especially challenging due to the target objects' small size and inconsistent appearance. Evaluation is conducted both within individual datasets and across multiple datasets to assess consistency and generalisation. A total of 120 models are trained, covering five well-established object detection architectures. Four datasets are used in the evaluation: three focused on UAV-based litter detection and one drawn from the Pascal VOC 2012 benchmark to assess performance in multi-label detection and generalisation.

Experimental results demonstrate consistent improvements in detection accuracy across all model types and dataset conditions when the LUPI framework is employed. Notably, the approach yields increases of 0.02 to 0.15 in the strict mean Average Precision (mAP)@50-95 metric, highlighting its robustness across both general-purpose and domain-specific tasks. Performance improvements were observed in nearly all cases, and they are achieved without increasing the number of parameters or altering the model architecture, supporting the methodology's viability as a lightweight and effective modification to existing object detection systems.

⚙️ LUPI-OD Architecture

*(LUPI-OD architecture diagrams)*

🧪 Methodology

This method leverages the Learning Using Privileged Information (LUPI) paradigm to boost object detection performance by providing extra supervision during training. Privileged information is fed to a teacher model and then distilled into a student model. The key steps are:

  1. Generating Privileged Information:
    For every image, a single-channel bounding box mask is created as additional supervisory input.

  2. Training the Teacher Model:
    The teacher model receives both the original image and the privileged mask as multi-channel input. It is trained to predict object classes alongside the bounding box masks.

  3. Distilling Knowledge to the Student Model:
    The student model learns from the teacher’s soft labels. A loss function based on cosine distance between the final backbone layer features of both models guides the student to match the teacher’s internal representations.
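Steps 1 and 2 can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's API: the function name `make_bbox_mask` and the `(x1, y1, x2, y2)` pixel-coordinate box format are assumptions for the example.

```python
import numpy as np

def make_bbox_mask(height, width, boxes):
    """Create a single-channel binary mask with 1s inside each bounding box.

    boxes: iterable of (x1, y1, x2, y2) pixel coordinates.
    """
    mask = np.zeros((height, width), dtype=np.float32)
    for x1, y1, x2, y2 in boxes:
        mask[y1:y2, x1:x2] = 1.0
    return mask

# Example: a 4x6 image with one box covering rows 0-2, columns 1-3.
mask = make_bbox_mask(4, 6, [(1, 0, 4, 3)])
print(mask.sum())  # 9.0 — nine pixels inside the box

# The teacher's input stacks the RGB image with the privileged mask
# channel, giving a multi-channel (H, W, 4) array.
image = np.random.rand(4, 6, 3).astype(np.float32)
teacher_input = np.concatenate([image, mask[..., None]], axis=-1)
print(teacher_input.shape)  # (4, 6, 4)
```

At inference time the student receives only the three image channels, so deployment cost is unchanged.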

```mermaid
%%{init: {
  "themeVariables": {
    "fontSize": "16px",
    "edgeLabelFontSize": "14px",
    "edgeLabelColor": "#37474F",
    "primaryColor": "#6A1B9A",
    "primaryBorderColor": "#4A148C",
    "secondaryColor": "#81C784",
    "secondaryBorderColor": "#388E3C",
    "tertiaryColor": "#FFB74D",
    "tertiaryBorderColor": "#F57C00",
    "background": "#FFFFFF",
    "textColor": "#212121"
  }
}}%%

flowchart TD
  classDef privileged fill:#FFB74D,stroke:#F57C00,stroke-width:2px,color:#5D4037,font-weight:bold;
  classDef teacher fill:#81C784,stroke:#388E3C,stroke-width:2px,color:#1B5E20,font-weight:bold;
  classDef student fill:#6A1B9A,stroke:#4A148C,stroke-width:2px,color:#D1C4E9,font-weight:bold;
  classDef step fill:#E3F2FD,stroke:#90CAF9,stroke-width:1px,color:#0D47A1;

  %% Step 1 - Privileged Info
  PI[Step 1: Generate Privileged Information]:::privileged
  PI1[Create single-channel bounding box mask for each image]:::step

  %% Step 2 - Teacher Model Training
  TM[Step 2: Train Teacher Model]:::teacher
  TM1["Input: Original image and privileged mask (multi-channel)"]:::step
  TM2[Output: Predict object classes and bounding box masks]:::step

  %% Step 3 - Student Distillation
  SK[Step 3: Distill Knowledge to Student Model]:::student
  SK1[Train student on teacher's soft labels]:::step
  SK2[Use cosine distance loss on latent features to align representations]:::step

  %% Layout connections
  PI --> PI1
  PI1 --> TM
  TM --> TM1
  TM --> TM2

  TM1 --> SK
  TM2 --> SK

  SK --> SK1
  SK --> SK2
```
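The distillation term in Step 3 measures the cosine distance between the final backbone features of student and teacher. A minimal NumPy sketch of that term follows; the function name `cosine_distance_loss` and the per-sample feature shapes are illustrative assumptions, not the repository's implementation:

```python
import numpy as np

def cosine_distance_loss(student_feats, teacher_feats, eps=1e-8):
    """Mean cosine distance (1 - cosine similarity) between per-sample
    feature vectors, e.g. flattened final backbone layer activations."""
    s = student_feats.reshape(len(student_feats), -1)
    t = teacher_feats.reshape(len(teacher_feats), -1)
    sim = np.sum(s * t, axis=1) / (
        np.linalg.norm(s, axis=1) * np.linalg.norm(t, axis=1) + eps
    )
    return float(np.mean(1.0 - sim))

# Identical features give (near-)zero loss; orthogonal features give 1.
a = np.array([[1.0, 0.0], [0.0, 1.0]])
print(cosine_distance_loss(a, a))           # ~0.0 (up to eps)
print(cosine_distance_loss(a, a[:, ::-1]))  # 1.0
```

Minimising this distance pulls the student's internal representations toward the teacher's, which is how the privileged signal reaches a student that never sees the mask channel.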

🎯 Contributions of This Research

  • Introducing LUPI to Object Detection
    This research demonstrates how integrating the Learning Using Privileged Information (LUPI) paradigm into object detection—particularly for litter detection—can enhance performance without changing the model architecture or affecting inference speed.

  • Enhanced Accuracy in Litter Detection and Localisation
    Results show significant improvements in detecting litter, especially smaller objects. The approach yields stronger gains in binary object localisation and also improves multi-label detection performance.

  • Model-Agnostic Improvements
    The approach works effectively across multiple detection models without increasing the number of parameters or inference time. While training time rises due to the teacher model, inference remains efficient during deployment.

  • Strong Generalisation Across Litter Datasets
    Extensive testing confirms that the approach generalises well within the primary litter detection dataset and across others, improving detection of small and partially occluded objects in varied scenarios.

  • Broader Impact on Object Detection Tasks
    Beyond litter detection, the technique enhances multi-label detection performance on general object detection datasets. However, accuracy tends to decrease as the number of object classes grows.

📈 Main Detection Results

🛩️ UAV-Based Litter Detection: Within-Dataset Evaluation

SODA: Small Objects at Different Altitudes (Low-Altitudes)

*(Results figure)*

Visual Results: Baseline Results | Student Results (Ours)

SODA: Small Objects at Different Altitudes (All-Altitudes)

*(Results figure)*

Visual Results: Baseline Results | Student Results (Ours)


🌍 UAV-Based Litter Detection: Across-Dataset Evaluation

BDW: Bottle Detection in the Wild Using Low-Altitude Unmanned Aerial Vehicles

*(Results figure)*

Visual Results: Baseline Results | Student Results (Ours)

UAVVaste: Vision‐Based Trash and Litter Detection in Low Altitude Aerial Images

*(Results figure)*

Visual Results: Baseline Results | Student Results (Ours)


🏷️ Multi-label Object Detection: Pascal VOC 2012 Evaluation

*(Results figure: Pascal VOC)*

*(Confusion matrix: Pascal VOC)*

Visual Results: Baseline Results | Student Results (Ours)


🧮 Model Size Comparison on Pascal VOC 2012

| Model | Config | Size (MB) | Params (M) | Classes | Channels |
|---|---|---|---|---|---|
| Faster R-CNN | Baseline | 157.92 | 41.40 | 21 | 3 |
| Faster R-CNN | Student | 157.92 | 41.40 | 21 | 3 |
| RetinaNet | Baseline | 124.22 | 32.56 | 21 | 3 |
| RetinaNet | Student | 124.22 | 32.56 | 21 | 3 |
| FCOS | Baseline | 122.48 | 32.11 | 21 | 3 |
| FCOS | Student | 122.48 | 32.11 | 21 | 3 |
| SSD | Baseline | 100.27 | 26.29 | 21 | 3 |
| SSD | Student | 100.27 | 26.29 | 21 | 3 |
| SSDLite | Baseline | 9.42 | 2.47 | 21 | 3 |
| SSDLite | Student | 9.42 | 2.47 | 21 | 3 |
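Because the student keeps the baseline architecture unchanged, its parameter count can be checked directly. A small PyTorch sketch of such a check, using a toy model rather than the detectors above:

```python
import torch.nn as nn

def count_parameters(model):
    """Total number of trainable parameters in a model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Toy stand-ins: the student is architecturally identical to the baseline,
# so both report the same count, mirroring the table above.
baseline = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 4, 1))
student = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 4, 1))

print(count_parameters(baseline))  # 260
print(count_parameters(student))   # 260
```

The same function applied to, say, a torchvision Faster R-CNN baseline and its LUPI-OD student would return identical counts, since only the training signal differs.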

🔍 Other Detection Results

🛠️ Preliminary Experiment for Privileged Information Selection

*(Explored Privileged Information Channels)*

*(Preliminary Experiment Results)*


📊 Teacher Model Performance on Pascal VOC 2012

All results shown below reflect the performance of teacher models across key object detection metrics:

| Model | mAP@50-95 | mAP@50 | mAP@75 | mAR@1 | mAR@10 | mAR@100 | Precision | Recall | F1 Score |
|---|---|---|---|---|---|---|---|---|---|
| RetinaNet | 0.77 | 0.86 | 0.79 | 0.60 | 0.81 | 0.81 | 0.26 | 0.90 | 0.38 |
| FCOS 🥈 | 0.80 | 0.88 | 0.82 | 0.61 | 0.84 | 0.84 | 0.43 | 0.91 | 0.56 |
| Faster R-CNN 🥇 | 0.77 | 0.91 | 0.82 | 0.59 | 0.82 | 0.82 | 0.56 | 0.91 | 0.68 |
| SSD | 0.42 | 0.56 | 0.49 | 0.41 | 0.48 | 0.48 | 0.25 | 0.69 | 0.36 |
| SSDLite | 0.49 | 0.61 | 0.54 | 0.46 | 0.55 | 0.55 | 0.04 | 0.79 | 0.07 |

💾 Installation

Prerequisites

Python 3.9+
CUDA-capable GPU (recommended)

Clone the Repository

```bash
git clone https://github.com/mbar0075/lupi-for-object-detection.git
cd lupi-for-object-detection
pip install -r requirements.txt
```

🎓 About This Research

This research was carried out at the University of Malta and submitted in partial fulfilment of the requirements for the Master of Science Degree by Research. It was supervised by Dr. Dylan Seychell and Dr. Konstantinos Makantasis. The full master’s dissertation, which includes the research question, background, methodology, evaluation, and analysis, can be downloaded below.

📘 Citation

🎓 Dissertation

```bibtex
@mastersthesis{bartolo2025privilegedinfo,
  title={Investigating the Role of Learning using Privileged Information in Object Detection},
  author={Bartolo, Matthias},
  type={{M.Sc.} thesis},
  year={2025},
  school={University of Malta}
}
```

📄 Research Paper

The main findings of this research have also been accepted at the 2025 IEEE 13th European Workshop on Visual Information Processing (EUVIP 2025).

```bibtex
@misc{bartolo2025learningusingprivilegedinformation,
  title={Learning Using Privileged Information for Litter Detection},
  author={Matthias Bartolo and Konstantinos Makantasis and Dylan Seychell},
  year={2025},
  eprint={2508.04124},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2508.04124},
}
```

🪪 License

This project is licensed under the MIT License. See the LICENSE file for details.

✉️ Contact

For questions, collaboration, or feedback, please contact Matthias Bartolo.
