
AgenticLabeling

AI-Powered Automatic Labeling Platform

Microservices-based auto-labeling system using Florence-2, SAM2, and DINOv2

Features · Architecture · Quick Start · Documentation · License



Overview

AgenticLabeling is a comprehensive AI-powered automatic labeling platform built on a microservices architecture. It combines state-of-the-art vision models to provide end-to-end object detection, segmentation, classification, and tracking capabilities.

Key Capabilities

  • Auto-Labeling Pipeline: Image → Detection → Segmentation → Classification → Registry
  • Video Processing: Frame extraction, Re-ID tracking, trajectory visualization
  • Model Training: YOLO training with MLflow experiment tracking
  • Quality Assurance: Streamlit-based validation UI with track visualization
  • Dataset Export: YOLO and COCO format support
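The auto-labeling pipeline above can be sketched as a plain orchestration function that chains the four stages. This is an illustrative skeleton only, not the gateway's actual code; the `LabeledObject` fields and the callable stage signatures are assumptions based on the pipeline description:

```python
from dataclasses import dataclass, field

@dataclass
class LabeledObject:
    bbox: tuple                 # (x, y, w, h) from the detection stage
    label: str = ""             # class name grounded by the detection prompt
    mask: object = None         # instance mask from the segmentation stage
    embedding: list = field(default_factory=list)  # DINOv2-style vector

def run_pipeline(image, prompt, detect, segment, classify, register):
    """Chain detection -> segmentation -> classification -> registry.

    Each stage argument is a callable standing in for the corresponding agent.
    """
    objects = [LabeledObject(bbox=b, label=l) for b, l in detect(image, prompt)]
    for obj in objects:
        obj.mask = segment(image, obj.bbox)
        obj.embedding = classify(image, obj.bbox)
    register(objects)
    return objects
```

In the real system each callable is an HTTP call to the corresponding service behind the gateway; the skeleton just makes the data dependencies between stages explicit.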

Features

AI Models

| Model | Task | Description |
|-------|------|-------------|
| Florence-2 | Detection | Open-vocabulary object detection with grounding |
| SAM2 | Segmentation | Instance segmentation with fine masks |
| DINOv2 | Classification | Visual embeddings for similarity search |
| YOLO | Training | Custom model training and inference |

Core Features

  • Object Registry: SQLite + ChromaDB for structured data and vector search
  • Evaluation Agent: mAP, mAP50-95, Confusion Matrix metrics
  • Re-ID Tracker: Appearance-based object tracking across frames
  • Embedding Search: LRU cached similarity search with batch support
  • Track Visualization: Trajectory and timeline views
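
The LRU-cached embedding search can be sketched in a few lines of plain Python. This is a simplified stand-in for the registry's actual ChromaDB-backed implementation; `REGISTRY`, `_search_cached`, and `search_similar` are illustrative names:

```python
import math
from functools import lru_cache

# Stand-in for the ChromaDB collection: obj_id -> embedding (stored as a tuple).
REGISTRY = {}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

@lru_cache(maxsize=1024)
def _search_cached(query: tuple, top_k: int):
    # Tuples are hashable, so repeated queries are served from the LRU cache
    # instead of re-scanning the whole registry.
    scored = [(obj_id, cosine_similarity(query, emb)) for obj_id, emb in REGISTRY.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

def search_similar(embedding, top_k=10):
    return _search_cached(tuple(embedding), top_k)
```

One caveat the sketch makes visible: a cache like this goes stale when the registry is updated, so inserts need to invalidate it (e.g. `_search_cached.cache_clear()`).
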
Feature diagram:

┌─────────────────────────────────────────────────────────────────┐
│                        API Gateway (8000)                        │
├─────────────────────────────────────────────────────────────────┤
│  /auto-label  │  /detect  │  /segment  │  /train  │  /evaluate  │
└───────┬───────┴─────┬─────┴─────┬──────┴────┬─────┴──────┬──────┘
        │             │           │           │            │
        ▼             ▼           ▼           ▼            ▼
   ┌─────────┐  ┌──────────┐ ┌─────────┐ ┌─────────┐ ┌──────────┐
   │Detection│  │Segment-  │ │Classif- │ │Training │ │Evaluation│
   │ Agent   │  │ation     │ │ication  │ │ Agent   │ │  Agent   │
   │Florence2│  │SAM2      │ │DINOv2   │ │YOLO     │ │mAP/CM    │
   └────┬────┘  └────┬─────┘ └────┬────┘ └────┬────┘ └────┬─────┘
        │            │            │           │           │
        └────────────┴────────────┴───────────┴───────────┘
                                  │
                                  ▼
                    ┌─────────────────────────┐
                    │    Object Registry      │
                    │   SQLite + ChromaDB     │
                    └─────────────────────────┘

Architecture

Services

| Service | Port | Description |
|---------|------|-------------|
| gateway | 8000 | API routing and orchestration |
| detection-agent | 8001 | Florence-2 object detection |
| segmentation-agent | 8002 | SAM2 instance segmentation |
| classification-agent | 8003 | DINOv2 embeddings |
| training-agent | 8005 | YOLO model training |
| evaluation-agent | 8007 | Model evaluation metrics |
| preprocessing-agent | 8008 | Video processing & Re-ID tracking |
| object-registry | 8010 | SQLite + ChromaDB storage |
| data-manager | 8006 | YOLO/COCO dataset export |
| label-studio-lite | 8501 | Streamlit validation UI |
| mlflow | 5000 | Experiment tracking |
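
With this many services, a quick liveness sweep is handy. A minimal sketch in Python, assuming each service exposes a `/health` endpoint like the gateway does (only the gateway's is shown in Quick Start, so the per-service endpoints are an assumption); the fetch callable is injected so the helper stays testable:

```python
SERVICES = {
    "gateway": 8000,
    "detection-agent": 8001,
    "segmentation-agent": 8002,
    "classification-agent": 8003,
    "training-agent": 8005,
    "data-manager": 8006,
    "evaluation-agent": 8007,
    "preprocessing-agent": 8008,
    "object-registry": 8010,
}

def health_urls(host="localhost"):
    """Build the assumed /health URL for every service."""
    return {name: f"http://{host}:{port}/health" for name, port in SERVICES.items()}

def check_all(fetch):
    """fetch(url) -> HTTP status code, e.g. lambda u: httpx.get(u, timeout=2).status_code."""
    return {name: fetch(url) == 200 for name, url in health_urls().items()}
```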

Data Flow

Image/Video Input
       │
       ▼
┌──────────────────┐
│  Preprocessing   │ ← Frame extraction, Video processing
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│    Detection     │ ← Florence-2 grounding
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│   Segmentation   │ ← SAM2 instance masks
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Classification  │ ← DINOv2 embeddings
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Object Registry │ ← Store objects, tracks, embeddings
└────────┬─────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌───────┐
│Export │ │ Train │
│YOLO/  │ │ YOLO  │
│COCO   │ │ Model │
└───────┘ └───────┘
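
The export branch at the bottom of the flow turns registry boxes into the target annotation format. For YOLO, each box becomes a line of normalized center coordinates and size. A sketch of that conversion (the helper name is illustrative, not the data-manager's actual API):

```python
def to_yolo_line(class_id, x, y, w, h, img_w, img_h):
    """Convert a pixel-space (x, y, w, h) box to a normalized YOLO label line.

    YOLO labels store (class, center_x, center_y, width, height), each in [0, 1].
    """
    cx = (x + w / 2) / img_w
    cy = (y + h / 2) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}"
```

For example, a 200x200 box at (100, 100) in a 400x400 image maps to center (0.5, 0.5) with width and height 0.5.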

Quick Start

Prerequisites

  • Python 3.10+
  • Docker & Docker Compose
  • NVIDIA GPU with 8GB+ VRAM (recommended)

Installation

# Clone repository
git clone https://github.com/tygwan/AgenticLabeling.git
cd AgenticLabeling

# Start with Docker Compose
docker-compose up -d

# Check health
curl http://localhost:8000/health

Basic Usage

# Auto-label an image
curl -X POST "http://localhost:8000/auto-label" \
  -F "file=@image.jpg" \
  -F "prompt=person, car, dog"

# Export dataset
curl -X POST "http://localhost:8000/export" \
  -d "dataset_name=my_dataset" \
  -d "format=yolo"

# Train YOLO model
curl -X POST "http://localhost:8000/train/start" \
  -H "Content-Type: application/json" \
  -d '{"dataset_path": "data/datasets/my_dataset", "epochs": 100}'

Access UIs

  • Validation UI (label-studio-lite): http://localhost:8501
  • MLflow experiment tracking: http://localhost:5000


Documentation

| Document | Description |
|----------|-------------|
| Getting Started | Installation and setup guide |
| Architecture Spec | Detailed system architecture |
| Development Progress | Project status and roadmap |
| PRD | Product requirements document |

API Examples

Auto-Labeling

import httpx

# Label an image
with open("image.jpg", "rb") as f:
    response = httpx.post(
        "http://localhost:8000/auto-label",
        files={"file": f},
        data={"prompt": "person, car, dog", "register": True}
    )
    result = response.json()
    print(f"Detected {len(result['data']['objects'])} objects")

Object Search

# Search similar objects by embedding
response = httpx.post(
    "http://localhost:8000/similar",
    json={"embedding": [...], "top_k": 10}
)
similar_objects = response.json()["data"]

Model Evaluation

# Evaluate detection performance
response = httpx.post(
    "http://localhost:8000/evaluate/detection",
    json={
        "predictions": [...],
        "ground_truth": [...],
        "iou_threshold": 0.5
    }
)
metrics = response.json()["data"]
print(f"mAP: {metrics['mAP']:.3f}")
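
The `iou_threshold` parameter decides when a prediction counts as a match against a ground-truth box. IoU itself is straightforward to compute; a sketch with boxes as `(x1, y1, x2, y2)` corners (illustrative only, not the evaluation-agent's internal code):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

At the default threshold of 0.5, two boxes of equal size shifted by half their width (IoU = 1/3) would not match, while a shift of a quarter width would.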

Testing

# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=services --cov-report=html

# Test results
# 120 tests passing (unit + integration)

Project Status

Progress: ████████████████████░ 95%
| Phase | Status | Description |
|-------|--------|-------------|
| Phase 1 | ✅ Complete | Microservices architecture |
| Phase 2 | ✅ Complete | Data management & export |
| Phase 3 | ✅ Complete | Video processing & tracking |
| Phase 4 | ✅ Complete | Training & evaluation |
| Phase 5 | ✅ Complete | Testing (120 tests) |
| Phase 6 | ⏳ Pending | Production deployment |

Contributing

Contributions are welcome! Please read our contributing guidelines before submitting PRs.

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: pytest tests/ -v
  5. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.


Acknowledgments


Made with AI-assisted development

About

Agentic labeling services for computer vision and VLA (vision language model) workflows.
