Automatically detect and highlight matching sock pairs in laundry using computer vision. Uses SAM3 for segmentation and ResNet18 for feature matching.
Laundromat uses a client-server architecture:
- Server: Runs SAM3 + ResNet inference (GPU or CPU). Can be on localhost or a remote machine.
- Client: Captures video/camera, sends frames to server, receives results, performs optical flow tracking locally.
This separation allows the heavy ML models to run on a powerful machine while the client runs on a lightweight device (laptop, phone, etc.).
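A minimal sketch of that split, assuming the /infer endpoint and frame field described in the API section below (the real client in src/laundromat/video_processor.py is more involved):

```python
import time

import cv2        # client-side capture and optical flow
import requests   # talks to the inference server

SERVER = "http://localhost:8080"   # or a remote GPU machine
REFRESH_S = 2.0                    # seconds between inference calls

def infer(frame) -> dict:
    """Send one JPEG-encoded frame to the server and return its JSON result."""
    ok, jpeg = cv2.imencode(".jpg", frame)
    resp = requests.post(f"{SERVER}/infer", files={"frame": jpeg.tobytes()})
    resp.raise_for_status()
    return resp.json()

cap = cv2.VideoCapture(0)
last_result, last_infer = None, 0.0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if time.time() - last_infer >= REFRESH_S:
        last_result = infer(frame)   # heavy ML runs on the server
        last_infer = time.time()
    # Between inference calls the real client updates last_result locally
    # with optical-flow tracking (src/laundromat/tracking.py).
cap.release()
```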
# Download SAM3 model weights (one-time setup)
# Place sam3.pt in server/models/
# Start server with Docker
cd server
docker-compose up -d
# Check server is running
curl http://localhost:8080/health

# Install client dependencies
pip install -r requirements.txt
# Process a video file
python main.py --server http://localhost:8080 --video laundry_pile.mp4
# Or use camera
python main.py --server http://localhost:8080 --camera 0

laundromat/
├── main.py # Client entry point
├── requirements.txt # Client dependencies (lightweight!)
├── src/laundromat/ # Core client library
│ ├── backends.py # Server communication
│ ├── config.py # Configuration
│ ├── tracking.py # Optical flow tracking
│ ├── video_processor.py # Main processing pipeline
│ └── visualization.py # Overlay rendering
├── server/ # Inference server
│ ├── Dockerfile
│ ├── docker-compose.yml
│ ├── app.py # FastAPI REST API
│ ├── inference_service.py
│ └── requirements.txt # Server dependencies (includes PyTorch)
└── web-client/ # Optional browser client
├── index.html
└── app.js
# Basic usage - server required
python main.py --server http://localhost:8080 --video input.mp4
# With camera
python main.py --server http://localhost:8080 --camera 0
# Options
#   --output      Output file (default: output.mp4)
#   --pairs       Number of pairs to detect
#   --refresh     Seconds between inference calls
#   --no-preview  Disable preview window
#   --no-record   Don't save output (camera only)
python main.py --server http://localhost:8080 --video input.mp4 \
    --output output.mp4 \
    --pairs 3 \
    --refresh 2.0 \
    --no-preview \
    --no-record

The server exposes a REST API:
- GET /health - Health check
- POST /infer - Run inference on a frame
  - Parameters: top_n_pairs, detection_prompt
  - Body: multipart form with frame (JPEG image)
  - Returns: JSON with masks (RLE encoded), boxes, labels, tracking points
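For example, a single frame can be submitted from Python with requests. Whether top_n_pairs and detection_prompt travel as query parameters or form fields is not spelled out here, so the query-parameter form and the prompt value "sock" below are assumptions:

```python
import requests

with open("frame.jpg", "rb") as f:
    resp = requests.post(
        "http://localhost:8080/infer",
        params={"top_n_pairs": 3, "detection_prompt": "sock"},  # assumed query parameters
        files={"frame": ("frame.jpg", f, "image/jpeg")},        # multipart JPEG frame
        timeout=30,
    )
resp.raise_for_status()
result = resp.json()
print(result["labels"])   # which detections were paired together
print(result["boxes"])    # bounding boxes for each detected sock
# result["masks"] holds RLE-encoded masks; decode before drawing overlays
```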
You can run the server on a remote machine with GPU:
# On the server machine
cd server
docker-compose up -d
# On the client machine
python main.py --server http://192.168.1.100:8080 --camera 0

A browser-based client is available at http://localhost:8080/client/ when the server is running. This allows using a phone camera directly.
- Segmentation: SAM3 segments all socks in the frame using text prompts
- Feature Extraction: ResNet18 extracts visual features from each sock
- Pair Matching: Cosine similarity finds the most similar pairs (see the sketch after this list)
- Tracking: Optical flow tracks socks between inference frames
- Visualization: Matching pairs are highlighted with colored overlays
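A minimal sketch of the matching step: given one embedding per detected sock, normalize, compute pairwise cosine similarities, and greedily keep the most similar disjoint pairs (the greedy strategy here is an assumption; Laundromat's matcher may differ):

```python
import numpy as np

def best_pairs(features: np.ndarray, top_n: int = 3) -> list[tuple[int, int, float]]:
    """features: (n_socks, d) array of per-sock embeddings. Returns (i, j, similarity)."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T              # cosine similarity matrix
    np.fill_diagonal(sim, -np.inf)       # a sock can't pair with itself
    candidates = sorted(
        ((i, j) for i in range(len(sim)) for j in range(i + 1, len(sim))),
        key=lambda ij: sim[ij],
        reverse=True,
    )
    pairs, used = [], set()
    for i, j in candidates:
        if i in used or j in used:
            continue
        pairs.append((i, j, float(sim[i, j])))
        used.update((i, j))
        if len(pairs) == top_n:
            break
    return pairs
```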
The client is lightweight - no PyTorch required:
- numpy, opencv-python, Pillow, requests
The server requires the full ML stack:
- PyTorch, ultralytics (SAM3), torchvision
# Build the server image
cd server
docker-compose build

The default ResNet50 features work well for general sock matching, but for better accuracy with similar-looking socks (e.g., distinguishing between multiple grey or white socks), you can train a custom projection head on your specific socks.
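For context, a projection head of this kind is usually a small MLP applied on top of the frozen backbone embedding and trained with a triplet loss; the layer sizes and loss setup below are illustrative, not Laundromat's exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectionHead(nn.Module):
    """Small MLP over frozen backbone features (sizes are illustrative)."""
    def __init__(self, in_dim: int = 2048, out_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(), nn.Linear(512, out_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # L2-normalize so cosine similarity can be used directly on the output
        return F.normalize(self.net(x), dim=-1)

# Trained with a triplet loss: anchor/positive are photos of the same sock,
# the negative is a different (ideally similar-looking) sock.
triplet_loss = nn.TripletMarginWithDistanceLoss(
    distance_function=lambda a, b: 1 - F.cosine_similarity(a, b),
    margin=0.3,
)
```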
Organize your training images like this:
testing/data/socks/
├── grey/
│ ├── sock6/
│ │ ├── photo1.jpg
│ │ ├── photo2.jpg
│ │ └── photo3.jpg
│ ├── sock7/
│ │ └── ...
│ └── ...
└── white/
├── sock1/
│ └── ...
└── ...
- Each sock gets its own folder with 3-5 photos from different angles
- Group similar-colored socks in parent folders (grey/, white/) for hard negative mining (see the sampling sketch after this list)
- Use JPEG images with the sock visible against a plain background
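To show how this layout supports hard negative mining, a sampling sketch (a hypothetical helper, not the project's dataset code) can draw the negative from the same color folder as the anchor:

```python
import random
from pathlib import Path

def sample_triplet(root: str = "testing/data/socks"):
    """Pick (anchor, positive, negative) image paths from the layout above.
    Hard negative: a different sock from the same color group as the anchor."""
    base = Path(root)
    color = random.choice([d for d in base.iterdir() if d.is_dir()])   # e.g. grey/
    socks = [d for d in color.iterdir() if d.is_dir()]                 # e.g. sock6/, sock7/
    anchor_sock = random.choice(socks)
    negative_sock = random.choice([s for s in socks if s != anchor_sock])
    anchor, positive = random.sample(list(anchor_sock.glob("*.jpg")), 2)
    negative = random.choice(list(negative_sock.glob("*.jpg")))
    return anchor, positive, negative
```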
# Basic training (uses default settings)
python -m src.laundromat.finetune.train \
--data testing/data/socks \
--output server/models/sock_projection_head.pt
# Advanced options
#   --epochs      Max training epochs (default: 100)
#   --patience    Early stopping patience (default: 20)
#   --lr          Learning rate (default: 0.001)
#   --margin      Triplet loss margin (default: 0.3)
#   --triplets    Triplets per epoch (default: 500)
#   --batch-size  Batch size (default: 16)
python -m src.laundromat.finetune.train \
    --data testing/data/socks \
    --output server/models/sock_projection_head.pt \
    --epochs 100 \
    --patience 20 \
    --lr 0.001 \
    --margin 0.3 \
    --triplets 500 \
    --batch-size 16

Training output:
- Model saved to server/models/sock_projection_head.pt
- Best pair accuracy printed at end (aim for >95%)
- Training uses MPS (Mac), CUDA (GPU), or CPU automatically
The trained projection head is automatically loaded when:
- The server starts (if server/models/sock_projection_head.pt exists)
- Tests run (via the projection_head fixture)
No code changes needed - just place the trained model in the right location.
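As an illustration, the check presumably amounts to something like the following (the exact loading code may differ):

```python
from pathlib import Path
import torch

head_path = Path("server/models/sock_projection_head.pt")
projection_head = None
if head_path.exists():
    # Use the trained head on top of backbone features when present;
    # otherwise fall back to the raw ResNet embeddings.
    projection_head = torch.load(head_path, map_location="cpu")
```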
- More photos per sock: 5-7 photos from different angles help
- Consistent lighting: Take photos in similar lighting conditions
- Include edge cases: Folded, stretched, inside-out views
- Balance categories: Similar number of socks per color group
License: MIT