focus-lock-rs

High-performance automated fancam generator. It takes a standard landscape video and a reference photo of a person (say, your bias), tracks them, and generates a stabilized, vertical (9:16) cropped video locked onto them.

It features a modular Rust core for high-speed video processing, a CLI for batch operations, and a Tauri v2 desktop application for interactive use.

Features

Person Detection: Uses YOLOv8-Nano via ONNX Runtime for fast, accurate person detection.
Identity Locking: Uses ArcFace (cosine similarity) to distinguish the specific target person from others in the frame.
- Uses your configured --threshold value end-to-end (CLI + GUI).
- Adds relock bias from last known position and adaptive recognition stride for better stability under occlusion.
Identity Discovery Pass (GUI):
- Scans sampled frames first and proposes member thumbnails before tracking begins.
- Supports expected-member-count input and automatic informed rescan when duplicates/count mismatch are detected.
- Adds manual validation controls (exclude, duplicate resolve, low-confidence confirm) before enabling render.
- Lets you choose a target member card; the selected anchor is used as an extra tracking prior alongside the bias image.
Cinematic Smoothing: Implements a 2D Kalman Filter to smooth camera movements, preventing jittery tracking and simulating a professional camera operator.
Performance-First Pipeline:
- 3-thread decode/inference/encode pipeline with bounded channels.
- Recognition throttling before and after lock-on to avoid CPU stalls (adaptive while locked).
- Caps ArcFace identity checks to top-confidence person candidates per frame.
- Speeds up large-video processing with detection downscale, parallel tensor prep, and fast SIMD face preprocessing.
- Reuses rendering buffers/resizer state to reduce per-frame allocations.
- Emits periodic per-stage timing logs (detect, identify, render) for targeted profiling.
Smart Rendering:
- Automated 1080x1920 cropping.
- SIMD-accelerated resize path (fast_image_resize) for crop and letterbox operations.
- Lanczos3 upscaling for distant subjects.
- Fallback letterboxing when the target is lost/occluded.
Cross-Platform: Runs on Windows, macOS, and Linux.
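
As a rough illustration of the Smart Rendering crop step listed above, the sketch below computes a 9:16 window around a tracked centre point and clamps it to the frame. It is not taken from fancam-core: `CropRect` and `portrait_crop` are hypothetical names, and using the full frame height is an assumption (the real engine presumably sizes the window from the subject before scaling to 1080x1920).

```rust
/// Axis-aligned crop rectangle in source-frame pixels (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CropRect {
    pub x: u32,
    pub y: u32,
    pub width: u32,
    pub height: u32,
}

/// Largest 9:16 window that fits in the frame, centred on (cx, cy)
/// and clamped so it never leaves the frame.
/// Returns None when the frame is too small to hold any window.
pub fn portrait_crop(frame_w: u32, frame_h: u32, cx: f32, cy: f32) -> Option<CropRect> {
    // Assumption: start from the full frame height and derive the width.
    let mut height = frame_h;
    let mut width = (u64::from(height) * 9 / 16) as u32;
    if width > frame_w {
        // Frame is narrower than 9:16, so the width becomes the limit.
        width = frame_w;
        height = ((u64::from(width) * 16 / 9) as u32).min(frame_h);
    }
    // Round down to even sizes, which 4:2:0 encoders generally require.
    width &= !1;
    height &= !1;
    if width == 0 || height == 0 {
        return None;
    }
    // Clamp the top-left corner so the window stays inside the frame.
    let max_x = (frame_w - width) as f32;
    let max_y = (frame_h - height) as f32;
    let x = (cx - width as f32 / 2.0).clamp(0.0, max_x).round() as u32;
    let y = (cy - height as f32 / 2.0).clamp(0.0, max_y).round() as u32;
    Some(CropRect { x, y, width, height })
}
```

For a 1920x1080 source this yields a 606x1080 window that slides horizontally with the subject and stops at the frame edges instead of sampling outside the image.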

Architecture

The project is organized as a Cargo workspace:

fancam-core/: The engine. Handles FFmpeg transcoding, ONNX inference, Kalman tracking, and image processing.
cli/: A command-line interface wrapper for the core engine.
src-tauri/ & ui/: The Desktop application built with Tauri 2 and Svelte 5.
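
To give a feel for how an engine like fancam-core/ might chain its decode, inference, and encode stages with bounded channels, here is a minimal standard-library sketch. The function name `run_pipeline`, the byte-vector "frames", and the bright-pixel count standing in for inference are all assumptions for illustration; the real stages wrap FFmpeg and ONNX Runtime.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Push `frames` through three threads connected by bounded channels.
/// `capacity` limits how many items may queue between stages, so a slow
/// stage applies back-pressure instead of letting memory grow unbounded.
pub fn run_pipeline(frames: Vec<Vec<u8>>, capacity: usize) -> Vec<usize> {
    let (decoded_tx, decoded_rx) = sync_channel::<Vec<u8>>(capacity);
    let (inferred_tx, inferred_rx) = sync_channel::<(Vec<u8>, usize)>(capacity);

    // Stage 1: stand-in for decoding; forwards pre-made frames.
    let decoder = thread::spawn(move || {
        for frame in frames {
            if decoded_tx.send(frame).is_err() {
                break; // downstream hung up
            }
        }
        // decoded_tx is dropped here, which closes the channel.
    });

    // Stage 2: stand-in for inference; counts pixels brighter than 127.
    let inference = thread::spawn(move || {
        for frame in decoded_rx {
            let bright = frame.iter().filter(|&&p| p > 127).count();
            if inferred_tx.send((frame, bright)).is_err() {
                break;
            }
        }
    });

    // Stage 3: stand-in for encoding; collects per-frame results in order.
    let encoder = thread::spawn(move || {
        inferred_rx
            .into_iter()
            .map(|(_frame, bright)| bright)
            .collect::<Vec<usize>>()
    });

    decoder.join().expect("decode thread panicked");
    inference.join().expect("inference thread panicked");
    encoder.join().expect("encode thread panicked")
}
```

Because each stage owns its sender, dropping it at the end of the loop is what signals "end of stream" to the next stage; no explicit sentinel frame is needed.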

Logic Flow

Decode: FFmpeg decodes the video stream into RGB frames.
Detect: YOLOv8 runs inference on the frame to find all "Person" bounding boxes.
Identify: The system crops faces from top-confidence person boxes and compares their embeddings against the reference "bias" image using ArcFace.
Track:
- If the target is found, the Kalman filter updates position and velocity.
- Recognition runs at a stride (before and after lock-on) to reduce CPU load.
- If occluded, the filter predicts the position based on previous momentum.
Render: The frame is cropped to the smoothed coordinates and re-encoded to H.264.
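
The Track step above can be sketched with a small constant-velocity Kalman filter. This is an illustrative version only: it assumes the 2D filter can be treated as two independent per-axis filters, uses a simplified diagonal process noise, and the names `AxisKalman` and `CenterTracker` are hypothetical rather than the project's actual types.

```rust
/// Constant-velocity Kalman filter for one axis (hypothetical sketch).
#[derive(Debug, Clone, Copy)]
pub struct AxisKalman {
    pub pos: f64,
    pub vel: f64,
    p: [[f64; 2]; 2], // state covariance
    q: f64,           // process noise (simplified: added to the diagonal)
    r: f64,           // measurement noise variance
}

impl AxisKalman {
    pub fn new(initial_pos: f64, q: f64, r: f64) -> Self {
        Self { pos: initial_pos, vel: 0.0, p: [[1.0, 0.0], [0.0, 1.0]], q, r }
    }

    /// Predict: advance the state by `dt` and grow the uncertainty.
    pub fn predict(&mut self, dt: f64) {
        self.pos += self.vel * dt;
        let [[p00, p01], [p10, p11]] = self.p;
        // P = F P F^T + Q with F = [[1, dt], [0, 1]].
        self.p = [
            [p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ];
    }

    /// Update: blend in a measured position (H = [1, 0]).
    pub fn update(&mut self, measured: f64) {
        let [[p00, p01], [p10, p11]] = self.p;
        let innovation = measured - self.pos;
        let s = p00 + self.r;
        let (k0, k1) = (p00 / s, p10 / s);
        self.pos += k0 * innovation;
        self.vel += k1 * innovation;
        // P = (I - K H) P
        self.p = [
            [(1.0 - k0) * p00, (1.0 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ];
    }
}

/// Smooths the crop centre; coasts on velocity when the target is lost.
pub struct CenterTracker {
    x: AxisKalman,
    y: AxisKalman,
}

impl CenterTracker {
    pub fn new(cx: f64, cy: f64, q: f64, r: f64) -> Self {
        Self { x: AxisKalman::new(cx, q, r), y: AxisKalman::new(cy, q, r) }
    }

    /// Advance one frame. Pass `None` when the target is occluded.
    pub fn step(&mut self, dt: f64, detection: Option<(f64, f64)>) -> (f64, f64) {
        self.x.predict(dt);
        self.y.predict(dt);
        if let Some((mx, my)) = detection {
            self.x.update(mx);
            self.y.update(my);
        }
        (self.x.pos, self.y.pos)
    }
}
```

Calling `step` with `None` runs only the predict half, which is what lets the crop keep drifting along the last estimated velocity during an occlusion. Raising `r` relative to `q` trades responsiveness for smoother camera motion.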

Prerequisites

To build and run this project, you need:

Rust: Stable toolchain (install via rustup).
Node.js: Required for the UI build steps.
FFmpeg Libraries: The project links against FFmpeg native libraries.
- Ubuntu/Debian: sudo apt install libavutil-dev libavformat-dev libavcodec-dev libswscale-dev
- macOS: brew install ffmpeg
- Windows: Set FFMPEG_DIR environment variable to your FFmpeg shared build.

ONNX Runtime provider note (macOS)

This project requests CoreML execution when available.

The default lookup path for the ONNX Runtime library is models/onnxruntime/lib/libonnxruntime.dylib.
If CoreML is unavailable in your local ONNX Runtime build, inference falls back to CPU (works, but much slower on 4K inputs).
For best Apple Silicon performance, use an official ONNX Runtime macOS build that includes CoreML support.

Installation & Setup

Clone the repository:

```
git clone https://github.com/your-username/focus-lock-rs.git
cd focus-lock-rs
```

Download Models: Create a models/ directory in the root and download the following ONNX models:
- yolov8n.onnx (YOLOv8 Nano)
- w600k_mbf.onnx (MobileFaceNet / ArcFace)
Build the CLI:
```
cargo build --release -p cli
```

Desktop Application (GUI)

The GUI allows you to select files via drag-and-drop and visualize progress.

The recommended GUI flow is:

Select video + bias reference + output.
Run Identity Discovery and optionally set the expected member count.
Select the proposed member thumbnail to track.
Render fancam (blocked until identity review has no unresolved warnings).

Install frontend dependencies:
```
cd ui
npm install
```
Run in Development Mode:
```
npm run tauri:dev
```
For near-production processing speed while iterating UI, use:
```
npm run tauri:dev:release
```
Build for Production:
```
npm run tauri:build
```
The executable will be located in src-tauri/target/release/bundle/.

CLI Usage

The CLI provides direct access to the pipeline phases.

Generate a Fancam

The primary command. It performs detection, identification, tracking, and rendering in one pass.

```
cargo run --release -p cli -- fancam \
  --video "/path/to/concert.mp4" \
  --bias "/path/to/face_photo.jpg" \
  --output "output_fancam.mp4" \
  --yolo-model "models/yolov8n.onnx" \
  --face-model "models/w600k_mbf.onnx" \
  --threshold 0.6
```
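
To make the role of --threshold concrete, here is a minimal sketch of a cosine-similarity identity check between a reference embedding and per-frame candidates. The function names are hypothetical, and treating a score equal to the threshold as a match is an assumption; the actual comparison in fancam-core may differ.

```rust
/// Cosine similarity between two embeddings.
/// Returns None if the lengths differ, the input is empty, or a norm is zero.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Index and score of the candidate most similar to `reference`,
/// considering only candidates that reach `threshold` (assumed inclusive).
pub fn best_match(
    reference: &[f32],
    candidates: &[Vec<f32>],
    threshold: f32,
) -> Option<(usize, f32)> {
    candidates
        .iter()
        .enumerate()
        .filter_map(|(i, c)| cosine_similarity(reference, c).map(|s| (i, s)))
        .filter(|&(_, s)| s >= threshold)
        .max_by(|a, b| a.1.total_cmp(&b.1))
}
```

Under this reading, a higher threshold makes the lock stricter (fewer false matches, more frames where the target counts as lost), while a lower one keeps the lock through harder angles at the risk of jumping to a lookalike.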

Other Commands

Smoke Test (Grayscale): Verifies FFmpeg linkage and basic video I/O.
```
cargo run -p cli -- gray --input video.mp4 --output gray.mp4
```
Debug Detection: Draws bounding boxes around all detected people without cropping.
```
cargo run -p cli -- detect --input video.mp4 --output boxes.mp4
```

Contributing

Contributions are welcome! Install rustfmt and gimme your PRs.

```
cargo fmt
cargo test
```

wheevu/focus-lock-rs
