High-performance automated fancam generator. It takes a standard landscape video and a reference photo of a person (say, your bias), tracks them, and generates a stabilized, vertical (9:16) cropped video locked onto them.
It features a modular Rust core for high-speed video processing, a CLI for batch operations, and a modern Tauri v2 desktop application for easy usage.
- Person Detection: Uses YOLOv8-Nano via ONNX Runtime for fast, accurate person detection.
- Identity Locking: Uses ArcFace (cosine similarity) to distinguish the specific target person from others in the frame.
  - Uses your configured `--threshold` value end-to-end (CLI + GUI).
  - Adds relock bias from last known position and adaptive recognition stride for better stability under occlusion.
- Identity Discovery Pass (GUI):
  - Scans sampled frames first and proposes member thumbnails before tracking begins.
  - Supports expected-member-count input and automatic informed rescan when duplicates or a count mismatch are detected.
  - Adds manual validation controls (exclude, duplicate resolve, low-confidence confirm) before enabling render.
  - Lets you choose a target member card; the selected anchor is used as an extra tracking prior alongside the bias image.
- Cinematic Smoothing: Implements a 2D Kalman Filter to smooth camera movements, preventing jittery tracking and simulating a professional camera operator.
- Performance-First Pipeline:
  - 3-thread decode/inference/encode pipeline with bounded channels.
  - Recognition throttling before and after lock-on to avoid CPU stalls (adaptive while locked).
  - Caps ArcFace identity checks to top-confidence person candidates per frame.
  - Speeds up large-video processing with detection downscale, parallel tensor prep, and fast SIMD face preprocessing.
  - Reuses rendering buffers/resizer state to reduce per-frame allocations.
  - Emits periodic per-stage timing logs (`detect`, `identify`, `render`) for targeted profiling.
- Smart Rendering:
  - Automated 1080x1920 cropping.
  - SIMD-accelerated resize path (`fast_image_resize`) for crop and letterbox operations.
  - Lanczos3 upscaling for distant subjects.
  - Fallback letterboxing when the target is lost or occluded.
- Cross-Platform: Runs on Windows, macOS, and Linux.
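The bounded-channel design from the Performance-First Pipeline list can be sketched with `std::sync::mpsc::sync_channel`. The stage bodies below are placeholders standing in for the real decode/inference/encode work, and the `Frame`/`Detection` types are made up for illustration:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

// Hypothetical stand-ins for decoded frames and per-frame detection results.
type Frame = u32;
type Detection = (u32, bool);

fn run_pipeline(total_frames: u32) -> Vec<Detection> {
    // Bounded channels apply backpressure: a fast decoder blocks on `send`
    // instead of buffering an unbounded number of frames in memory.
    let (decode_tx, decode_rx) = sync_channel::<Frame>(4);
    let (detect_tx, detect_rx) = sync_channel::<Detection>(4);

    let decoder = thread::spawn(move || {
        for f in 0..total_frames {
            decode_tx.send(f).unwrap(); // blocks when the queue is full
        }
        // decode_tx drops here, which ends the detector's loop
    });

    let detector = thread::spawn(move || {
        for frame in decode_rx {
            // Placeholder "inference": mark every other frame as a hit.
            detect_tx.send((frame, frame % 2 == 0)).unwrap();
        }
    });

    // The "encode" stage runs on the calling thread.
    let out: Vec<Detection> = detect_rx.into_iter().collect();
    decoder.join().unwrap();
    detector.join().unwrap();
    out
}

fn main() {
    let results = run_pipeline(8);
    println!("processed {} frames", results.len());
}
```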
The project is organized as a Cargo workspace:
- `fancam-core/`: The engine. Handles FFmpeg transcoding, ONNX inference, Kalman tracking, and image processing.
- `cli/`: A command-line interface wrapper for the core engine.
- `src-tauri/` & `ui/`: The desktop application built with Tauri 2 and Svelte 5.
- Decode: FFmpeg decodes the video stream into RGB frames.
- Detect: YOLOv8 runs inference on the frame to find all "Person" bounding boxes.
- Identify: The system crops faces from top-confidence person boxes and compares their embeddings against the reference "bias" image using ArcFace.
- Track:
  - If the target is found, the Kalman filter updates position and velocity.
  - Recognition runs at a stride (before and after lock-on) to reduce CPU load.
  - If occluded, the filter predicts the position based on previous momentum.
- Render: The frame is cropped to the smoothed coordinates and re-encoded to H.264.
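The Identify step's embedding comparison boils down to cosine similarity between the candidate face embedding and the bias-image embedding, gated by `--threshold`. A minimal sketch (the 3-element vectors here are toy values, not real 512-dim ArcFace embeddings):

```rust
/// Cosine similarity between two face embeddings.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// A candidate matches when its similarity to the reference clears the threshold.
fn is_target(reference: &[f32], candidate: &[f32], threshold: f32) -> bool {
    cosine_similarity(reference, candidate) >= threshold
}

fn main() {
    let reference = [1.0, 0.0, 0.0];
    let same_person = [0.9, 0.1, 0.0]; // nearly parallel -> high similarity
    let stranger = [0.0, 1.0, 0.0];    // orthogonal -> similarity 0
    println!("{}", is_target(&reference, &same_person, 0.6));
    println!("{}", is_target(&reference, &stranger, 0.6));
}
```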
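The Track step's smoothing can be illustrated with a minimal 1D constant-velocity Kalman filter (the real tracker is 2D; running one of these per axis is equivalent in spirit). The process/measurement noise values `q` and `r` are illustrative, not the project's tuning:

```rust
/// Minimal 1D constant-velocity Kalman filter (position + velocity).
struct Kalman1D {
    x: [f64; 2],      // state: [position, velocity]
    p: [[f64; 2]; 2], // state covariance
    q: f64,           // process noise (illustrative)
    r: f64,           // measurement noise (illustrative)
}

impl Kalman1D {
    fn new(pos: f64) -> Self {
        Kalman1D { x: [pos, 0.0], p: [[1.0, 0.0], [0.0, 1.0]], q: 0.01, r: 4.0 }
    }

    /// Predict: advance the state by dt under constant velocity.
    /// Called every frame, even while the target is occluded.
    fn predict(&mut self, dt: f64) {
        self.x[0] += self.x[1] * dt;
        // P = F P F^T + Q, with F = [[1, dt], [0, 1]]
        let (p00, p01, p10, p11) = (self.p[0][0], self.p[0][1], self.p[1][0], self.p[1][1]);
        self.p[0][0] = p00 + dt * (p01 + p10) + dt * dt * p11 + self.q;
        self.p[0][1] = p01 + dt * p11;
        self.p[1][0] = p10 + dt * p11;
        self.p[1][1] = p11 + self.q;
    }

    /// Update: fold in a measured position when the target is detected.
    fn update(&mut self, z: f64) {
        let s = self.p[0][0] + self.r;                // innovation covariance
        let k = [self.p[0][0] / s, self.p[1][0] / s]; // Kalman gain
        let y = z - self.x[0];                        // innovation
        self.x[0] += k[0] * y;
        self.x[1] += k[1] * y;
        let (p00, p01) = (self.p[0][0], self.p[0][1]);
        self.p[0][0] -= k[0] * p00;
        self.p[0][1] -= k[0] * p01;
        self.p[1][0] -= k[1] * p00;
        self.p[1][1] -= k[1] * p01;
    }
}

fn main() {
    let mut kf = Kalman1D::new(0.0);
    // Target drifting right ~10 px/frame, with noisy detections.
    for z in [10.2, 19.7, 30.1, 40.0] {
        kf.predict(1.0);
        kf.update(z);
    }
    // During occlusion we keep predicting from the learned momentum.
    kf.predict(1.0);
    println!("predicted x = {:.1}", kf.x[0]);
}
```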
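The Render step's crop geometry can be sketched as a subject-centered 9:16 window clamped to the source frame (the window is then resized to 1080x1920). The function name and rounding details here are assumptions for illustration, not the project's code:

```rust
/// Compute a 9:16 crop window centered on the tracked subject (cx, cy),
/// clamped so the window never leaves the source frame.
fn crop_window(src_w: u32, src_h: u32, cx: f32, cy: f32) -> (u32, u32, u32, u32) {
    // Widest 9:16 window that fits inside the source height.
    let crop_h = src_h;
    let crop_w = (src_h as f32 * 9.0 / 16.0).round() as u32;
    // Center on the subject, then clamp to keep the window in-frame.
    let max_x = src_w.saturating_sub(crop_w);
    let x = (cx - crop_w as f32 / 2.0).round().clamp(0.0, max_x as f32) as u32;
    let max_y = src_h - crop_h; // zero here; kept for symmetry
    let y = (cy - crop_h as f32 / 2.0).round().clamp(0.0, max_y as f32) as u32;
    (x, y, crop_w, crop_h)
}

fn main() {
    // 4K landscape frame, subject near the right edge: the window clamps
    // to the frame boundary instead of sliding past it.
    let (x, y, w, h) = crop_window(3840, 2160, 3700.0, 1080.0);
    println!("crop: x={x} y={y} {w}x{h}");
}
```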
To build and run this project, you need:
- Rust: A stable toolchain (e.g. installed via rustup).
- Node.js: Required for the UI build steps.
- FFmpeg Libraries: The project links against FFmpeg native libraries.
  - Ubuntu/Debian: `sudo apt install libavutil-dev libavformat-dev libavcodec-dev libswscale-dev`
  - macOS: `brew install ffmpeg`
  - Windows: Set the `FFMPEG_DIR` environment variable to your FFmpeg shared build.
This project requests CoreML execution when available.
- The default lookup path is `models/onnxruntime/lib/libonnxruntime.dylib`.
- If CoreML is unavailable in your local ONNX Runtime build, inference falls back to CPU (works, but much slower on 4K inputs).
- For best Apple Silicon performance, use an official ONNX Runtime macOS build that includes CoreML support.
1. Clone the repository:

   ```sh
   git clone https://github.com/your-username/focus-lock-rs.git
   cd focus-lock-rs
   ```

2. Download Models: Create a `models/` directory in the root and download the following ONNX models:
   - `yolov8n.onnx` (YOLOv8 Nano)
   - `w600k_mbf.onnx` (MobileFaceNet / ArcFace)

3. Build the CLI:

   ```sh
   cargo build --release -p cli
   ```
The GUI allows you to select files via drag-and-drop and visualize progress.
The recommended GUI flow is now:
1. Select video + bias reference + output.
2. Run Identity Discovery and optionally set the expected member count.
3. Select the proposed member thumbnail to track.
4. Render fancam (blocked until identity review has no unresolved warnings).
1. Install frontend dependencies:

   ```sh
   cd ui
   npm install
   ```

2. Run in Development Mode:

   ```sh
   npm run tauri:dev
   ```

   For near-production processing speed while iterating on the UI, use:

   ```sh
   npm run tauri:dev:release
   ```

3. Build for Production:

   ```sh
   npm run tauri:build
   ```

   The executable will be located in `src-tauri/target/release/bundle/`.
The CLI provides direct access to the pipeline phases.
The primary command. It performs detection, identification, tracking, and rendering in one pass.
```sh
cargo run --release -p cli -- fancam \
  --video "/path/to/concert.mp4" \
  --bias "/path/to/face_photo.jpg" \
  --output "output_fancam.mp4" \
  --yolo-model "models/yolov8n.onnx" \
  --face-model "models/w600k_mbf.onnx" \
  --threshold 0.6
```
- Smoke Test (Grayscale): Verifies FFmpeg linkage and basic video I/O.

  ```sh
  cargo run -p cli -- gray --input video.mp4 --output gray.mp4
  ```

- Debug Detection: Draws bounding boxes around all detected people without cropping.

  ```sh
  cargo run -p cli -- detect --input video.mp4 --output boxes.mp4
  ```
Contributions are welcome! Please format and test your changes before opening a PR:
```sh
cargo fmt
cargo test
```

