Nikita Araslanov, Anna Sonnweber, Daniel Cremers
NeurIPS 2025 (Spotlight)
[Paper] | [Supplemental Material]
FlowFeat is a pixel-level feature representation learned from optical flow.
This repository contains the official implementation of our NeurIPS 2025 Spotlight paper.
It includes code for model training, pretrained checkpoints, and a demo notebook.
You can load and run FlowFeat directly via PyTorch Hub or from a local clone of this repository.
🔹 Load from PyTorch Hub
```python
import torch

# Load a pretrained FlowFeat model from PyTorch Hub
model = torch.hub.load(
    "tum-vision/flowfeat",
    "flowfeat",
    name="dinov2_vits14_yt",  # model variant
    pretrained=True,
)
model.eval()
```
🔹 Load from a local clone
```python
model = torch.hub.load(
    "./flowfeat",             # path to local repo clone
    "flowfeat",
    name="dinov2_vits14_yt",
    pretrained=True,
)
```

🔹 Supported model variants
- `dino_vits16_yt`
- `dino_vitb16_yt`
- `dino_vitb16_kt`
- `dinov2_vits14_yt`
- `dinov2_vitb14_yt`
- `dinov2_vitb14_kt`
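The hub entrypoints can also be discovered programmatically. Note that `torch.hub.list` enumerates the callables defined in the repository's `hubconf.py` (e.g., `flowfeat`), while the variants above are selected via the `name` argument:

```python
import torch

# Enumerate the entrypoints defined in the repo's hubconf.py.
# This lists callable entrypoints (e.g. "flowfeat"), not the
# variant names above, which are passed via the `name` argument.
print(torch.hub.list("tum-vision/flowfeat"))
```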
🔹 Example inference
```python
import torch

x = torch.randn(1, 3, 224, 224)  # example input
with torch.no_grad():
    y_enc, y_dec = model(x)

print(y_enc.shape)  # encoder features, e.g. (1, 384, 16, 16)
print(y_dec.shape)  # decoder features, e.g. (1, 128, 224, 224)
```
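To run inference on a real image rather than random noise, resize it to a multiple of the backbone's patch size and normalize it. A minimal sketch, assuming standard ImageNet normalization (check the repository's preprocessing for the exact statistics):

```python
import torch
from PIL import Image
from torchvision import transforms

# Hypothetical input image; replace with your own file.
img = Image.open("frame.jpg").convert("RGB")

# Assumption: 224x224 input (a multiple of the ViT-S/14 patch size)
# and standard ImageNet normalization statistics.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

x = preprocess(img).unsqueeze(0)  # (1, 3, 224, 224)
with torch.no_grad():
    y_enc, y_dec = model(x)
```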
| Model Name (`name`) | Backbone | Train Dataset | Feature Dim | Checkpoint |
|---|---|---|---|---|
| `dino_vits16_yt` | DINO ViT-S/16 | YouTube-VOS | 128 | Download |
| `dino_vitb16_yt` | DINO ViT-B/16 | YouTube-VOS | 128 | Download |
| `dino_vitb16_kt` | DINO ViT-B/16 | Kinetics | 128 | Download |
| `mae_vitb16_kt` | MAE ViT-B/16 | Kinetics | 128 | Download |
| `dinov2_vits14_yt` | DINOv2 ViT-S/14 | YouTube-VOS | 128 | Download |
| `dinov2_vitb14_yt` | DINOv2 ViT-B/14 | YouTube-VOS | 128 | Download |
| `dinov2_vitb14_kt` | DINOv2 ViT-B/14 | Kinetics | 128 | Download |
🔐 Note: Model weights are released under the same license as the codebase. Please cite the paper if you use these in your work.
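If you fetch a checkpoint manually from the table above, you can also instantiate the model without downloading weights and load the state dict yourself. A minimal sketch, assuming the entrypoint accepts `pretrained=False` and the file stores a plain state dict (the checkpoint path is hypothetical):

```python
import torch

# Assumption: pretrained=False skips the automatic weight download.
model = torch.hub.load("./flowfeat", "flowfeat",
                       name="dinov2_vits14_yt", pretrained=False)

# Assumption: the checkpoint is a plain state dict; adjust the key
# (e.g. state["model"]) if it is wrapped. The path is hypothetical.
state = torch.load("checkpoints/dinov2_vits14_yt.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()
```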
Fetch the repository with the submodules (optical flow networks):

```bash
git clone --recurse-submodules https://github.com/tum-vision/flowfeat.git
```

Create and activate a virtual environment, then install dependencies:

```bash
python -m venv flowfeat
source flowfeat/bin/activate  # on Windows: flowfeat\Scripts\activate
pip install -r requirements.txt
```
Follow the data setup instructions to link the required datasets (e.g., DAVIS-2017, YouTube-VOS).
Download snapshots of an optical flow network into models/.
For example, for RAFT pre-trained on Sintel:
```bash
mkdir -p models && wget -O models/raft-sintel.pth <URL-RAFT-Sintel>
```

Follow the download links here:
| Flow Network | Link |
|---|---|
| RAFT | README |
| SEA-RAFT | README |
| SMURF | README |
- Create a free wandb account and set up an entity and a project.
- Run
```bash
python train.py --config-name=ytvos.yaml run.wandb_entity="<your_entity>" run.wandb_project="<your_wandb_project>"
```

Logs, metrics, and checkpoints will be automatically uploaded to your wandb workspace. By default, we evaluate the model on a subset of videos from DAVIS-2017.
We provide a reference implementation of the attention probe in probe/attention.py.
For full evaluation and benchmarking, we used AnyProbe (coming soon).
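For intuition, an attention probe trains a small set of learnable queries that cross-attend to the frozen dense features and feed a linear classification head. The sketch below illustrates this general technique; it is not the code in `probe/attention.py`, and all hyperparameters are placeholders:

```python
import torch
import torch.nn as nn

class AttentionProbe(nn.Module):
    """Illustrative attentive probe: a learnable query pools frozen
    per-pixel features via cross-attention, and a linear head maps
    the pooled vector to class logits. Hyperparameters are assumptions."""

    def __init__(self, feat_dim=128, num_classes=10, num_heads=4):
        super().__init__()
        self.query = nn.Parameter(torch.randn(1, 1, feat_dim))
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, feats):                        # feats: (B, C, H, W)
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)    # (B, H*W, C)
        q = self.query.expand(b, -1, -1)             # (B, 1, C)
        pooled, _ = self.attn(q, tokens, tokens)     # (B, 1, C)
        return self.head(pooled.squeeze(1))          # (B, num_classes)

# Example: probe the 128-dim dense features produced by FlowFeat.
probe = AttentionProbe(feat_dim=128, num_classes=10)
logits = probe(torch.randn(2, 128, 224, 224))
```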
If you use this code or pretrained models, please cite our paper:
```bibtex
@inproceedings{Araslanov:2025:FlowFeat,
  author    = {Araslanov, Nikita and Sonnweber, Anna and Cremers, Daniel},
  title     = {{FlowFeat}: Pixel-Dense Embedding of Motion Profiles},
  booktitle = {NeurIPS},
  year      = {2025},
}
```
Acknowledgements: This work was supported by the ERC Advanced Grant SIMULACRON and the DFG project CR 250/26-1 "4D-YouTube". We thank the open-source community for tools such as PyTorch and NumPy that made this work possible.