Cosmos Reason 2 (CR2) is the system: one model, multiple safety reasoning tasks on video, delivered as structured JSON plus `<think>` reasoning traces.
This submission is scoped to a forklift-safety demo built from five short warehouse incident clips.
- Python 3.11+
- An OpenAI-compatible chat-completions endpoint serving `nvidia/Cosmos-Reason2-8B`
- This project was built/tested against a Nebius-managed vLLM deployment (see `.env.example`)
```
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
copy .env.example .env
```

Edit `.env` with your endpoint URL and API key.
This repo assumes an OpenAI-compatible endpoint that supports:

- `POST /v1/chat/completions`
- Multimodal user messages where `content` is a list containing:
  - `{"type":"video_url","video_url":{"url":"data:video/mp4;base64,..."}}`
  - `{"type":"text","text":"<prompt>"}`
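Such a request body can be assembled in a few lines. A minimal Python sketch (the helper name and the way the file is read are illustrative, not part of this repo's API; the content-part shapes match the bullets above):

```python
import base64
from pathlib import Path


def build_video_request(video_path: str, prompt: str,
                        model: str = "nvidia/Cosmos-Reason2-8B") -> dict:
    """Build an OpenAI-compatible chat-completions payload with an inline base64 video.

    The video part is placed before the text part (media-first ordering).
    """
    video_b64 = base64.b64encode(Path(video_path).read_bytes()).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {   # video first, embedded as a data: URI
                        "type": "video_url",
                        "video_url": {"url": f"data:video/mp4;base64,{video_b64}"},
                    },
                    {"type": "text", "text": prompt},  # prompt text second
                ],
            }
        ],
    }
```

The resulting dict can be POSTed as JSON to `/v1/chat/completions` with any HTTP client.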
We do not fine-tune CR2; everything here is prompt + pipeline orchestration.
Defaults are defined in `src/config.py` (and can be overridden via environment variables):

- `NEBIUS_VLLM_MODEL=nvidia/Cosmos-Reason2-8B`
- `NEBIUS_VLLM_TEMPERATURE=0.6`
- `NEBIUS_VLLM_TOP_P=0.95`
- `NEBIUS_VLLM_TOP_K=20`
- `NEBIUS_VLLM_MAX_TOKENS=1600`
- Multimodal sampling: `NEBIUS_VLLM_MM_FPS=6`, `NEBIUS_VLLM_DO_SAMPLE_FRAMES=true`
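These settings follow the usual environment-with-default pattern. A hypothetical sketch of how they could be resolved (the function name is illustrative; the real logic lives in `src/config.py` and may differ):

```python
import os


def load_sampling_config(env=os.environ) -> dict:
    """Resolve sampling settings from environment overrides, falling back to the defaults above."""
    return {
        "model": env.get("NEBIUS_VLLM_MODEL", "nvidia/Cosmos-Reason2-8B"),
        "temperature": float(env.get("NEBIUS_VLLM_TEMPERATURE", "0.6")),
        "top_p": float(env.get("NEBIUS_VLLM_TOP_P", "0.95")),
        "top_k": int(env.get("NEBIUS_VLLM_TOP_K", "20")),
        "max_tokens": int(env.get("NEBIUS_VLLM_MAX_TOKENS", "1600")),
        "mm_fps": int(env.get("NEBIUS_VLLM_MM_FPS", "6")),
        "do_sample_frames": env.get("NEBIUS_VLLM_DO_SAMPLE_FRAMES", "true").lower() == "true",
    }
```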
Analyze a single video:
```
python -m src.cli analyze --mode forklift --video "data/videos/forklift safety/VID1 A Forklift Accident Near Miss - Kyle Thill (240p, h264).mp4"
```

Modes: `forklift`, `load`, `safety`, `security`, `timeline`, `full`
Run the submission batch manifest:
```
python -m src.cli batch --manifest .\batch\batch_manifest_forklift_vid1_5.yaml --force
```

Render the exact request payload (offline; no Nebius calls):
```
python -m src.cli render --mode forklift --video "data/videos/forklift safety/VID1 A Forklift Accident Near Miss - Kyle Thill (240p, h264).mp4"
```

Parse a saved raw output into JSON + `<think>` (offline):
```
python -m src.cli parse --raw "outputs/forklift_vid1_5/per_stream/VID1 A Forklift Accident Near Miss - Kyle Thill (240p, h264).mp4__forklift.raw.txt"
```

Run evaluation (offline, against the hand-labeled clips included in `data/ground_truth/`):
```
python -m src.cli eval --results .\outputs\forklift_vid1_5\per_stream --ground-truth .\data\ground_truth --out .\outputs\forklift_vid1_5\eval_report.json
```

Generate a human-readable Markdown report (offline):
```
python -m src.cli report --results .\outputs\forklift_vid1_5\per_stream --out .\reports\runs\forklift_demo_report.md
```

Generate per-clip near-miss reports (offline; one report per passing video + a JSON manifest):
```
python -m src.cli near-miss --results .\outputs\forklift_vid1_5\per_stream --out-dir .\reports\runs\near_miss
```

Generate a self-contained HTML dashboard viewer (offline):
```
python -m src.cli dashboard --results .\outputs\forklift_vid1_5\per_stream --videos-dir "data/videos/forklift safety" --out .\reports\runs\ops_center_view.html --title "Cosmos SafetyNet — Forklift Demo"
```

Open the generated `*.html` file in your browser, then use the “Load local video” picker to attach the corresponding clip for click-to-seek timelines.
Run tests:
```
python -m unittest discover -s tests
```

We follow the NVIDIA Cosmos Reason prompt guide: media-first ordering plus the standard reasoning suffix appended to the user prompt.
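The `parse` command shown earlier splits a raw model response into its `<think>` trace and the structured JSON. A minimal sketch of that split, assuming the response is a `<think>...</think>` block followed by a single JSON object (the repo's actual parser may be more robust):

```python
import json
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_raw_output(raw: str) -> tuple[str, dict]:
    """Separate the <think> reasoning trace from the JSON payload in a raw response."""
    match = THINK_RE.search(raw)
    think = match.group(1).strip() if match else ""
    remainder = THINK_RE.sub("", raw, count=1)
    # Treat everything from the first "{" to the last "}" as the JSON candidate.
    start, end = remainder.find("{"), remainder.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in raw output")
    return think, json.loads(remainder[start:end + 1])
```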
Local videos are embedded as base64 `data:` URIs. To avoid silent request-size failures, we refuse to embed very large local files by default.

- Override with `NEBIUS_VLLM_MAX_VIDEO_MB` (default: 25)
- For long clips, create a short excerpt with `ffmpeg` and analyze that instead.
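A sketch of both steps: a size guard keyed to `NEBIUS_VLLM_MAX_VIDEO_MB`, and a stream-copy `ffmpeg` excerpt. The function names are illustrative, and this guard checks on-disk size (base64 encoding inflates the payload by roughly a third on top of that):

```python
import os
import subprocess

MAX_VIDEO_MB = float(os.environ.get("NEBIUS_VLLM_MAX_VIDEO_MB", "25"))


def check_embed_size(path: str, max_mb: float = MAX_VIDEO_MB) -> None:
    """Refuse to embed files whose on-disk size exceeds the configured limit."""
    size_mb = os.path.getsize(path) / (1024 * 1024)
    if size_mb > max_mb:
        raise ValueError(
            f"{path} is {size_mb:.1f} MB, above the {max_mb:g} MB embed limit; trim it first"
        )


def make_excerpt(src: str, dst: str, start: float = 0.0, duration: float = 20.0) -> None:
    """Cut a short excerpt with ffmpeg using stream copy (no re-encode)."""
    subprocess.run(
        ["ffmpeg", "-y", "-ss", str(start), "-t", str(duration), "-i", src, "-c", "copy", dst],
        check=True,
    )
```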
- `data/videos/forklift safety/` contains the five submission clips (`VID1` through `VID5`)
- `outputs/forklift_vid1_5/per_stream/` contains the saved JSON, raw, and think artifacts
- `reports/runs/ops_center_view.html` is the self-contained dashboard used for review
- `submission/VIDEO_SOURCES.md` lists the source citations for each included video