AI-Powered Optimal Palletizing with Physical Reasoning — powered by NVIDIA Cosmos Reason2 and MuJoCo physics simulation.
Evaluates 10 industrial stacking patterns through physics simulation (shake & tilt tests), then uses a 3-round VLM tournament where Cosmos Reason2 designs progressively harder tests to find the optimal pattern — and explains why.
Doosan Robotics is a global leader in collaborative robotics, operating across 45+ countries. Recognized with the CES 2026 Best of Innovation Award (AI category), Doosan Robotics is an NVIDIA strategic partner with an existing cuMotion collaboration — this project extends that partnership into AI-powered palletizing with Cosmos Reason2.
The global palletizing market exceeds $150 billion, yet 23% of warehouse injuries involve falling pallets. Fixed stacking rules can't adapt to diverse box sizes or off-center weight. Without physics validation, failures are discovered on the warehouse floor — not in simulation.
| Phase | What Happens |
|---|---|
| Think | Cosmos Reason2 analyzes box physics across 10 candidate patterns |
| Verify | MuJoCo simulates shake & tilt stress tests with iterative tournament rounds |
| Act | The verified winning pattern is ready for execution with full traceability |
```
GENERATE → TOURNAMENT (3 rounds) → EXPLAIN
     ↕ each round: SIMULATE + Reasoning VLM TUNE
```
| Stage | Description |
|---|---|
| Generate | Deterministic coordinate generation for 10 stacking patterns (<10ms) |
| Tournament | 3-round elimination: R1 (10→5), R2 (5→2), R3 (2→1). Each round runs MuJoCo shake + tilt tests. From R2, Cosmos Reason2 analyzes score discriminability and proposes updated physics parameters for the next round |
| Explain | Cosmos Reason2 generates an engineering rationale for the winner with transport condition analysis |
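The three-round bracket above can be sketched as a simple elimination loop. This is a hypothetical illustration only: `physics_score`, `vlm_tune_parameters`, and the parameter names are placeholder stand-ins, not the project's actual API.

```python
def physics_score(pattern, params):
    """Placeholder: the real system scores via MuJoCo shake + tilt runs."""
    _name, stability, space = pattern
    return 0.5 * stability + 0.5 * space  # physics-only 50/50 weighting

def vlm_tune_parameters(scores, params):
    """Placeholder: the real system asks Cosmos Reason2 for harder tests."""
    return {**params, "shake_amplitude": params["shake_amplitude"] * 1.5}

def run_tournament(patterns, rounds=((10, 5), (5, 2), (2, 1))):
    """3-round elimination: R1 10→5, R2 5→2, R3 2→1, physics-only scoring."""
    params = {"shake_amplitude": 0.02, "tilt_angle_deg": 15.0}  # assumed values
    survivors = list(patterns)
    for round_no, (n_in, n_out) in enumerate(rounds, start=1):
        assert len(survivors) == n_in, "round sizes must match the bracket"
        ranked = sorted(survivors,
                        key=lambda p: physics_score(p, params),
                        reverse=True)
        survivors = ranked[:n_out]
        if round_no >= 2:
            # From R2, the VLM inspects score discriminability and proposes
            # updated physics parameters for the next round.
            scores = {p[0]: physics_score(p, params) for p in survivors}
            params = vlm_tune_parameters(scores, params)
    return survivors[0]  # winner is handed to the EXPLAIN stage
```

The key design point mirrored here: scoring and elimination are pure physics, while the VLM only adjusts the test conditions between rounds.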
All elimination decisions use physics-only scoring:
| Component | Weight |
|---|---|
| Stability | 50% |
| Space | 50% |
The reasoning VLM does not score patterns. It only tunes physics parameters between rounds (TUNE) and generates the final engineering rationale (EXPLAIN).
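A minimal sketch of the physics-only composite score, using the 50/50 weights from the table above. The assumption that both inputs are normalized to [0, 1] is mine, not stated in the source:

```python
STABILITY_WEIGHT = 0.5  # weights from the scoring table above
SPACE_WEIGHT = 0.5

def composite_score(stability: float, space_utilization: float) -> float:
    """Physics-only elimination score; the VLM never contributes to it.

    Both inputs are assumed normalized to [0, 1]: stability from MuJoCo
    shake/tilt displacement, space utilization from pallet footprint coverage.
    """
    return STABILITY_WEIGHT * stability + SPACE_WEIGHT * space_utilization
```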
- 4-tier auto-detection: HuggingFace → Custom API → GPU (vLLM) → Ollama fallback
- No API key required for Ollama backend — runs fully local
- VLM Coach TUNE: Cosmos doesn't just judge patterns — it designs harder tests each round
- iso_view renders: Alternating-color isometric renders optimized for VLM visual reasoning
Column Stack · Full Interlock · Partial Interlock · Brick · Pinwheel · Hybrid Pinwheel · Split Row · Split Block · Row · Spiral
12 pallet presets (EUR, GMA, Asia, AU, NA standards) → Pallet Standards Reference
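An illustrative subset of such presets. The dimensions below are the published standards; the dictionary shape and field names are assumed, not the project's actual schema:

```python
# Illustrative pallet presets (mm). Dimensions are the published standards;
# the keys and field names here are assumptions, not the project's schema.
PALLET_PRESETS = {
    "EUR": {"length_mm": 1200, "width_mm": 800},   # EUR/EPAL standard
    "GMA": {"length_mm": 1219, "width_mm": 1016},  # North American 48x40 in
    "AU":  {"length_mm": 1165, "width_mm": 1165},  # Australian standard
}

def pallet_footprint_m2(preset: str) -> float:
    """Pallet footprint in square metres for a named preset."""
    p = PALLET_PRESETS[preset]
    return p["length_mm"] * p["width_mm"] / 1e6
```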
| Metric | Value |
|---|---|
| Winner | Full Interlock (94.2 confidence) |
| Space Utilization | 87% (vs 71% for Row) |
| Patterns Evaluated | 10 deterministic patterns |
| Tournament Rounds | 3 adaptive rounds |
| Test Suite | 561+ tests, zero mocks |
| Pattern Generation | under 10ms |
Left: Column Stack | Right: Pinwheel — MuJoCo pallet shake simulation
🎬 Watch on YouTube — 2:30 min demo of the full pipeline
```bash
# 1. Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh && source ~/.bashrc

# 2. Clone & install
git clone https://github.com/doosan-robotics/palletizing-ai.git
cd palletizing-ai
uv python install 3.12   # required if your system has Python < 3.12
uv sync                  # base dependencies
uv sync --extra local    # + HuggingFace/PyTorch for local AI inference

# 3. HuggingFace login (Cosmos Reason2 is a gated model)
# Get token: https://huggingface.co/settings/tokens
# Request access: https://huggingface.co/nvidia/Cosmos-Reason2-8B
uv run huggingface-cli login --token YOUR_HF_TOKEN

# 4. Configure environment
cp .env.example .env
# Edit .env → set COSMOS_BACKEND=huggingface and HF_TOKEN=your_token

# 5. Launch
uv run python scripts/run_app.py
```

The dashboard runs without AI (simulation-only mode). AI features activate on first use (~16 GB download).
| Tab | Purpose |
|---|---|
| 3D View | Configure box/pallet, preview patterns, run simulation and AI pipeline |
| AI Analysis | Pipeline progress, Reasoning VLM render gallery, tournament results, explanation |
| Comparison | Side-by-side scoring of all 10 patterns |
Key actions: Render (3D preview) · Run Simulation (shake & tilt test) · Run All + AI (full pipeline: Generate → Tournament → Explain)
```bash
uv run pytest tests/ -v                 # all tests
uv run pytest tests/ -v -m "not slow"   # skip MuJoCo-heavy tests
uv run ruff check src/ tests/ scripts/
uv run ruff format src/ tests/ scripts/
```

| Problem | Solution |
|---|---|
| `uv: command not found` | `curl -LsSf https://astral.sh/uv/install.sh \| sh && source ~/.bashrc` |
| `requires-python >=3.12` | `uv python install 3.12` (Ubuntu 22.04 default is 3.10) |
| `401 Unauthorized` loading model | `uv run huggingface-cli login` + request access at the model page |
| `ModuleNotFoundError: torch` | `uv sync --extra local` |
| CUDA out of memory | Use Cosmos-Reason2-2B or set `COSMOS_BASE_URL` to a remote vLLM server |
| 3D scene is black | Enable hardware acceleration in Chrome/Firefox |
| macOS (Apple Silicon) | Set PYTORCH_ENABLE_MPS_FALLBACK=1 in .env; AI requires Ollama or a remote vLLM server |
| No GPU or HuggingFace access | AI backend auto-falls back to Ollama (localhost:11434). To use a remote server instead, set COSMOS_BASE_URL in .env |
Team Doosan Umanoide — Doosan Robotics
NVIDIA Cosmos Cookoff 2026
Apache 2.0


