
Request for Pretrained Checkpoints & Inference Pipeline Details #2

@SatoshiZKMO

Description

Hello, and thank you for sharing this impressive work on the Dexterous World Model (DWM).
The idea of action-conditioned video diffusion for interactive digital twins is very exciting.

I have a few questions and a small feature request that I believe could benefit the community:

**Pretrained Models**

- Are there plans to release pretrained checkpoints for DWM?
- If so, will they include models trained on both the synthetic egocentric interaction data and real-world fixed-camera videos?

**Inference Pipeline**

- Could you clarify the exact inference pipeline for generating interaction videos from:
  - a static 3D scene rendering sequence (camera trajectory), and
  - an egocentric hand motion / mesh sequence?
- Any example scripts or configuration files for end-to-end inference would be very helpful.
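
For concreteness, here is the kind of end-to-end call I have in mind. This is purely a hypothetical sketch: the `dwm` package, `DexterousWorldModel`, `from_pretrained`, `generate`, and every argument and tensor shape below are my own guesses at a plausible interface, not the actual DWM API.

```python
import torch

# Hypothetical import -- the `dwm` package and class name are assumptions.
from dwm import DexterousWorldModel

# Load a (hoped-for) pretrained checkpoint; method name is a guess.
model = DexterousWorldModel.from_pretrained("dwm-base.ckpt").to("cuda").eval()

# Conditioning input 1: renders of the static 3D scene along a camera
# trajectory, e.g. T frames shaped (T, 3, H, W) in [0, 1].
scene_frames = torch.rand(16, 3, 256, 256, device="cuda")

# Conditioning input 2: egocentric hand motion, e.g. per-frame MANO-style
# parameters; here just a placeholder (T, D) tensor.
hand_motion = torch.rand(16, 61, device="cuda")

with torch.no_grad():
    # Action-conditioned video diffusion sampling; the signature and the
    # sampler settings are invented for illustration.
    video = model.generate(
        scene=scene_frames,
        hand_motion=hand_motion,
        num_inference_steps=50,
        guidance_scale=5.0,
    )

print(video.shape)  # expected: an interaction video, e.g. (T, 3, H, W)
```

Even a minimal script along these lines, together with the matching config file, would answer most questions about the expected conditioning format and tensor shapes.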

**Scene Generalization**

- Have you evaluated how well DWM generalizes to unseen 3D scenes or object layouts at inference time?
- Are there recommended constraints on scene scale, object categories, or camera trajectories?

**Future Extensions**

- Do you see DWM being extended to:
  - multi-hand or bimanual interactions,
  - non-rigid objects, or
  - physics-aware feedback loops?

I believe releasing pretrained models or a minimal demo would significantly accelerate adoption and follow-up research on interactive digital twins and embodied simulation.

Thanks again for the great work, and congratulations on the project!

Best regards
