This repository is a fork of the original project. For the full research documentation, results, and updates, please refer to the upstream repository:
- Original repo: https://github.com/Alpha-VLLM/Lumina-DiMOO
- Model on Hugging Face: https://huggingface.co/Alpha-VLLM/Lumina-DiMOO
Below are streamlined, reproducible instructions to install and run the demo locally.
Copy-paste these commands in PowerShell from the repo root.
```powershell
# 1) Install uv (once)
python -m pip install --user uv

# 2) Create the environment from pyproject.toml (non-Torch deps)
uv sync

# 3) Install PyTorch with the correct CUDA wheels (auto-detects CUDA, or add --tag cpu)
uv run --no-sync python scripts/install_torch.py --install-vcredist

# 4) Launch the Gradio UI (defaults to Alpha-VLLM/Lumina-DiMOO; downloads weights on first use)
uv run --no-sync python -m ui.gradio_app
```

Notes:
- If you have CUDA 12.x/11.x and want to force a tag: add
--tag cu121or--tag cu118to the Torch installer. - The UI has a “Preload / Download Models” button; you can click it or just run a task to trigger the first download.
- Some models may require Hugging Face auth. If so, run
huggingface-cli loginonce before launching the UI.
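Before launching the UI, it can help to confirm that the Torch install succeeded. A minimal sketch (the filename `check_torch.py` is just an example; run it with `uv run --no-sync python check_torch.py`):

```python
# check_torch.py - report whether PyTorch is installed and whether CUDA is usable.
# Illustrative sketch, not part of the repository's scripts.

def torch_status() -> str:
    try:
        import torch
    except ImportError:
        return "PyTorch not installed yet - run scripts/install_torch.py first"
    return f"torch {torch.__version__} | CUDA available: {torch.cuda.is_available()}"

if __name__ == "__main__":
    print(torch_status())
```

If CUDA shows as unavailable despite an NVIDIA GPU, rerun the installer with an explicit `--tag`.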
Requirements:

- Windows 10/11 (64-bit)
- Python 3.10 or 3.11
- NVIDIA GPU with recent driver for CUDA usage (optional; CPU is supported but slow)
- Internet access (to download model weights from Hugging Face on first run)
uv provides fast, reproducible environments from pyproject.toml.
- Install uv if not present:

  ```powershell
  python -m pip install --user uv
  ```

- Sync environment (non-Torch deps):

  ```powershell
  uv sync
  ```

- Install PyTorch with CUDA-aware wheels:

  ```powershell
  uv run --no-sync python scripts/install_torch.py
  ```

Notes:

- Auto-detects CUDA. Override with `--tag cu121` (CUDA 12.x), `--tag cu118` (CUDA 11.x), or `--tag cpu` if needed.
- If the MSVC runtime is missing, add `--install-vcredist`.
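Conceptually, the auto-detection boils down to mapping the detected CUDA major version to a wheel tag. A rough sketch of that mapping (an illustrative assumption only; `scripts/install_torch.py` is authoritative, and `--tag` always overrides it):

```python
from typing import Optional

def pick_torch_tag(cuda_version: Optional[str]) -> str:
    """Map a detected CUDA version string (e.g. "12.1") to a Torch wheel tag.

    Hypothetical mapping for illustration; the installer script may differ.
    """
    if not cuda_version:
        return "cpu"  # no CUDA detected -> CPU wheels
    major = cuda_version.split(".")[0]
    return {"12": "cu121", "11": "cu118"}.get(major, "cpu")
```

For example, `pick_torch_tag("12.1")` yields `"cu121"`, while an unrecognized or missing version falls back to `"cpu"`.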
Optional extras (after Torch):

```powershell
uv sync --group torch_ext
```

Launch the UI (uses defaults Alpha-VLLM/Lumina-DiMOO):

```powershell
uv run --no-sync python -m ui.gradio_app
```

In the app:
- Click “Preload / Download Models” to fetch weights from Hugging Face, or
- Just run a task; the first request will download automatically and cache locally.
You can change the checkpoint or VAE path in the text fields before preloading/running.
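Downloads are cached by `huggingface_hub`, so repeat runs reuse local files. To see where weights land (the default cache root is `~/.cache/huggingface`, overridable via the `HF_HOME` environment variable):

```python
import os

def hf_cache_dir() -> str:
    """Return the Hugging Face cache root: HF_HOME if set, else ~/.cache/huggingface."""
    default = os.path.join(os.path.expanduser("~"), ".cache", "huggingface")
    return os.environ.get("HF_HOME", default)

if __name__ == "__main__":
    print(hf_cache_dir())
```

Deleting directories under this cache frees disk space but forces a re-download on the next run.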
Back-compat entrypoint:

```powershell
uv run --no-sync python scripts/gradio_app.py
```

Text-to-Image example:
```powershell
uv run --no-sync python scripts/inference_t2i.py `
  --checkpoint Alpha-VLLM/Lumina-DiMOO `
  --prompt "A vivid painting of a serene lake under the moonlight" `
  --height 768 --width 1536 --timesteps 64 --cfg_scale 4.0 `
  --vae_ckpt Alpha-VLLM/Lumina-DiMOO `
  --output_dir output/results_text_to_image
```

Image-to-Image example:
```powershell
uv run --no-sync python scripts/inference_i2i.py `
  --checkpoint Alpha-VLLM/Lumina-DiMOO `
  --prompt "Generate a canny edge map according to the image" `
  --image_path examples/example_1.png `
  --edit_type canny_pred --timesteps 64 `
  --cfg_scale 2.5 --cfg_img 4.0 `
  --vae_ckpt Alpha-VLLM/Lumina-DiMOO `
  --output_dir output/results_image_to_image
```

Understanding example:
```powershell
uv run --no-sync python scripts/inference_mmu.py `
  --checkpoint Alpha-VLLM/Lumina-DiMOO `
  --prompt "Please describe this image." `
  --image_path examples/example_6.jpg `
  --steps 128 --gen_length 128 --block_length 32 `
  --vae_ckpt Alpha-VLLM/Lumina-DiMOO `
  --output_dir output/outputs_text_understanding
```

Troubleshooting:

- If CUDA isn’t detected, ensure your NVIDIA driver is installed and recent, then rerun `scripts/install_torch.py`.
- For Windows MSVC runtime issues, run with `--install-vcredist`.
- To force a specific wheel set: `--tag cu121` (CUDA 12.x), `--tag cu118` (CUDA 11.x), `--tag cpu`.
- If a model requires authentication, run `huggingface-cli login` before launching.
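After any of the inference examples above, the quickest sanity check is to open the newest file in the chosen `--output_dir`. A small helper for that (assuming results are written as `.png` files, which may vary by task):

```python
from pathlib import Path
from typing import Optional

def latest_output(out_dir: str = "output/results_text_to_image") -> Optional[Path]:
    """Return the most recently written .png in out_dir, or None if empty/missing."""
    root = Path(out_dir)
    if not root.is_dir():
        return None
    files = sorted(root.glob("*.png"), key=lambda p: p.stat().st_mtime)
    return files[-1] if files else None
```

Call it with the same `--output_dir` value you passed to the script; `None` means the run produced nothing there.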
This fork inherits the original project’s license. See LICENSE in this repository and the upstream repository for details.