A PyTorch implementation of “X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation”
- System requirement: Ubuntu 20.04
- Tested GPUs: RTX 3090
- Environment installation:

```shell
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
```
```shell
# Geometry modeling
python -m torch.distributed.launch --nproc_per_node=4 \
  train_x_dreamer.py \
  --config configs/cupcake_geometry.json \
  --out-dir 'results/result_XDreamer/cupcake_geometry'

# Appearance modeling
python -m torch.distributed.launch --nproc_per_node=4 \
  train_x_dreamer.py \
  --config configs/cupcake_appearance.json \
  --out-dir 'results/result_XDreamer/cupcake_appearance' \
  --base-mesh 'results/result_XDreamer/cupcake_geometry/dmtet_mesh/mesh.obj'
```
```shell
# Geometry modeling
python -m torch.distributed.launch --nproc_per_node=4 \
  train_x_dreamer.py \
  --config configs/Batman_geometry.json \
  --out-dir 'results/result_XDreamer/Batman_geometry'

# Appearance modeling
python -m torch.distributed.launch --nproc_per_node=4 \
  train_x_dreamer.py \
  --config configs/Batman_appearance.json \
  --out-dir 'results/result_XDreamer/Batman_appearance' \
  --base-mesh 'results/result_XDreamer/Batman_geometry/dmtet_mesh/mesh.obj'
```
Overview of the proposed X-Dreamer, which consists of two main stages: geometry learning and appearance learning. In the geometry learning stage, we employ DMTET as the 3D representation and initialize it with a 3D ellipsoid using a mean squared error (MSE) loss. We then optimize DMTET and CG-LoRA using the score distillation sampling (SDS) loss and our proposed attention-mask alignment (AMA) loss to align the 3D representation with the input text prompt. In the appearance learning stage, we leverage bidirectional reflectance distribution function (BRDF) modeling: an MLP with trainable parameters predicts surface materials. As in the geometry learning stage, we optimize the MLP and CG-LoRA using the SDS loss and the AMA loss to align the 3D representation with the input text prompt.
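To make the optimization loop concrete, below is a minimal sketch of the SDS loss in the form commonly used in text-to-3D work. The `noise_pred_fn` stands in for the frozen Stable Diffusion UNet (with CG-LoRA attached), and the `(1 - alpha_bar_t)` weighting is one common choice; both are illustrative assumptions, not this repo's exact implementation.

```python
import torch

def sds_loss(rendered, noise_pred_fn, alphas_cumprod, t):
    """Sketch of score distillation sampling (SDS).

    rendered:        differentiable rendering of the 3D representation
    noise_pred_fn:   stand-in for a frozen diffusion UNet (hypothetical)
    alphas_cumprod:  diffusion noise schedule, shape [num_timesteps]
    t:               sampled timestep index
    """
    noise = torch.randn_like(rendered)
    a_t = alphas_cumprod[t]
    # Forward-diffuse the rendering to timestep t
    noisy = a_t.sqrt() * rendered + (1 - a_t).sqrt() * noise
    eps_hat = noise_pred_fn(noisy, t)
    w = 1 - a_t  # a common timestep weighting choice
    grad = w * (eps_hat - noise)
    # Surrogate loss whose gradient w.r.t. `rendered` equals `grad`
    return (grad.detach() * rendered).sum()
```

In practice the gradient flows through the renderer back into the DMTET parameters (geometry stage) or the material MLP (appearance stage), while the diffusion model stays frozen.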
- 2023.11.27: Create Repository
- 2023.12.28: Release Code
We conduct the experiments using four NVIDIA RTX 3090 GPUs and the PyTorch library. To compute the SDS loss, we use the Stable Diffusion implementation from Hugging Face Diffusers. We implement the DMTET network and the material encoder as a two-layer MLP and a single-layer MLP, respectively, each with a hidden dimension of 32. We optimize X-Dreamer for 2000 iterations in geometry learning and 1000 iterations in appearance learning.
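The two small networks described above might look like the following sketch. The class names and the input/output dimensions (3D point in, SDF value plus vertex offset out; material parameters such as diffuse/specular terms out) are illustrative assumptions, not the repo's exact signatures.

```python
import torch
import torch.nn as nn

class DMTetMLP(nn.Module):
    """Two-layer MLP (hidden dim 32) predicting an SDF value and a
    3D vertex offset per query point, as in the geometry stage sketch."""
    def __init__(self, in_dim=3, hidden=32, out_dim=1 + 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

class MaterialMLP(nn.Module):
    """Single-hidden-layer MLP (hidden dim 32) predicting BRDF material
    parameters (here 9 channels, e.g. diffuse, specular, normal offset)."""
    def __init__(self, in_dim=3, hidden=32, out_dim=9):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)
```

Both networks are small enough that the optimization cost is dominated by the diffusion model forward passes, not the 3D representation itself.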
We present representative results of X-Dreamer for text-to-3D generation, utilizing an ellipsoid as the initial geometry.
X-Dreamer also supports text-based mesh geometry editing and is capable of delivering excellent results.
| Prompt | Coarse-grained Mesh | Image | Normal |
|---|---|---|---|
| A beautifully carved wooden queen chess piece. | | | |
| Barack Obama's head. | | | |
We demonstrate how swapping the HDR environment map results in diverse lighting, thereby creating various reflective effects on the generated 3D assets in X-Dreamer.
We demonstrate the editing process of the geometry and appearance of 3D assets in X-Dreamer using an ellipsoid and coarse-grained guided meshes as geometric shapes for initialization, respectively.
| From an ellipsoid | From coarse-grained guided meshes |
|---|---|
| A DSLR photo of a blue and white porcelain vase, highly detailed, 8K, HD. | A marble bust of an angel, 3D model, high resolution. |
| A stack of pancakes covered in maple syrup. | A DSLR photo of the Terracotta Army, 3D model, high resolution. |
We compare X-Dreamer with four state-of-the-art (SOTA) methods: DreamFusion, Magic3D, Fantasia3D, and ProlificDreamer.
```bibtex
@article{ma2023xdreamer,
  title={X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation},
  author={Ma, Yiwei and Fan, Yijun and Ji, Jiayi and Wang, Haowei and Sun, Xiaoshuai and Jiang, Guannan and Shu, Annan and Ji, Rongrong},
  journal={arXiv preprint arXiv:2312.00085},
  year={2023}
}
```