In recent years, 3D Gaussian splatting has emerged as a powerful technique for 3D reconstruction and generation, known for its fast, high-quality rendering. This paper introduces GVGEN, a novel diffusion-based framework designed to efficiently generate 3D Gaussian representations from text input. We propose two techniques: (1) Structured Volumetric Representation. We first arrange unorganized 3D Gaussian points into a structured form, GaussianVolume. This transformation allows intricate texture details to be captured within a volume composed of a fixed number of Gaussians. To better optimize the representation of these details, we propose a pruning and densifying method named the Candidate Pool Strategy, which enhances detail fidelity through selective optimization. (2) Coarse-to-fine Generation Pipeline. To simplify the generation of GaussianVolume and enable the model to generate instances with detailed 3D geometry, we propose a coarse-to-fine pipeline. It first constructs a basic geometric structure and then predicts the complete Gaussian attributes. GVGEN outperforms existing 3D generation methods in both qualitative and quantitative assessments. At the same time, it maintains a fast generation speed (∼7 seconds), striking a balance between quality and efficiency.
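To make the idea of a structured volume with a fixed number of Gaussians more concrete, the following is a minimal sketch of one way such a container could be laid out. It is not the GVGEN implementation: the attribute set, the channel counts, the per-voxel offset, the grid resolution, and the function names `make_gaussian_volume` and `gaussian_centers` are all illustrative assumptions.

```python
import numpy as np

# Hypothetical attribute layout: the channel counts below are illustrative
# assumptions, not values taken from the paper.
CHANNELS = {
    "offset": 3,    # assumed displacement of each Gaussian from its voxel center
    "scale": 3,     # per-axis extent
    "rotation": 4,  # quaternion (w, x, y, z)
    "opacity": 1,
    "color": 3,     # RGB
}


def make_gaussian_volume(resolution: int) -> np.ndarray:
    """Return a (C, N, N, N) array holding exactly one Gaussian per voxel."""
    num_channels = sum(CHANNELS.values())
    volume = np.zeros(
        (num_channels, resolution, resolution, resolution), dtype=np.float32
    )
    # Initialize every rotation to the identity quaternion (w = 1).
    w_index = CHANNELS["offset"] + CHANNELS["scale"]
    volume[w_index] = 1.0
    return volume


def gaussian_centers(volume: np.ndarray) -> np.ndarray:
    """Recover Gaussian centers as voxel centers plus stored offsets, shape (N^3, 3)."""
    n = volume.shape[1]
    # Voxel centers of a unit cube centered at the origin.
    axis = (np.arange(n, dtype=np.float32) + 0.5) / n - 0.5
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=0)
    centers = grid + volume[0:3]
    return centers.reshape(3, -1).T


volume = make_gaussian_volume(4)
centers = gaussian_centers(volume)
print(volume.shape, centers.shape)
```

Because the number of Gaussians is fixed by the grid resolution, the whole representation is a dense tensor, which is the kind of regular input a volumetric diffusion model can operate on.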