News Title: Peking University Team Releases New 3D Scene Generation Framework: Conversational Editing and High-Quality Generation

Keywords: 3D Generation, LLM Control, Complex Scenes

News Content:
Recently, the VDIG Laboratory at Peking University’s Wangxuan Institute of Computer Technology released GALA3D, its latest research result and a significant advance in complex 3D scene generation. The method uses large language models (LLMs) to guide generation, producing high-quality, highly consistent multi-object 3D scenes and supporting controllable editing through conversational interaction.

At the core of GALA3D, LLMs generate an initial scene layout, and a layout-guided generative 3D Gaussian representation is then used to build the complex 3D scene. The framework introduces an adaptive geometry control mechanism that optimizes the shape and distribution of the 3D Gaussians so that generated scenes have consistent textures, consistent proportions, and accurate object interactions. GALA3D also proposes a compositional optimization mechanism that combines conditional diffusion priors with text-to-image models to jointly generate multi-object 3D scenes in a consistent style.
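To make the idea of a layout-guided representation more concrete, here is a minimal sketch in Python. It is not the GALA3D implementation: the `Box` format, the function names, and the penalty itself are assumptions made for illustration only. It treats each object's layout entry as an axis-aligned box and measures how far a set of 3D Gaussian centers strays outside that box, which is one simple way a layout could constrain where an object's Gaussians are allowed to sit.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Box:
    """Axis-aligned layout box for one object (hypothetical layout format)."""
    name: str
    center: Vec3
    size: Vec3  # full extent along x, y, z


def outside_distance(point: Vec3, box: Box) -> float:
    """Euclidean distance by which a point lies outside the box (0.0 if inside)."""
    squared = 0.0
    for p, c, s in zip(point, box.center, box.size):
        excess = max(abs(p - c) - s / 2.0, 0.0)
        squared += excess * excess
    return squared ** 0.5


def layout_penalty(centers: List[Vec3], box: Box) -> float:
    """Mean outside-distance of Gaussian centers with respect to the box."""
    if not centers:
        return 0.0
    return sum(outside_distance(p, box) for p in centers) / len(centers)


# Hypothetical layout of the kind an LLM might return for "a lamp on a table".
layout = [
    Box("table", (0.0, 0.0, 0.5), (2.0, 1.0, 1.0)),
    Box("lamp", (0.0, 0.0, 1.25), (0.5, 0.5, 0.5)),
]

# Two illustrative Gaussian centers for the lamp: one inside its box, one above it.
lamp_centers = [(0.0, 0.0, 1.25), (0.0, 0.0, 2.0)]
penalty = layout_penalty(lamp_centers, layout[1])
print(f"lamp layout penalty: {penalty:.2f}")
```

In an optimization loop, a term like this could be added to the loss so that Gaussians drifting outside their object's box are pulled back in. The actual geometry control in GALA3D also acts on the shape of the Gaussians, which this sketch does not model.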

The release of GALA3D marks notable progress in generating complex 3D scenes from text, outperforming existing text-to-3D scene methods. The method can generate a 3D scene matching the text prompt in a zero-shot manner, and it supports user-friendly end-to-end generation and controllable editing, so ordinary users can customize and edit 3D scenes simply through conversation.

The VDIG Laboratory’s work has been accepted at the International Conference on Machine Learning (ICML) and was covered in the AIxiv column. GALA3D reflects Peking University’s research strength in artificial intelligence and opens new possibilities for 3D scene generation and editing. As the technology matures, GALA3D is expected to play an important role in fields such as digital twins, virtual reality, and game design.

Source: https://www.jiqizhixin.com/articles/2024-07-31-4