Tencent Upgrades Hunyuan Text-to-Image Model: Open-Source, Bilingual DiT Architecture That Can Underpin Video Generation

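Tencent recently announced that its Hunyuan text-to-image model (Hunyuan-DiT) has been upgraded and officially open-sourced. The model is available on Hugging Face and GitHub, including the model weights, inference code, and model algorithms, and is free for commercial use by enterprises and individual developers.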

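Reportedly, the upgraded Hunyuan text-to-image model adopts the DiT (Diffusion Transformer) architecture, the same architecture used by Sora. Beyond text-to-image generation, this architecture can also serve as the foundation for video and other multimodal visual generation, which points to broad potential across visual generation tasks.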
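One way to see why a DiT-style (diffusion transformer) design can extend from images to video is to count the tokens the transformer processes. In the general DiT approach, a latent is cut into fixed-size patches and each patch becomes one token, so a video only adds a frame dimension to the same scheme. The sketch below is a minimal illustration of that idea. The function name `token_count`, the patch sizes, and the latent dimensions are all hypothetical and are not taken from the article or from Hunyuan-DiT's actual configuration.

```python
def token_count(height, width, patch, frames=1, temporal_patch=1):
    """Return the number of transformer tokens for a latent of shape
    (frames, height, width) split into non-overlapping patches.

    All arguments are illustrative assumptions, not Hunyuan-DiT settings.
    """
    if height % patch or width % patch or frames % temporal_patch:
        raise ValueError("latent dimensions must be divisible by the patch sizes")
    # One token per spatial patch, repeated for every temporal patch.
    return (frames // temporal_patch) * (height // patch) * (width // patch)


# A single image latent versus a 16-frame video latent of the same spatial size.
image_tokens = token_count(height=128, width=128, patch=2)
video_tokens = token_count(height=128, width=128, patch=2, frames=16)
print(image_tokens, video_tokens)
```

Under these assumed sizes the video sequence is 16 times longer than the image sequence, which suggests why video generation on the same architecture is mainly a question of sequence length and compute rather than of a different model family.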

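Notably, the Hunyuan text-to-image model is described as the industry's first open-source text-to-image model built on a Chinese-native DiT architecture, with 1.5 billion parameters. It supports input and understanding in both Chinese and English, which allows it to interpret text descriptions more accurately and generate correspondingly high-quality images. The report presents this as a breakthrough for the field of artificial intelligence.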
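To put the 1.5 billion parameter figure in practical terms, the following back-of-the-envelope sketch estimates how much memory the weights alone would occupy at common numeric precisions. Only the parameter count comes from the article. The helper `weight_memory_gib` and the precision table are illustrative assumptions, and the estimate ignores activations as well as any text encoders or other components that may not be included in the 1.5 billion figure.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

# Parameter count as reported in the article.
HUNYUAN_DIT_PARAMS = 1_500_000_000


def weight_memory_gib(num_params, dtype="float16"):
    """Rough memory footprint of the weights alone, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(HUNYUAN_DIT_PARAMS, dtype):.2f} GiB")
```

By this estimate the weights need roughly 2.8 GiB in half precision and about twice that in single precision. Real memory use during inference will be higher.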

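Releasing the model on open-source platforms means that more developers and enterprises can readily access and use it, which should encourage new applications. As the open-source community optimizes and upgrades the model over time, it is expected to gain stronger performance and reach a wider range of use cases.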

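Tencent's upgrade and open-sourcing of the Hunyuan text-to-image model further demonstrates China's technical strength and capacity for innovation in artificial intelligence, and is expected to provide strong support for the intelligent transformation of a wide range of industries.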


Source: https://www.jiemian.com/article/11168879.html