**News Title:** Tencent Upgrades and Open-Sources Its Hunyuan Text-to-Image Model: First Chinese-Native DiT Architecture, 1.5 Billion Parameters for Multimodal Generation

**Keywords:** Tencent Open-Source, Hunyuan-DiT, Text-to-Image Generation Model

**News Content:**

**Tencent Upgrades and Open-Sources Its Hunyuan Text-to-Image Model, Marking a New Milestone in Multimodal Generation**

Internet giant Tencent recently announced that its Hunyuan text-to-image model has been comprehensively upgraded and released as open source. According to Jiemian News, the upgraded Hunyuan DiT (Hunyuan-DiT) model is now available on Hugging Face and GitHub. The release includes the full model weights, the inference code, and the model algorithms, and it is free for commercial use by enterprises and individual developers worldwide.

Hunyuan-DiT uses the same DiT architecture as Sora. Beyond text-to-image generation, it can also serve as a foundation for other multimodal visual generation tasks, such as video. According to the report, it is the industry's first open-source text-to-image model built on a Chinese-native DiT architecture. It accepts and understands input in both Chinese and English, and it has 1.5 billion parameters.

By open-sourcing the model, Tencent gives developers a substantial technical resource and is expected to encourage AI innovation in Chinese-language content creation. The release may also lead to new applications and speed up the adoption of AI in real-world scenarios across the industry.

**Source:** https://www.jiemian.com/article/11168879.html