**News Title:** “Alibaba Qwen Releases Qwen1.5-MoE-A2.7B: A High-Performance Large Model Challenging 7-Billion-Parameter Giants”
**Keywords:** Qwen1.5-MoE-A2.7B, Performance Boost, Cost Reduction
**News Content:**
**Beijing, China** — The Qwen team at Alibaba recently announced the launch of its latest innovation, Qwen1.5-MoE-A2.7B, the team’s first large-scale parallel MoE (Mixture of Experts) model. Despite activating only 2.7 billion parameters, far fewer than leading models such as the 7-billion-parameter Mistral 7B and Qwen1.5-7B, Qwen1.5-MoE-A2.7B delivers performance that matches these larger models.
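For readers unfamiliar with the MoE design, the sketch below illustrates the general idea of sparse expert routing in PyTorch: a learned router selects a small top-k subset of expert feed-forward networks for each token, so only a fraction of the model’s total parameters is activated on any forward pass. This is a minimal generic illustration; the layer sizes, expert count, and top-k value are arbitrary and do not reflect Qwen’s actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Minimal top-k mixture-of-experts layer (illustrative only, not Qwen's implementation)."""

    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small independent feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                          # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run; all other parameters stay inactive this pass.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k : k + 1] * expert(x[mask])
        return out

layer = SparseMoELayer()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```

Note that the inactive experts still occupy memory; this is how an MoE model can activate only 2.7 billion parameters per token while storing considerably more in total.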
Sources indicate that Qwen1.5-MoE-A2.7B has only 2 billion non-embedding parameters, roughly one-third of the 6.5 billion non-embedding parameters in Qwen1.5-7B. This downsizing strikes a balance between high performance and a lightweight architecture: it reduces the model’s storage footprint and reportedly cuts training costs by 75% compared with Qwen1.5-7B.
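As a quick sanity check on the reported figures, the snippet below uses only the numbers quoted above:

```python
# Non-embedding parameter counts reported in the article, in billions.
dense_params = 6.5   # Qwen1.5-7B
moe_params = 2.0     # Qwen1.5-MoE-A2.7B

print(f"Parameter ratio: {moe_params / dense_params:.0%}")  # ~31%, i.e. about one-third

# A reported 75% reduction in training cost leaves a quarter of the original cost.
print(f"Relative training cost: {1 - 0.75:.0%}")  # 25%
```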
Furthermore, Qwen1.5-MoE-A2.7B excels at inference, running 1.74 times faster than Qwen1.5-7B. In practice, this means users receive accurate responses more quickly, a gain with significant implications for the broader adoption of AI across industries.
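Speedups like the reported 1.74x are typically measured as generated tokens per second on identical prompts. The sketch below shows one plausible way to time such a comparison with the Hugging Face transformers API; the checkpoint names match the public Hub releases, but the prompt, generation length, and overall methodology here are assumptions, not the team’s actual benchmark.

```python
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer  # needs a release with Qwen2 MoE support (>= 4.40)


def tokens_per_second(model_id: str, prompt: str, new_tokens: int = 128) -> float:
    """Time greedy generation and return generated tokens per second."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    start = time.perf_counter()
    model.generate(**inputs, max_new_tokens=new_tokens, do_sample=False)
    return new_tokens / (time.perf_counter() - start)


prompt = "Explain mixture-of-experts models in one paragraph."
moe_tps = tokens_per_second("Qwen/Qwen1.5-MoE-A2.7B", prompt)
dense_tps = tokens_per_second("Qwen/Qwen1.5-7B", prompt)
print(f"MoE speedup over dense: {moe_tps / dense_tps:.2f}x")
```

Real speedup figures depend heavily on batch size, hardware, and the serving framework, so a single-prompt measurement like this is indicative only.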
The announcement on the ModelScope community promptly drew widespread industry attention. Alibaba Qwen’s innovation charts a new path for the development of large-scale pretrained models and lays a solid foundation for the future optimization and wider adoption of AI technology.
**Source:** https://mp.weixin.qq.com/s/6jd0t9zH-OGHE9N7sut1rg