Title: Alibaba Unveils High-Efficiency MoE Model Qwen1.5-MoE-A2.7B
Keywords: Efficient Model, Parameter Optimization, Cost Reduction
News content: Alibaba's Tongyi Qianwen (Qwen) team has recently unveiled Qwen1.5-MoE-A2.7B, a new large-scale Mixture-of-Experts (MoE) model. The model activates only 2.7 billion parameters, yet performs on par with state-of-the-art 7-billion-parameter models such as Mistral 7B and Qwen1.5-7B. It has 2.0 billion non-embedding parameters, roughly one third of the 6.5 billion in Qwen1.5-7B. Furthermore, its training cost is 75% lower than that of Qwen1.5-7B, and its inference speed is 1.74 times faster, marking a significant advance in the efficiency and cost control of AI models.
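To make the "activated parameters" figure concrete, here is a minimal sketch of generic top-k expert routing, the basic mechanism behind MoE layers: every token is sent to only a few experts, so only a fraction of the total parameters run per token. This is a toy illustration under assumed names and sizes (ToyMoELayer, d_model=64, 8 experts, top_k=2 are all hypothetical), not Qwen1.5-MoE-A2.7B's actual architecture or configuration.

```python
# Minimal sketch of a Mixture-of-Experts (MoE) layer with top-k routing.
# Generic illustration only; NOT Qwen1.5-MoE's real architecture, and all
# dimensions below are made-up toy numbers.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=128, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # All experts together hold the "total" parameters,
        # but only top_k of them run for any given token.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )
        self.router = nn.Linear(d_model, num_experts)  # gating network

    def forward(self, x):  # x: (num_tokens, d_model)
        gate_logits = self.router(x)                        # (tokens, experts)
        weights, idx = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                    # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = ToyMoELayer()
    tokens = torch.randn(4, 64)
    print(layer(tokens).shape)  # torch.Size([4, 64])
    total = sum(p.numel() for p in layer.experts.parameters())
    active = total * layer.top_k // len(layer.experts)  # rough per-token estimate
    print(f"total expert params: {total}, roughly activated per token: {active}")
```

The same logic explains the headline numbers: the full model can be large, while per-token compute (and hence inference cost) scales with the activated subset, which is why a 2.7B-activated MoE can compete with dense 7B models at lower cost.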
Source: https://mp.weixin.qq.com/s/6jd0t9zH-OGHE9N7sut1rg