
Huawei's team has recently introduced Pangu-π, a new generation of large language model architecture that outperforms the well-known LLaMA models. Building on the traditional Transformer, the architecture enhances non-linearity, effectively mitigating the feature collapse problem and giving the model's outputs greater expressive power.
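The article attributes Pangu-π's gains to stronger non-linearity in the Transformer block. As a loose, toy illustration (not Huawei's actual implementation), one way to add non-linearity per layer is to sum several shifted and scaled copies of a simple base activation, producing a function with more kink points than a single ReLU; the shifts and scales below are illustrative constants, where a real model would learn them:

```python
def relu(x):
    return max(0.0, x)

def series_activation(x, shifts=(-1.0, 0.0, 1.0), scales=(0.5, 1.0, 0.5)):
    """Toy 'series' activation: a weighted sum of shifted ReLUs.

    A plain ReLU is piecewise linear with a single kink at 0; summing
    shifted copies yields kinks at each shift (-1, 0, 1 here), so one
    layer can bend its input in more places, i.e. carries more
    non-linearity. Shifts/scales are hypothetical constants for
    illustration; in practice they would be learnable parameters.
    """
    return sum(s * relu(x + b) for b, s in zip(shifts, scales))

# For x = 0.5: 0.5*relu(-0.5) + 1.0*relu(0.5) + 0.5*relu(1.5)
#            = 0.0 + 0.5 + 0.75 = 1.25
print(series_activation(0.5))
```

This is only a sketch of the general idea of increasing per-layer non-linearity; the actual Pangu-π design is described in Huawei's paper.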

Trained on the same data, Pangu-π (7B) outperforms LLaMA 2 and other models of comparable scale across multiple tasks, while also delivering roughly 10% faster inference. At the 1B scale, Pangu-π's performance reaches the current state of the art.

Institutions including Huawei Noah's Ark Lab have also used the Pangu-π architecture to build "Yunshan", a large language model for finance and law, further demonstrating the architecture's potential.

Title: Huawei Unveils Pangu-π Architecture, Outperforming LLaMA
Keywords: Huawei, Pangu-π, Artificial Intelligence


Source: https://mp.weixin.qq.com/s/Beg3yNa_dKZKX3Fx1AZqOw
