
Huawei's team has recently introduced Pangu-π, a new generation of large language model architecture that outperforms the well-known LLaMA models. Building on the traditional Transformer, the architecture enhances non-linearity, effectively mitigating the feature collapse problem and giving the model's outputs greater expressive power.
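The article attributes Pangu-π's gains to stronger non-linearity in the Transformer block. As a loose, toy illustration (not Huawei's actual implementation), one way to add non-linearity per layer is to sum several shifted and scaled copies of a simple base activation, producing a function with more kink points than a single ReLU; the shifts and scales below are illustrative constants, where a real model would learn them:

```python
def relu(x):
    return max(0.0, x)

def series_activation(x, shifts=(-1.0, 0.0, 1.0), scales=(0.5, 1.0, 0.5)):
    """Toy 'series' activation: a weighted sum of shifted ReLUs.

    A plain ReLU is piecewise linear with a single kink at 0; summing
    shifted copies yields kinks at each shift (-1, 0, 1 here), so one
    layer can bend its input in more places, i.e. carries more
    non-linearity. Shifts/scales are hypothetical constants for
    illustration; in practice they would be learnable parameters.
    """
    return sum(s * relu(x + b) for b, s in zip(shifts, scales))

# For x = 0.5: 0.5*relu(-0.5) + 1.0*relu(0.5) + 0.5*relu(1.5)
#            = 0.0 + 0.5 + 0.75 = 1.25
print(series_activation(0.5))
```

This is only a sketch of the general idea of increasing per-layer non-linearity; the actual Pangu-π design is described in Huawei's paper.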

Trained on the same data, Pangu-π (7B) outperforms LLaMA 2 and other models of comparable scale across multiple tasks, while also delivering roughly 10% faster inference. At the 1B scale, Pangu-π's performance reaches the current state of the art.

Institutions including Huawei Noah's Ark Lab have also used the Pangu-π architecture to build "Yunshan", a large language model for finance and law, further demonstrating the architecture's potential.

Title: Huawei Unveils Pangu-π Architecture, Outperforming LLaMA
Keywords: Huawei, Pangu-π, Artificial Intelligence


Source: https://mp.weixin.qq.com/s/Beg3yNa_dKZKX3Fx1AZqOw
