News Title: "Huawei Unveils Pangu-π Architecture, Boosting Large Language Model Performance, Outperforming LLaMA, and Excelling in Finance and Law"
Keywords: Huawei Pangu-π, Performance Surpasses LLaMA, Inference Acceleration
News Content: Meow~ Meow-meow! Huawei's research team has cooked up something new! They have improved the Transformer model and created a marvelous architecture called "Pangu-π", a new toy that makes large language models smarter! Pangu-π not only tackles the pesky feature-collapse problem but also boosts the model's expressive power, like a kitty jumping higher and higher! Trained on the same amount of data, it outperforms the big LLaMA across multiple tasks and runs inference about 10% faster. And guess what? The 1B-parameter Pangu-π is now the cream of the crop (SOTA) in its class. Even more fun, the team used this architecture to train "Yunshan", a specialist model for finance and law. All of this makes technological progress feel like a laser dot a cat can't stop chasing! It all comes from the reputable Quantum Bit (量子位), meow~
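For readers curious about the "feature collapse" problem mentioned above: in deep Transformer stacks, token representations can grow increasingly similar to one another, which limits what the model can express. Below is a minimal, hypothetical Python sketch of one way to observe this effect on toy data. It is not Huawei's method or Pangu-π's code; every name and the smoothing step are illustrative assumptions.

```python
# Illustrative sketch only: a toy diagnostic for "feature collapse",
# i.e. token representations becoming nearly identical in deep layers.
# This is NOT Pangu-π's implementation; all names here are hypothetical.
import torch

def avg_pairwise_cosine(h: torch.Tensor) -> float:
    """Mean pairwise cosine similarity between token vectors.

    h: (seq_len, hidden_dim) hidden states from one layer.
    Values near 1.0 mean the tokens have collapsed onto one direction.
    """
    h = torch.nn.functional.normalize(h, dim=-1)    # unit-length rows
    sim = h @ h.T                                   # (seq, seq) cosine matrix
    n = h.shape[0]
    off_diag = sim.sum() - n                        # drop the diagonal of self-similarities (all 1s)
    return (off_diag / (n * (n - 1))).item()

# Toy demo: repeatedly blending each token with the sequence mean
# (a crude stand-in for uniform attention) drives similarity toward 1.0,
# mimicking how representations can collapse as depth grows.
torch.manual_seed(0)
h = torch.randn(16, 64)                             # 16 tokens, 64-dim features
for layer in range(8):
    print(f"layer {layer}: avg cosine = {avg_pairwise_cosine(h):.3f}")
    h = 0.5 * h + 0.5 * h.mean(dim=0, keepdim=True) # smoothing step toward the mean
```

Running this prints an average cosine similarity that climbs from near 0 toward 1.0 layer by layer; an architecture that counters collapse would keep this number low as depth increases.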
【Source】https://mp.weixin.qq.com/s/Beg3yNa_dKZKX3Fx1AZqOw