Shanghai-based RockAI (上海岩芯数智人工智能科技有限公司, 岩芯数智) recently unveiled in Shanghai a general-purpose natural language large model named Yan. The model adopts what the company describes as China's first self-developed non-Transformer attention mechanism, delivering 3x the memory capability, 7x the speed, and 5x the inference throughput of traditional models. According to the company, Yan reaches a parameter scale in the tens of billions and achieves the performance of a general-purpose large model through a mechanism different from ChatGPT's.

The release of Yan carries significant weight for China's AI field. The Transformer attention mechanism has long been regarded as the gold standard in natural language processing, but RockAI's breakthrough suggests that non-Transformer attention mechanisms also hold substantial potential and application value.

According to reports, Yan can match the performance of 100-billion-parameter models at a parameter scale of only around 10 billion, injecting new vitality into the development of China's AI field. Its launch not only marks important progress in China's research on large AI models but also opens up more possibilities for the industry's future development.

English title: Domestic AI Company Unveils Self-developed Non-Transformer Attention Large Model Yan
Keywords: AI large model, non-Transformer attention, RockAI

[Source] https://www.tmtpost.com/6898099.html
