The OpenNLPLab team has recently released Lightning Attention-2, a new-generation attention mechanism designed to address the difficulties large language models face when processing long sequences. This new linear attention mechanism can handle sequences of unlimited length without increasing training or inference costs. It not only makes unlimited-length pre-training possible but also greatly reduces the inference cost of current large language models. Within the limits of available memory, increasing the sequence length indefinitely does not slow down model training, which means the cost of ultra-long-text inference with Lightning Attention-2 is equivalent to, or even lower than, that of processing 1K tokens. This breakthrough, developed and open-sourced by the OpenNLPLab team, is poised to transform the field of large language models.

Title: Lightning Attention-2: A Revolution in Long Sequence Language Models
Keywords: Language Models, Long Sequences, Cost Efficiency
News content:
The OpenNLPLab team has recently released the next-generation attention mechanism, Lightning Attention-2, aimed at solving the challenge of handling long sequences in large language models. This new linear attention mechanism processes sequences of unlimited length without increasing training or inference costs. It makes unlimited-length pre-training possible and significantly reduces the inference cost of current large language models. Within the limits of available memory, increasing the sequence length indefinitely does not adversely affect the speed of model training. This breakthrough technology, developed and open-sourced by the OpenNLPLab team, is set to transform the field of large language models.
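To illustrate why linear attention can keep per-token cost constant regardless of sequence length, here is a minimal sketch of the generic linear-attention recurrence during decoding. It is not the released Lightning Attention-2 implementation; the function and variable names, the non-negative feature map, and the toy dimensions are assumptions for illustration only. The key point it shows is that the running state has fixed size, unlike a softmax KV cache that grows with every token.

```python
import numpy as np

def linear_attention_step(state, normalizer, q, k, v):
    """One decoding step of (normalized) linear attention.

    The running state S = sum_t k_t v_t^T and normalizer z = sum_t k_t have
    fixed size, so the cost per token does not depend on how many tokens
    came before.
    """
    state = state + np.outer(k, v)               # accumulate k v^T into a d x d state
    normalizer = normalizer + k                   # accumulate keys for normalization
    out = (q @ state) / (q @ normalizer + 1e-6)   # read out with the current query
    return state, normalizer, out

# Toy decoding loop: the state never grows; only the number of steps scales
# with sequence length, so per-token inference cost stays flat.
d = 8
state, normalizer = np.zeros((d, d)), np.zeros(d)
rng = np.random.default_rng(0)
for _ in range(1000):
    q, k, v = rng.standard_normal((3, d))
    q, k = np.maximum(q, 0), np.maximum(k, 0)     # simple non-negative feature map (assumed)
    state, normalizer, out = linear_attention_step(state, normalizer, q, k, v)
```

In this sketch, memory and per-step compute are O(d^2) regardless of how long the sequence is, which is the property the article's claim about constant training and inference cost relies on.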

【Source】https://www.jiqizhixin.com/articles/2024-01-18-5
