Title: New Generation Attention Mechanism Achieves Efficient Long Sequence Processing
Keywords: Attention Mechanism, Long Sequence Processing, Efficient Training
News content:
The OpenNLPLab team has recently released a new attention mechanism called Lightning Attention-2. The mechanism aims to solve, once and for all, the long-standing problem large language models face when handling long sequences. By proposing and open-sourcing this new linear attention mechanism, the team has made the training and inference cost of a long sequence match that of a 1K-token sequence.
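The article does not detail Lightning Attention-2's internals, but the constant-cost claim rests on the general linear-attention idea: apply a feature map to queries and keys and compute K^T V before multiplying by Q, so cost grows linearly with sequence length rather than quadratically. The following is a minimal, non-causal sketch of that general idea in PyTorch; the elu(x)+1 feature map, the normalization, and all names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """Non-causal linear attention via the right-product trick.

    q, k, v: (batch, seq_len, d). Computing k^T v first gives a (d, d)
    summary, so total cost is O(seq_len * d^2) instead of O(seq_len^2 * d).
    The elu(x) + 1 feature map is a common stand-in, not necessarily the
    one used by Lightning Attention-2.
    """
    q = F.elu(q) + 1.0
    k = F.elu(k) + 1.0
    kv = torch.einsum("bnd,bne->bde", k, v)               # (batch, d, d) summary
    z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + eps)  # normalizer
    return torch.einsum("bnd,bde,bn->bne", q, kv, z)      # (batch, seq_len, d)
```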
This breakthrough means that, given sufficient memory, increasing the sequence length indefinitely has no negative impact on the model's training speed. This feature makes unlimited-length pretraining possible, bringing new possibilities to the development of language models.
Meanwhile, the inference cost for ultra-long texts is also consistent with, or even lower than, the cost for 1K tokens. This property will greatly reduce the inference costs of current large language models, providing an efficient and economical solution for long-sequence processing.
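Again as an illustrative sketch rather than the authors' code: linear attention can fold the entire key-value history into a fixed-size state, which is why per-token decoding cost stays flat no matter how long the context is. The function and variable names below are hypothetical, and the feature map applied to q_t and k_t is assumed to have happened upstream.

```python
import torch

def decode_step(state, norm, q_t, k_t, v_t, eps=1e-6):
    """One decoding step of linear attention in its recurrent form.

    state: (d, d) running sum of outer products k_i v_i^T
    norm:  (d,)   running sum of k_i
    q_t, k_t, v_t: (d,) projections for the current token (feature map applied).
    The state has a fixed size, so each new token costs O(d^2) regardless of
    how many tokens precede it, i.e. per-token inference cost does not grow
    with context length.
    """
    state = state + torch.outer(k_t, v_t)      # update fixed-size KV summary
    norm = norm + k_t                          # update normalizer summary
    out = (q_t @ state) / (q_t @ norm + eps)   # attention output for this token
    return out, state, norm
```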
Source: https://www.jiqizhixin.com/articles/2024-01-18-5