### Breakthrough: FBI-LLM, a Fully Binarized Large Language Model Trained from Scratch for Efficiency
Amid the continuing wave of innovation in artificial intelligence, MBZUAI (the UAE's first AI university), working with the Computer Science Department at Carnegie Mellon University (CMU), has published a paper titled "FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation". The paper introduces FBI-LLM, a fully binarized low-bit foundation large language model trained entirely from scratch, which aims to ease the storage and compute demands of large language models (LLMs) while maintaining high performance.
By using autoregressive distillation as its training objective, FBI-LLM trains fully binarized LLMs from scratch whose performance matches or approaches that of LLMs trained in 16-bit floating-point precision (FP16 or BF16), while clearly surpassing all previous binarized LLMs. Beyond the extreme compression of model parameters, this also brings substantial gains in inference speed and energy efficiency.
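The article does not spell out the distillation objective, but the core idea of autoregressive distillation is to make the binarized student match a full-precision teacher's next-token distribution at every sequence position. Below is a minimal PyTorch sketch of such a token-level loss; the function name and the temperature hyperparameter are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def autoregressive_distillation_loss(student_logits: torch.Tensor,
                                     teacher_logits: torch.Tensor,
                                     temperature: float = 1.0) -> torch.Tensor:
    """Match the student's next-token distribution to a frozen full-precision
    teacher at every position. Both logit tensors have shape
    [batch, seq_len, vocab_size]."""
    vocab = student_logits.size(-1)
    # Soften both distributions with a temperature (illustrative hyperparameter).
    s_log_probs = F.log_softmax(student_logits / temperature, dim=-1).reshape(-1, vocab)
    t_probs = F.softmax(teacher_logits / temperature, dim=-1).reshape(-1, vocab)
    # KL divergence averaged over every (batch, position) pair.
    return F.kl_div(s_log_probs, t_probs, reduction="batchmean") * temperature ** 2
```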
### Background and Challenges
As large language models show strong performance in domain-specific knowledge generation and complex reasoning, the growth of their parameter counts brings enormous storage and compute requirements that limit how widely the models can be deployed. Quantization techniques such as binarization compress a model by representing each parameter with fewer bits, reducing storage needs and speeding up computation, but at some cost in accuracy. Earlier work has tried to limit this loss by keeping a small set of salient parameters at higher precision, using near-binary representations, or introducing extra full-precision parameters, yet considerable room for improvement remains.
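As a concrete illustration of what binarization means here: each weight matrix is reduced to {-1, +1} values plus a real-valued scaling factor, and gradients are usually passed through the non-differentiable sign operation with a straight-through estimator. The layer below is a generic sketch of that idea, not FBI-LLM's actual layer design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinarizedLinear(nn.Module):
    """Linear layer whose weights are binarized to {-1, +1} in the forward pass,
    scaled by the mean absolute value of the latent full-precision weights.
    A straight-through estimator lets gradients update the latent weights."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Latent full-precision weights, kept only during training.
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        alpha = w.abs().mean()  # per-layer scaling factor
        w_bin = torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))  # 1-bit weights
        # Straight-through estimator: forward uses w_bin, backward treats it as w.
        w_ste = w + (w_bin - w).detach()
        return F.linear(x, alpha * w_ste)
```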
### Innovation and Contributions
FBI-LLM's main innovation is its strategy of fully binarized training from scratch, optimized through autoregressive distillation, which makes it possible to train a model with extremely compressed parameters effectively. Because the approach does not rely on the parameters of an existing pretrained model, it avoids the knowledge loss that such conversion entails and offers greater architectural flexibility: the model can be configured with different parameter scales and vocabulary sizes to match a wider range of practical needs. In addition, the open-sourced code, data, and model weights give researchers and developers a valuable resource and should accelerate progress in this area.
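Training from scratch means the binarized student starts from random initialization, so its depth, width, and vocabulary size can be chosen freely; a frozen full-precision teacher only supplies the target distributions. A hypothetical training step under those assumptions, reusing the illustrative loss sketched earlier (this is not the released implementation):

```python
import torch

def distillation_step(student, teacher, optimizer, input_ids):
    """One training step: a randomly initialized binarized student imitates a
    frozen full-precision teacher on the same batch of token ids. Both models
    are assumed to map input_ids directly to logits of shape [B, T, V]."""
    teacher.eval()
    with torch.no_grad():
        teacher_logits = teacher(input_ids)  # no gradients through the teacher
    student_logits = student(input_ids)
    loss = autoregressive_distillation_loss(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```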
### Conclusion and Outlook
The release of FBI-LLM marks an important milestone for fully binarized LLMs, opening new possibilities for building more efficient and more energy-efficient language models. As the model is further validated and refined in real applications, it is expected to extend AI technology to settings with limited computational resources and to help AI reach a broader range of users and use cases.
[Source] https://www.jiqizhixin.com/articles/2024-07-28-4