Title: Musk’s xAI Open-Sources Grok-1, the Largest Open-Source Language Model by Parameter Count
Keywords: Musk xAI, Grok-1 Open Source, Largest Language Model
News Content:
# Musk’s xAI Open-Sources Grok-1, the Largest Open-Source Language Model by Parameter Count
Early this morning, Elon Musk’s AI company xAI announced that it has open-sourced Grok-1, a Mixture-of-Experts (MoE) model with 314 billion parameters, releasing both the model weights and the network architecture. The release makes Grok-1 the largest open-source large language model by parameter count to date.
Reportedly, Grok-1 is a base model trained on a large amount of text data and not fine-tuned for any specific task. It has 314 billion parameters, with 25% of the weights active for a given token. That scale points to strong capacity for text and language tasks.
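To make the two figures above concrete, here is a minimal arithmetic sketch. The 314 billion parameter count and the 25% active share come from the article; the function names are illustrative, and the 2-bytes-per-parameter storage figure is an assumption for a rough estimate, since the article does not state the precision of the released weights.

```python
# Figures quoted in the article.
TOTAL_PARAMS = 314e9     # 314 billion parameters in total
ACTIVE_FRACTION = 0.25   # share of the weights active for a given token


def active_params(total: float, fraction: float) -> float:
    """Number of parameters that take part in processing one token."""
    return total * fraction


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Rough storage needed for the weights alone, in GB (10**9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    active = active_params(TOTAL_PARAMS, ACTIVE_FRACTION)
    print(f"Active per token: about {active / 1e9:.1f}B parameters")

    # Assumption: 16-bit weights (2 bytes each); not stated in the article.
    full = weight_memory_gb(TOTAL_PARAMS, 2)
    print(f"All weights at 2 bytes each: about {full:.0f} GB")
```

Under these assumptions, roughly 78.5 billion parameters are used per token, yet the full set of weights still has to be stored, because in an MoE model different tokens can be routed to different experts.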
xAI said it trained Grok-1 from scratch in October 2023 using a custom training stack built on the JAX library and the Rust language. The custom stack is said to have notably improved Grok-1’s training efficiency and model performance.
Notably, xAI released the weights and architecture of Grok-1 under the Apache 2.0 license. The release should make the model far more accessible to academia and industry and help advance the development and application of large language model technology.
Beyond being the largest open-source model by parameter count, Grok-1 is also described as innovative in its model structure and training methods, which makes the release a significant step for large language model research and development.
Looking ahead, xAI can be expected to bring further innovations to the field, and the open-sourcing of Grok-1 is likely to be remembered as an important milestone for large language models.
Source: https://mp.weixin.qq.com/s/hvt5zwoazDx26KOaKuTs_w