今日凌晨,全球科技巨头马斯克旗下的大模型公司 xAI 发布了一项重大开源举措,宣布其3140亿参数的混合专家(MoE)模型“Grok-1”正式开源,包括模型的权重和网络架构。这一行动使Grok-1一举成为目前公开参数量最大的语言模型。
据机器之心报道,Grok-1的基础构建基于海量文本数据训练而成,未针对特定任务进行微调,保持了广泛的通用性。该模型的特色在于,其在每个token上的激活权重仅为25%,这意味着在处理信息时,Grok-1能够以高效的方式筛选和理解文本内容。
xAI公司透露,Grok-1是在2023年10月利用JAX库和Rust语言构建的自定义训练堆栈从零开始训练的,显示了公司在高性能计算和编程语言选择上的创新实践。此外,该公司遵循开源社区的友好原则,选择了Apache 2.0许可证来授权Grok-1的权重和架构,鼓励开发者和研究者进行二次开发和合作。
这一开源决定不仅彰显了马斯克和xAI对于人工智能技术的开放态度,也预示着全球AI研究和应用将迎来新的发展机遇。Grok-1的开源,无疑将为科研人员和工程师提供一个强大的工具,进一步推动自然语言处理技术的边界。
English translation:
News Title: “Musk’s xAI Open-Sources Grok-1, the Largest Openly Released Language Model at 314 Billion Parameters”
Keywords: Musk, Grok-1, Open-source Large Model
News Content: Early this morning, xAI, Elon Musk’s large-model company, announced that it is open-sourcing “Grok-1,” its 314-billion-parameter Mixture-of-Experts (MoE) model, releasing both the model weights and the network architecture. The release makes Grok-1 the largest openly released language model by parameter count to date.
According to Jiqizhixin (机器之心), Grok-1 is a base model trained on a large volume of text data and has not been fine-tuned for any specific task, so it remains broadly general-purpose. Its distinguishing feature is that only 25% of its weights are active for any given token, so each token is processed with a fraction of the full model’s parameters rather than all 314 billion.
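The 25% figure can be made concrete with a small sketch of top-k expert routing, the usual mechanism behind sparse activation in MoE models. This is an illustration only, not xAI's code: the expert count (8), the number of experts routed per token (2), the function name `top_k_route`, and the random router scores are all assumptions chosen so that the ratio matches the 25% stated above. The parameter estimate simply multiplies the article's two figures and ignores layers that every token shares.

```python
import math
import random


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected experts only, so they sum to 1.
    """
    chosen = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


NUM_EXPERTS = 8        # assumption for illustration
TOP_K = 2              # assumption for illustration
TOTAL_PARAMS = 314e9   # parameter count reported in the article

# Hypothetical router scores for a single token.
random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
routes = top_k_route(logits, TOP_K)

# Fraction of experts (and, roughly, of weights) used for this token.
active_fraction = TOP_K / NUM_EXPERTS
# Rough estimate only: treats all parameters as expert parameters.
active_params = TOTAL_PARAMS * active_fraction

print("experts used for this token:", [i for i, _ in routes])
print("active fraction:", active_fraction)
print("approx. active parameters:", active_params)
```

Under these assumptions each token touches 2 of 8 experts, which is where a figure like 25% would come from. The remaining experts stay idle for that token, although all weights must still be held in memory.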
xAI disclosed that Grok-1 was trained from scratch in October 2023 using a custom training stack built on the JAX library and the Rust programming language, reflecting the company’s choices in high-performance computing and programming languages. In keeping with open-source practice, the company has released Grok-1’s weights and architecture under the Apache 2.0 license, which allows developers and researchers to build derivative work on the model and collaborate on it.
The decision signals an open stance toward AI technology on the part of Musk and xAI, and it may open new opportunities for AI research and applications worldwide. With the weights and architecture available, researchers and engineers gain a powerful tool for advancing natural language processing.
Source: https://mp.weixin.qq.com/s/hvt5zwoazDx26KOaKuTs_w