**China’s First MoE Large Language Model abab6 Fully Launched**
On March 8, 2023, MiniMax, a leading AI company in China, announced the full launch of its large language model abab6, making it the first MoE large language model in China.
abab6 employs the MoE (Mixture of Experts) architecture, which divides the model parameters among multiple expert sub-models, each responsible for handling different types of data or tasks. This architecture lets abab6 improve computational efficiency and generalization capability while maintaining model capacity.
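To make the routing idea concrete, here is a minimal sketch of a generic top-k MoE layer in plain Python. It is an illustration only, not MiniMax's implementation: the names `make_expert`, `moe_forward`, and `gate_weights`, the single linear map used as each expert, and the sizes (8 experts, top-2 routing, 4-dimensional tokens) are all assumptions chosen for the example.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a small list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(dim, rng):
    # Hypothetical expert: a single dim x dim linear map with random weights.
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


def moe_forward(x, gate_weights, experts, top_k=2):
    # The gate produces one logit per expert for this token.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Only the top_k highest-scoring experts are evaluated; the rest are skipped.
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalize the gate scores over the chosen experts only.
    probs = softmax([logits[i] for i in chosen])
    out = [0.0] * len(x)
    for p, i in zip(probs, chosen):
        y = experts[i](x)
        out = [o + p * yi for o, yi in zip(out, y)]
    return out, chosen


rng = random.Random(0)  # fixed seed so the sketch is repeatable
DIM, NUM_EXPERTS = 4, 8  # assumed sizes, for illustration only
experts = [make_expert(DIM, rng) for _ in range(NUM_EXPERTS)]
gate_weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_forward(token, gate_weights, experts)
print("experts used for this token:", chosen)
```

Each token passes through only `top_k` of the experts, so the work per token stays roughly constant even as more experts (and therefore more parameters) are added.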
Under the MoE architecture, abab6 keeps the ability to handle complex tasks that comes with a large parameter count, while still being able to train on a sufficient amount of data per unit of time, so its computational efficiency improves substantially. This allows abab6 to achieve excellent performance on a variety of natural language processing tasks.
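The efficiency argument can be illustrated with simple arithmetic: in a typical MoE model, only the shared layers and the selected experts run for each token. The sketch below uses hypothetical figures (2 billion shared parameters, 8 experts of 1 billion parameters each, 2 experts active per token). These are not abab6's published numbers, and `moe_param_counts` is a helper invented for this example.

```python
def moe_param_counts(shared, per_expert, num_experts, top_k):
    # Total parameters stored by the model (its capacity).
    total = shared + per_expert * num_experts
    # Parameters actually used to process a single token.
    active = shared + per_expert * top_k
    return total, active


# Hypothetical configuration, for illustration only.
total, active = moe_param_counts(
    shared=2_000_000_000,
    per_expert=1_000_000_000,
    num_experts=8,
    top_k=2,
)
print(f"total parameters:  {total:,}")
print(f"active per token:  {active:,}")
print(f"active fraction:   {active / total:.0%}")
```

Under these assumed numbers the model stores 10 billion parameters but uses 4 billion per token, which is the sense in which an MoE model can keep a large capacity while spending less compute on each token.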
On language understanding tasks, abab6 achieved an average score of 91.5% on the GLUE benchmark and an average score of 90.4% on the SuperGLUE benchmark, both ranking first. On language generation tasks, abab6 achieved an average score of 64.5% on the GPT-3 benchmark and an average score of 87.2% on the CoQA benchmark, both ranking first.
The full launch of abab6 marks a major breakthrough for MiniMax in the field of large language models. This will further promote the development of natural language processing technology and provide a more robust foundation for various AI applications.
The full launch of abab6 also provides a new research and development platform for researchers and developers in China. They can utilize abab6 to develop various natural language processing applications, such as machine translation, text summarization, dialogue generation, and question-answering systems.
MiniMax stated that they will continue to focus on the research and development of large language models and plan to launch more advanced large language models in the future.
Source: https://mp.weixin.qq.com/s/2aFhRUu_cg4QFdqgX1A7Jg