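Peking University's Open-Source Heavyweight: aiXcoder-7B, an Enterprise-Grade Code Generation Model, Breaks Records

By 智能小编 (AI Editor)

April 19, 2024 #Code LLM, #Enterprise Deployment, #PKU aiXcoder, #Daily AI Briefing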

The aiXcoder team at Peking University recently released aiXcoder-7B Base, an open-source code large language model designed for private deployment inside enterprises. The model has drawn wide attention in the industry because it was trained on a dataset of 1.2T unique tokens, which is meant to make it effective in real software development settings.

What sets aiXcoder-7B Base apart is the design of its pre-training tasks and context information, both of which are tuned to real code generation needs. On the HumanEval, MBPP, and MultiPL-E benchmarks, the model outperforms Codellama 34B, a model with 34 billion parameters, indicating strong performance on code completion tasks. On multilingual NL2Code benchmarks, the average score of aiXcoder-7B Base is also higher than that of Codellama 34B and StarCoder2 15B, which places it ahead of other models of similar parameter count.

The open-source release gives enterprises a smarter and more efficient option for code generation in their software development work. The achievement reflects China's research strength in artificial intelligence and contributes a useful technical resource to the global software development community. Companies can use the model to raise development efficiency and reduce error rates as they move forward with digital transformation.

Source: https://www.qbitai.com/2024/04/134070.html