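Beijing Academy of Artificial Intelligence (BAAI) Releases Large Model Evaluation Results: Top Closed-Source Models Take On the Average Haidian Student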

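**Beijing Academy of Artificial Intelligence evaluation: closed-source large models stand out, but their subject ability falls slightly short of the Haidian student average**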

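The Beijing Academy of Artificial Intelligence (BAAI, 智源研究院) recently launched its new Zhiyuan evaluation system and held a large model evaluation release event on May 17. Following a comprehensive capability assessment, it published evaluation results for more than 140 open-source and commercial closed-source language and multimodal large models from China and abroad.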

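To keep the evaluation fair and comprehensive, the institute incorporated test papers that the Haidian District Teachers' Training School compiled under the new curriculum standards for students from the third grade of primary school through the third year of senior high school. The papers cover six subjects (Chinese, mathematics, English, physics, chemistry, and history) and comprise 45 sets with 1,400 questions in total. This put the models through a demanding test.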

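The results are striking. On the combined score rate across all grades and subjects, the top five places all went to closed-source models: Tongyi Qwen-vl-max, Baidu Wenxin Yiyan (ERNIE Bot) 4.0, Zhipu Huazhang GLM-4, Baichuan Intelligence Baichuan3, and GPT-4. These models showed strong performance and potential.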

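On the individual subject tests, however, these models scored slightly below the average of Haidian students in each grade. This suggests that artificial intelligence still faces considerable challenges when applied in depth to specific domains. The findings also indicate that the contest between artificial intelligence and humans will keep intensifying. The institute's evaluation gives the industry a valuable reference point and lays groundwork for future AI development.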


News Title: "Beijing Academy of Artificial Intelligence (BAAI) Releases Large Model Evaluation Results: Top Closed-Source Models Take On the Average Haidian Student"

Keywords: Zhiyuan Evaluation System, Large Model Evaluation Results, Subject Test Performance

News Content:

The Beijing Academy of Artificial Intelligence (BAAI) recently unveiled its new Zhiyuan evaluation system and held a large model evaluation release event on May 17. Following a comprehensive capability assessment, it published evaluation results for more than 140 open-source and commercial closed-source language and multimodal large models from China and abroad.

To keep the evaluation fair and comprehensive, the institute incorporated test papers that the Haidian District Teachers' Training School compiled under the new curriculum standards for students from the third grade of primary school through the third year of senior high school. The papers cover six subjects (Chinese, mathematics, English, physics, chemistry, and history) and comprise 45 sets with 1,400 questions in total. This put the models through a demanding test.

The results are striking. On the combined score rate across all grades and subjects, the top five places all went to closed-source models: Tongyi Qwen-vl-max, Baidu Wenxin Yiyan (ERNIE Bot) 4.0, Zhipu Huazhang GLM-4, Baichuan Intelligence Baichuan3, and GPT-4. These models showed strong performance and potential.

On the individual subject tests, however, these models scored slightly below the average of Haidian students in each grade. This suggests that artificial intelligence still faces considerable challenges when applied in depth to specific domains. The findings also indicate that the contest between artificial intelligence and humans will keep intensifying. The institute's evaluation gives the industry a valuable reference point and lays groundwork for future AI development.

Source: https://ai-bot.cn/go/?url=aHR0cHM6Ly93d3cuamllbWlhbi5jb20vYXJ0aWNsZS8xMTE4NjY2OS5odG1s