**Chinese Large Models Show Their Strength in a Mock Gaokao: Liberal Arts Scores Clear the First-Tier Cutoff**

Recently, an unusual Gaokao (China's national college entrance examination) drew wide attention. Instead of students, the test takers were AI large models. In a newly released evaluation report that tested large models on the Gaokao New Curriculum Standard Paper I, GPT-4o ranked first in the overall liberal arts ranking with 562 points.

Among the Chinese products, ByteDance's Doubao stood out with 542.5 points, reaching Henan province's first-tier university admission cutoff for liberal arts. Baidu's Wenxin Yiyan 4.0 and Baichuan Intelligence's "Baixiaoying" also performed well, scoring 537.5 and 521 points respectively. The liberal arts results are the most notable part of the report, indicating that Chinese-developed large models have reached a fairly high level in language and literary writing.

In the science exam the models did somewhat worse: most scored below Henan's first-tier cutoff for science. Experts said this was to be expected, and that the gap is not directly comparable to the difference between liberal arts and science results among human examinees, since AI still needs to improve at logical reasoning and mathematical computation.

In language subjects, by contrast, the models showed considerable potential, especially in Chinese and English. In the Chinese exam, Baixiaoying, Doubao, and Tencent Yuanbao took the top three places. Experts regard this result as a sign of significant progress in natural language understanding, while noting that the models' Chinese essay writing still has room to improve. Overall, the evaluation both illustrates the potential of AI in education and offers a useful reference for future development.

Source: https://www.jiqizhixin.com/articles/2024-06-25
