Qwen2-Math: A New Leader in Mathematical Reasoning, Surpassing GPT-4o and Claude-3.5

Alibaba Open-Sources the Qwen2-Math Series, Claiming the Global Lead in Mathematical Reasoning

Keywords: Qwen2-Math, mathematical reasoning, surpassing GPT-4o

Alibaba's Qwen team recently announced the open-source release of its latest mathematical reasoning models, the Qwen2-Math series. Built on the Qwen2 language model architecture, the series comes in three sizes (1.5B, 7B, and 72B parameters) and is designed to solve complex mathematical problems. According to the team, Qwen2-Math's mathematical reasoning ability surpasses that of today's most advanced models, such as GPT-4o and Claude-3.5, making it a world-leading mathematical reasoning model.

Over the past year, the Qwen2-Math team has focused on improving the reasoning ability of large models, particularly on mathematics-related tasks. Pre-trained on a carefully designed math-specific corpus, Qwen2-Math achieved strong results on multiple mathematical benchmarks. The Qwen2-Math-72B-Instruct model outperformed state-of-the-art models on the English benchmarks GSM8K, MATH, and MMLU-STEM, as well as on the Chinese benchmarks CMATH, GaoKao Math Cloze, and GaoKao Math QA.

The instruction-tuned Qwen2-Math-Instruct models perform strongly on both English and Chinese mathematical benchmarks. Beyond common evaluation sets such as GSM8K and MATH, Qwen2-Math-Instruct can also solve more challenging problems, including mathematical Olympiad competition questions and college entrance examination (gaokao) math problems.

The open-source release of Qwen2-Math gives the scientific community a powerful tool for tackling advanced mathematical problems that require complex, multi-step logical reasoning. The release also marks a new milestone in the application of artificial intelligence to mathematics.

Source: https://qwenlm.github.io/zh/blog/qwen2-math/