BAAI Releases CMMU, a Chinese Evaluation Benchmark for Multimodal Models, Challenging GPT-4V's Answer Accuracy

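The Beijing Academy of Artificial Intelligence (BAAI) recently released CMMU, a Chinese-language evaluation benchmark for multimodal models. It is designed to assess how well models understand and reason over Chinese multimodal content across multiple question types. According to BAAI, the current release, CMMU v0.1, contains 3,603 questions spanning formats such as single-answer multiple-choice, multiple-answer multiple-choice, and fill-in-the-blank, and it applies several evaluation measures to keep a model from getting credit for "randomly guessing the right answer."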

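OpenAI's GPT-4V multimodal model reportedly answers about 30% of the benchmark's questions correctly. An analysis of its error types found that the model still has room to improve in image understanding and reasoning. The release of CMMU is therefore significant for advancing Chinese multimodal models.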

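In my view, the release matters because Chinese multimodal models are being applied ever more widely as AI technology advances, and CMMU gives researchers a reliable evaluation standard that should help push the development and adoption of these models further.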

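I also note that CMMU is difficult overall, which shows how challenging understanding and reasoning remain for Chinese multimodal models. More researchers and developers will need to join the field and work together to move these models forward.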

[Source] https://mp.weixin.qq.com/s/wegZvv4hwLef0BpdIh32-A