近日,我国知名人工智能企业面壁智能发布了一款名为MiniCPM-Llama3-V 2.5的端侧多模态小模型,该模型在性能上实现了重大突破,综合性能超越了多模态巨无霸Gemini Pro和GPT-4V。

据了解,MiniCPM-Llama3-V 2.5模型的参数仅为8B,但却支持了30多种语言,这使得它在多模态任务中表现出色。与此同时,该模型在OCR(光学字符识别)领域的表现也十分抢眼,其在9倍像素下的识别效果更加清晰,能够准确识别难图、长图和长文本。此外,该模型在图像编码方面的速度也提升了150倍,大大提高了工作效率。

值得一提的是,MiniCPM-Llama3-V 2.5模型还首次实现了端侧系统级多模态加速,这意味着它在处理多模态任务时,能够更加快速、高效地完成。

此次面壁智能发布的MiniCPM-Llama3-V 2.5模型,不仅展现了我国在人工智能领域的强大实力,也为端侧多模态任务的处理提供了全新的解决方案。相信在不久的将来,这款模型将在各行各业中发挥出巨大的作用,助力我国人工智能事业的发展。

来源:新智元

英语如下:

**News Title:** “MiniCPM-Llama3-V 2.5 Unveiled by ModelBest: 8B-Parameter Model Surpasses GPT-4V”

**Keywords:** ModelBest, Multimodal Model, Superior Performance

**News Content:** ModelBest Releases On-Device Multimodal Small Model MiniCPM-Llama3-V 2.5, Outperforming GPT-4V

Recently, ModelBest (面壁智能), a well-known Chinese artificial intelligence company, released MiniCPM-Llama3-V 2.5, a small on-device multimodal model. The model marks a major step forward in performance, with overall results reported to surpass those of the much larger multimodal models Gemini Pro and GPT-4V.

According to the report, MiniCPM-Llama3-V 2.5 has only 8B parameters, yet it supports more than 30 languages and performs well across multimodal tasks. It also stands out in OCR (optical character recognition): at 9 times the pixel count its recognition is clearer, and it can accurately read difficult images, long images, and long text. In addition, image encoding is 150 times faster, which substantially improves efficiency.

Notably, MiniCPM-Llama3-V 2.5 also achieves, for the first time, system-level multimodal acceleration on device, which means it can complete multimodal tasks faster and more efficiently.

ModelBest's release of MiniCPM-Llama3-V 2.5 not only showcases China's strength in artificial intelligence but also offers a new approach to on-device multimodal processing. The model is expected to play a significant role across industries in the near future and to support the development of China's AI sector.

Source: Xinzhiyuan (新智元)

【来源】https://mp.weixin.qq.com/s/_jgyeWifFjqh0LNBy2asLQ
