Next-Generation InternVL Vision Model Takes the Lead in Open Source

Keywords: Shanghai AI Laboratory, InternVL, open source

Shanghai AI Laboratory, together with Tsinghua University, the Chinese University of Hong Kong, SenseTime, and other institutions, recently open-sourced InternVL, the next-generation vision foundation model in its Intern (书生) series. The release marks a significant advance for China in core vision tasks.

The model's vision encoder, InternVL-6B, has 6 billion parameters. It introduces, for the first time, a progressive alignment technique that combines contrastive and generative learning, achieving fine-grained alignment between a large vision model and a large language model on internet-scale data. The technique improves the model's performance on vision tasks and gives strong support to AI research and development in China.

The open-sourced model is a major upgrade over its predecessor. Its much larger parameter count gives it stronger learning and understanding capabilities, further improving accuracy and efficiency on vision tasks. The progressive alignment technique, which combines contrastive and generative learning, is the model's other highlight: fine-grained alignment on massive internet data significantly improves how the model handles vision and language together.
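The article does not spell out the training objective, so the following is only a minimal sketch of the contrastive half of such an alignment, assuming a CLIP-style symmetric loss over paired image and text embeddings. The function names, the toy two-dimensional embeddings, and the temperature of 0.07 are illustrative assumptions and are not taken from InternVL; the generative stage is not shown.

```python
import math


def _normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_alignment_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric image-to-text and text-to-image contrastive loss (assumed, CLIP-style).

    image_embs[i] and text_embs[i] are treated as a matching pair;
    every other combination in the batch acts as a negative.
    """
    imgs = [_normalize(v) for v in image_embs]
    txts = [_normalize(v) for v in text_embs]
    # Pairwise cosine similarities, sharpened by the temperature.
    logits = [[sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
              for img in imgs]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (_cross_entropy_diag(logits) + _cross_entropy_diag(logits_t))


# Toy batch: well-aligned pairs should score a lower loss than mismatched pairs.
matched = contrastive_alignment_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_alignment_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
print(matched < swapped)
```

In this toy batch the loss is close to zero when each image embedding points the same way as its paired text embedding, and large when the pairs are swapped, which is the behavior a contrastive alignment stage is meant to reward.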

As a significant achievement for China's AI field, the open-source release will help advance research and development at home and abroad and open up more possibilities across application scenarios. Shanghai AI Laboratory will continue to deepen cooperation with its partners, pursue further breakthroughs in AI, and contribute to China's scientific and technological innovation and social progress.

Source: https://mp.weixin.qq.com/s/bdfAJRqOF9tUk8Vy9KC_XQ