News Title: Zhipu AI open-sources visual language model CogAgent
Keywords: Zhipu AI, CogAgent, visual language model

News Content:
Zhipu AI recently open-sourced CogAgent, its latest work in the visual language model (VLM) field. CogAgent is an improved model built on CogVLM, with 18 billion parameters, and it specializes in GUI understanding and navigation. The released model, CogAgent-18B, has 11 billion visual parameters and 7 billion language parameters. The release demonstrates Zhipu AI's strength in the field of visual language models.

The open-sourcing of CogAgent gives researchers and developers a powerful tool to support their work in artificial intelligence. The model can not only understand graphical interfaces but also navigate them, which opens up a wide range of possible application scenarios.

The release is an important step in Zhipu AI's continued effort to advance artificial intelligence. The introduction of CogAgent is expected to further drive research on visual language models and to help China maintain a leading position in artificial intelligence.

Source: https://mp.weixin.qq.com/s/KpAuOjJ6w5KVEK_wWGpqQw
