News Title: Zhipu AI Open-Sources Visual Language Model CogAgent
Keywords: Zhipu AI, Open Source, Visual Language Model
News Content:
Zhipu AI has recently open-sourced its latest achievement in the field of visual language models (VLMs): CogAgent. Built on and improved from CogVLM, CogAgent specializes in GUI understanding and navigation. The released CogAgent-18B model has 18 billion parameters in total, comprising 11 billion visual parameters and 7 billion language parameters. The open-source release aims to advance research and development in visual language models and to give researchers more room for innovation.
The launch of CogAgent demonstrates Zhipu AI's strength in visual language model research. The model combines strong visual-language understanding with effective GUI navigation, bringing a new capability to the field of artificial intelligence. Open-sourcing it enables researchers worldwide to take part in improving and refining CogAgent, further advancing the development of visual language models.
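For researchers who want to try the released model, below is a minimal sketch of loading CogAgent-18B with Hugging Face transformers. The repository id, the companion Vicuna tokenizer, and the preprocessing notes are assumptions based on how CogVLM-family checkpoints are commonly distributed, not details confirmed by this announcement; consult the official CogAgent release for the exact usage.

# A minimal sketch, assuming the checkpoint is published on Hugging Face as
# "THUDM/cogagent-chat-hf" and pairs with the Vicuna-7B tokenizer (both ids
# are assumptions for illustration; check the official release to confirm).
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "THUDM/cogagent-chat-hf"    # assumed repository id
TOKENIZER_ID = "lmsys/vicuna-7b-v1.5"  # assumed companion tokenizer

tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # 18B parameters: bf16 keeps memory manageable
    trust_remote_code=True,      # the model's custom code ships with the checkpoint
    device_map="auto",
).eval()

# Typical input: a GUI screenshot plus a natural-language instruction.
# The exact prompt-building and image-preprocessing API is defined by the
# checkpoint's remote code, so it is not reproduced here.
screenshot = Image.open("screenshot.png").convert("RGB")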
Source: https://mp.weixin.qq.com/s/KpAuOjJ6w5KVEK_wWGpqQw