
News Title: Next-Generation Multimodal Foundation Model Emu2 Unveiled
Keywords: Multimodal, Foundation Model, Performance Breakthrough

News Content:

The Beijing Academy of Artificial Intelligence (BAAI) recently announced the open-source release of Emu2, a next-generation multimodal foundation model. Through large-scale autoregressive generative multimodal pre-training, the model achieves a significant breakthrough in multimodal in-context learning. Notably, Emu2 substantially outperforms mainstream multimodal pre-trained models such as Flamingo-80B and IDEFICS-80B on few-shot multimodal understanding tasks.
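Few-shot multimodal in-context learning generally works by interleaving a handful of demonstration image-text pairs with a new query in a single prompt, so the model can continue the pattern without any fine-tuning. The following is a minimal sketch of how such an interleaved prompt might be assembled; the segment structure, field names, and file names are illustrative assumptions, not the actual Emu2 API:

```python
# Sketch: assembling an interleaved image-text prompt for few-shot
# multimodal in-context learning. The segment format below is an
# illustrative assumption, not the actual Emu2 interface.

def build_few_shot_prompt(examples, query_image, question):
    """Interleave (image, question, answer) demonstrations with a new query."""
    segments = []
    for image, q, answer in examples:
        segments.append({"type": "image", "data": image})
        segments.append({"type": "text", "data": f"Question: {q} Answer: {answer}"})
    # The final query carries no answer; the model is expected to
    # continue the demonstrated pattern and produce one.
    segments.append({"type": "image", "data": query_image})
    segments.append({"type": "text", "data": f"Question: {question} Answer:"})
    return segments

prompt = build_few_shot_prompt(
    examples=[("img_cat.png", "What animal is this?", "a cat"),
              ("img_dog.png", "What animal is this?", "a dog")],
    query_image="img_bird.png",
    question="What animal is this?",
)
print(len(prompt))  # → 6 (two segments per demonstration, plus two for the query)
```

The point of the interleaved layout is that the same autoregressive next-token objective used in pre-training serves at inference time: more demonstrations simply mean a longer prefix.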

Emu2 achieves state-of-the-art performance across benchmarks spanning few-shot understanding, visual question answering, and subject-driven image generation, including VQAv2, OKVQA, MSVD, MM-Vet, and TouchStone. This result undoubtedly brings new inspiration and breakthroughs to the multimodal research field.

【来源】https://mp.weixin.qq.com/s/Xf4xBzYwubVd8Lpw68ikDA
