UC Berkeley Launches Open-Source World Model LargeWorldModel

Author: 智能小编

February 22, 2024 #Multimodal Information Processing, #Open-Source World Model, #Daily AI News

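The University of California, Berkeley (UC Berkeley) recently released LargeWorldModel (LWM), an open-source world model that can process contexts of up to one million tokens and can generate video. The project currently sits at the top of GitHub's trending list and is the latest open-source world model.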

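LWM's context window is reported to reach 1 million tokens, on par with Gemini 1.5, which Google released at the same time. The name is blunt: simply "Large World Model," with no embellishment. The model's main highlight is its support for multimodal information: it can accurately locate a target passage within 1 million tokens and can take in a one-hour video in a single pass.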

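The model's release marks a major breakthrough for AI in handling large-scale context. Earlier models of this kind were mostly used on shorter text, and LWM is clearly at the forefront when it comes to processing text at large scale. Because the model is open source, it is also expected to drive further innovation across the global AI research community.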

English title: UC Berkeley Unveils Open-Source World Model LargeWorldModel
English keywords: Open-source world model, Multimodal information processing, LargeWorldModel

English news content:
The University of California, Berkeley (UC Berkeley) recently launched a new open-source world model called LargeWorldModel (LWM), which supports million-token contexts and can generate videos. The model currently tops the GitHub trending list and is the latest open-source world model.

According to reports, LWM's context window reaches 1 million tokens, on par with Google's Gemini 1.5, which was released at the same time. The model is named bluntly as "LargeWorldModel," without any embellishment. The biggest highlight of LWM is its support for multimodal information processing: it can accurately find a target text within 1 million tokens and can watch a one-hour video in a single pass.

The emergence of this model marks a significant breakthrough in artificial intelligence technology for handling large-scale context information. Previously, such models were mostly used for processing shorter text, and in handling large-scale text LWM is clearly at the forefront. The open-source nature of the model is also expected to promote further innovation in the global artificial intelligence research community.

[Source] https://mp.weixin.qq.com/s/52uUGcgcoT6oGhZvi-Dl-w