Title: Li Feifei’s Team Collaborates with Google to Launch Realistic Video Generation Model W.A.L.T

Keywords: W.A.L.T model, Google collaboration, video generation

Recently, AI scientist Li Feifei’s team partnered with Google to launch W.A.L.T (Window Attention Latent Transformer), a diffusion-based video generation model that trains image and video generation in a shared latent space. The news has drawn wide attention across the industry.
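As a rough illustration of what training images and videos in one latent space can mean, the sketch below applies the standard forward diffusion noising step to latents stored as lists of frames. Treating an image as a one-frame video, the function name `add_noise`, and the toy latent values are all assumptions made for illustration; this is not the W.A.L.T implementation.

```python
import math
import random

def add_noise(latent, alpha_bar, rng):
    """Forward diffusion step on a latent given as a list of frames.

    Each frame is a list of floats. alpha_bar in [0, 1] is the cumulative
    signal fraction: 1 keeps the latent unchanged, 0 gives pure noise.
    """
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [[signal * x + noise * rng.gauss(0.0, 1.0) for x in frame]
            for frame in latent]

rng = random.Random(0)

# Assumption for illustration: an image is a video with a single frame,
# so both kinds of data share one latent layout and one noising routine.
image_latent = [[0.5, -0.2, 0.1]]                    # one frame
video_latent = [[0.5, -0.2, 0.1], [0.4, -0.1, 0.0]]  # two frames

noisy_image = add_noise(image_latent, alpha_bar=0.9, rng=rng)
noisy_video = add_noise(video_latent, alpha_bar=0.9, rng=rng)
```

Because both inputs share the same layout, a single denoising network could in principle be trained on a mix of the two, which is the general idea behind a shared latent space.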

According to reports, W.A.L.T is a diffusion model built on the Transformer architecture that performs generative modeling of both images and videos. The model uses a self-attention mechanism to capture global information in the input data and convert it into a latent vector representation. It also uses a technique called “Latent Transformer” to further improve generation quality.
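The self-attention mechanism described above can be sketched in a few lines of plain Python. This is a minimal illustration with identity projections (queries, keys, and values are all the input tokens), not the model's actual code. The `window` parameter is a hypothetical addition suggested by the "Window Attention" in the model's name: when set, each token attends only to the tokens in its own block rather than to the whole sequence.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens, window=None):
    """Scaled dot-product self-attention with identity projections.

    tokens: list of equal-length float vectors (one per latent token).
    window: hypothetical parameter; if set, tokens are split into
            consecutive blocks of this size and attend only within a block.
            If None, every token attends to every token (global attention).
    """
    n, dim = len(tokens), len(tokens[0])
    size = window or n
    output = []
    for i, query in enumerate(tokens):
        start = (i // size) * size  # index of the first token in this block
        block = tokens[start:min(start + size, n)]
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in block]
        weights = softmax(scores)
        # Weighted average of the block's vectors (values = tokens here).
        output.append([sum(w * v[d] for w, v in zip(weights, block))
                       for d in range(dim)])
    return output

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
global_out = self_attention(tokens)            # every token sees all four
local_out = self_attention(tokens, window=2)   # tokens see only their pair
```

Restricting attention to blocks reduces the number of token pairs that must be compared, which matters for video because the token count grows with every added frame.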

According to Li Feifei, the advantage of the W.A.L.T model lies in its ability to generate more realistic videos. Compared with traditional video generation methods, W.A.L.T not only preserves video detail better but also produces more natural motion across different scenes. In addition, generation can be customized to user needs to suit different application scenarios.

Regarding the collaboration, Li Feifei said, “We are very pleased to work with Google to launch the W.A.L.T model. This model will have a profound impact on the field of video generation and bring people a richer visual experience.”

According to the report, the W.A.L.T model is already in wide use worldwide. As the technology continues to develop and mature, the model is expected to show its potential in more fields.

Source: https://new.qq.com/rain/a/20231212A04PMP00