Latest News

Body text (Chinese original):

近日,阿里巴巴研究院发布了一款名为「轨迹可控版Sora」的新视频生成技术,该技术颠覆了传统的视频生成方式,告别了以往的「抽卡」模式,使得视频生成更加符合物理规律。用户只需指定路线,新系统「Tora」便能生成相应轨迹的视频,这一突破性进展在人工智能视频生成领域引起了广泛关注。

「轨迹可控版Sora」采用了先进的Diffusion Transformer(DiT)架构,能够生成从10秒到60秒的高质量视频,并且能够适应不同的分辨率、纵横比,同时还能模拟物理世界的运动。与传统的U-Net架构相比,Sora能够生成更长时长的视频,并且不受固定分辨率和纵横比的限制。

为了进一步提升视频生成的可控性和保真度,阿里巴巴的研究者提出了「Tora」这一面向轨迹的DiT架构。「Tora」将文本、视觉和轨迹条件同时集成在一起,能够精确控制视频内容的不同持续时间、宽高比和分辨率。通过大量实验证明,「Tora」在实现高运动保真度方面表现出色,同时还能细致模拟物理世界的运动。

「Tora」技术的一个显著特点是它能够处理可变持续时间的视频,这对于那些需要灵活控制视频长度的应用场景来说是一个巨大的进步。此外,「Tora」还采用了轨迹提取器和运动引导融合器,这两个组件使得「Tora」能够更加精准地控制视频中的物体运动轨迹,避免了传统方法中常见的物体变形问题。

「轨迹可控版Sora」和「Tora」的发布,不仅展示了阿里巴巴在人工智能领域的创新实力,也为视频内容创作提供了新的工具和可能性。随着技术的不断进步,未来我们有望看到更多基于「轨迹可控版Sora」和「Tora」的视频作品,为用户带来更加丰富多彩的视觉体验。

English version:

Title: “Alibaba Breaks New Ground with Trajectory-Controllable Video Generation Technology”

Keywords: Sora, Tora, DiT

News Content:

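Recently, the Alibaba Research Institute released a new video generation technology dubbed “Trajectory-Controllable Sora.” It departs from the conventional approach to video generation, moving away from the earlier “gacha”-style (luck-of-the-draw) mode, and produces videos that better conform to physical laws. Users only need to specify a path, and the new system, “Tora,” generates a video that follows that trajectory. The advance has drawn wide attention in the field of AI video generation.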

“Trajectory-Controllable Sora” uses the Diffusion Transformer (DiT) architecture and can generate high-quality videos 10 to 60 seconds long, adapt to different resolutions and aspect ratios, and simulate motion in the physical world. Compared with models built on the traditional U-Net architecture, Sora can generate longer videos and is not restricted to a fixed resolution or aspect ratio.

To further improve the controllability and fidelity of video generation, researchers at Alibaba proposed “Tora,” a trajectory-oriented DiT architecture. “Tora” integrates text, visual, and trajectory conditions at the same time, allowing precise control of video content across different durations, aspect ratios, and resolutions. According to the researchers' extensive experiments, “Tora” achieves high motion fidelity while also simulating physical-world motion in fine detail.

A notable feature of “Tora” is its ability to handle videos of variable duration, a significant step forward for applications that need flexible control over video length. In addition, “Tora” uses a trajectory extractor and a motion-guidance fuser. Together, these two components let it control the motion trajectories of objects in a video more precisely and avoid the object deformation that is common in traditional methods.
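
To make the idea of a user-specified path concrete, the sketch below shows one way a handful of waypoints could be expanded into one position per frame, so the same path can drive clips of different lengths. It is an illustration only and not Tora's implementation: the function names `resample_trajectory` and `frame_displacements`, the pixel-coordinate convention, and the linear interpolation (uniform per segment rather than by arc length) are all assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # hypothetical (x, y) position in pixels


def resample_trajectory(waypoints: List[Point], num_frames: int) -> List[Point]:
    """Linearly interpolate user waypoints into one (x, y) point per frame.

    Assumption for this sketch: each segment between consecutive waypoints
    gets an equal share of the frames, regardless of its length.
    """
    if len(waypoints) < 2 or num_frames < 2:
        raise ValueError("need at least two waypoints and two frames")
    segments = len(waypoints) - 1
    points: List[Point] = []
    for frame in range(num_frames):
        # Position along the whole path, measured in segments.
        t = frame / (num_frames - 1) * segments
        i = min(int(t), segments - 1)  # clamp so the last frame uses the last segment
        frac = t - i
        (x0, y0), (x1, y1) = waypoints[i], waypoints[i + 1]
        points.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return points


def frame_displacements(points: List[Point]) -> List[Point]:
    """Per-frame motion (dx, dy) between consecutive trajectory points."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]


if __name__ == "__main__":
    path = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
    per_frame = resample_trajectory(path, num_frames=5)
    print(per_frame)
    print(frame_displacements(per_frame))
```

Changing `num_frames` stretches or compresses the same path over a longer or shorter clip, which is the kind of variable-duration control the paragraph above describes. How the real system encodes such a path and feeds it into the model is not covered by this article.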

The release of “Trajectory-Controllable Sora” and “Tora” not only showcases Alibaba’s innovative strength in the field of artificial intelligence but also provides new tools and possibilities for video content creation. As technology continues to advance, we can look forward to seeing more video works based on “Trajectory-Controllable Sora” and “Tora,” offering users a richer and more diverse visual experience.

Source: https://www.jiqizhixin.com/articles/2024-08-04-8
