Title: Meta releases audio2photoreal AI framework for generating realistic NPC dialogue scenes
Keywords: Meta, audio2photoreal, AI framework, NPC, dialogue scenes

News content:

Meta recently announced an AI framework named audio2photoreal that generates photorealistic NPC character models and automatically lip-syncs and animates them from existing voiceover files. Given a voiceover file, the framework first generates a set of NPC models, then uses quantization techniques and diffusion algorithms to generate motion for those models. The researchers say the framework produces high-quality motion samples at 30 FPS and can also reproduce the habitual gestures people make during conversation.
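To make the two-stage design concrete, here is a minimal sketch of what such a quantize-then-diffuse pipeline could look like. It is an assumption-laden illustration, not the real audio2photoreal API: QuantizedPosePrior, MotionDiffuser, and audio_to_motion are hypothetical names, and the "denoising" loop is a toy stand-in for a trained diffusion model; the actual implementation lives in the team's GitHub repository.

```python
import numpy as np
from dataclasses import dataclass

FPS = 30  # the researchers report motion samples at 30 FPS


@dataclass
class QuantizedPosePrior:
    """Stage 1 (assumed): map per-frame audio features to coarse guide
    poses drawn from a learned codebook, standing in for the article's
    quantization step."""

    codebook: np.ndarray  # (n_codes, dim), shared space with audio features

    def sample_codes(self, audio_feats: np.ndarray) -> np.ndarray:
        # A real system would predict code indices with a learned network;
        # here we simply take the nearest codebook entry per audio frame.
        dists = np.linalg.norm(
            audio_feats[:, None, :] - self.codebook[None, :, :], axis=-1
        )
        return self.codebook[dists.argmin(axis=1)]


@dataclass
class MotionDiffuser:
    """Stage 2 (assumed): a diffusion-style sampler that refines noise
    into motion, conditioned on the coarse guide poses."""

    steps: int = 50

    def generate(self, guide_poses: np.ndarray) -> np.ndarray:
        motion = np.random.randn(*guide_poses.shape)  # start from pure noise
        for _ in range(self.steps):
            # Real diffusion predicts and removes noise at each step; this
            # toy update just pulls the sample toward the guide poses.
            motion += 0.1 * (guide_poses - motion)
        return motion  # (n_frames, dim), one row per 1/30 s of audio


def audio_to_motion(audio_feats, prior, diffuser):
    guide = prior.sample_codes(audio_feats)
    return diffuser.generate(guide)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prior = QuantizedPosePrior(codebook=rng.standard_normal((256, 16)))
    feats = rng.standard_normal((FPS * 5, 16))  # ~5 seconds of audio features
    motion = audio_to_motion(feats, prior, MotionDiffuser())
    print(motion.shape)  # (150, 16): 30 FPS motion for 5 seconds
```

Splitting the problem this way lets a discrete codebook pin down plausible coarse poses while a diffusion stage fills in expressive detail, which matches the division of labor the article describes.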

The researchers said that in a controlled experiment, 43% of evaluators were "strongly satisfied" with the dialogue scenes the framework generated. On that basis, they argue that audio2photoreal produces more "dynamic and expressive" motion than competing systems. The team has released the code and dataset publicly on GitHub for anyone interested.

Source: https://www.ithome.com/0/744/255.htm
