Title: Meta releases audio2photoreal AI framework for generating realistic NPC dialogue scenes
Keywords: Meta, audio2photoreal, AI framework, NPC, dialogue scenes
News content:
Recently, Meta announced an AI framework named audio2photoreal that can generate a series of photorealistic NPC character models and automatically lip-sync and animate them from existing voiceover files. Given a voiceover file, the Audio2photoreal framework first generates a set of NPC models, then uses quantization techniques and diffusion algorithms to generate motion for those models. The researchers say the framework can produce high-quality motion samples at 30 FPS and can also reproduce the habitual gestures people make in conversation.
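The two-stage pipeline described above (quantize audio features, then use a diffusion process to produce per-frame motion) can be sketched roughly as follows. This is a minimal toy illustration, not Meta's implementation: the codebook, feature dimensions, pose dimension, and the simplified denoising loop are all hypothetical stand-ins for a trained model.

```python
import numpy as np

FPS = 30  # the framework reportedly outputs motion at 30 frames per second


def quantize(features, codebook):
    """Stage 1 (sketch): snap each per-frame audio feature vector to its
    nearest codebook entry, mimicking a vector-quantization step."""
    # pairwise distances: (frames, codes)
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=-1)
    return codebook[d.argmin(axis=1)]


def denoise(cond, steps=10, pose_dim=6, seed=0):
    """Stage 2 (sketch): a toy diffusion-style loop that iteratively refines
    random noise into a pose sequence conditioned on `cond`. A real model
    would use a trained network to predict the noise at each step."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((cond.shape[0], pose_dim))          # start from noise
    target = np.tanh(cond @ rng.standard_normal((cond.shape[1], pose_dim)))
    for t in range(steps, 0, -1):
        x = x + (target - x) / t  # move a fraction of the way toward the conditional mean
    return x


# one second of stand-in "audio" at 30 FPS, 8-dim features per frame
rng = np.random.default_rng(1)
audio = rng.standard_normal((FPS, 8))
codebook = rng.standard_normal((32, 8))

motion = denoise(quantize(audio, codebook))
print(motion.shape)  # (30, 6): one pose vector per video frame
```

The point of the sketch is the data flow: conditioning information derived from audio drives an iterative denoising process that emits one pose vector per output frame.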
The researchers report that in a controlled experiment, 43% of evaluators were "strongly satisfied" with the dialogue scenes the framework generated. On that basis, they argue that Audio2photoreal produces more "dynamic and expressive" motion than competing systems. The team has released the relevant code and dataset on GitHub for anyone interested.
Source: https://www.ithome.com/0/744/255.htm