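Microsoft Upgrades Its AI Speech Service: Text Becomes Video, and Avatars Speak

## Microsoft Azure AI Speech Service Launches Avatars with Text-to-Video Support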

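**IT Home, August 23:** Microsoft's Azure AI Speech service recently launched a feature called "Text to Speech Avatar." It lets developers turn plain text into video of a human figure speaking naturally, and lets them create custom avatars for their users.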

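The feature is part of Azure AI Speech, a service that developers use to build multilingual generative AI voice applications. The service could already convert text to speech; the avatar feature goes a step further by pairing that speech with video, giving users a more lifelike experience.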

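According to the report, Text to Speech Avatar outputs video at 1920 x 1080 resolution and 25 frames per second, and offers a selection of prebuilt avatars. Users pick an avatar and assign it a voice through Azure AI text to speech. The feature supports asynchronous synthesis of avatar video through a batch synthesis API, as well as real-time synthesis.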
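
To put the output format in concrete terms, here is a minimal sketch that computes how many frames a clip of a given length contains at the reported 1920 x 1080, 25 fps format. The `VideoSpec` class and its method names are illustrative only and are not part of any Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoSpec:
    """Output format reported in the article (hypothetical helper, not an Azure type)."""
    width: int = 1920
    height: int = 1080
    fps: int = 25

    def frame_count(self, duration_seconds: float) -> int:
        """Number of frames in a clip of the given length, rounded to the nearest frame."""
        if duration_seconds < 0:
            raise ValueError("duration_seconds must be non-negative")
        return round(duration_seconds * self.fps)

    def pixels_per_second(self) -> int:
        """Raw (uncompressed) pixels rendered per second of video."""
        return self.width * self.height * self.fps


spec = VideoSpec()
print(spec.frame_count(60))      # 1500 frames in a one-minute clip
print(spec.pixels_per_second())  # 51840000
```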

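To make video creation easier for developers, Microsoft also offers Speech Studio, a tool for producing avatar videos without writing code. Speech Studio additionally includes a live chat avatar tool for real-time conversations with an avatar.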

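Text to Speech Avatar is currently available in the Southeast Asia, North Europe, West Europe, Sweden Central, South Central US, and West US regions. The service is billed per second, based on the length of the video output.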
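
As a rough illustration of per-second billing, the sketch below estimates the charge for a clip. The article does not state a price, so `HYPOTHETICAL_RATE` is a placeholder and not Microsoft's actual rate. Rounding partial seconds up is also an assumption; the real rounding rule may differ.

```python
import math
from decimal import Decimal


def estimate_cost(video_seconds: float, rate_per_second: Decimal) -> Decimal:
    """Estimate the charge for one output video under per-second billing.

    Assumption: a partial second is billed as a full second.
    """
    if video_seconds < 0:
        raise ValueError("video_seconds must be non-negative")
    billable_seconds = math.ceil(video_seconds)
    return rate_per_second * billable_seconds


# Placeholder rate for illustration only; check the official pricing page.
HYPOTHETICAL_RATE = Decimal("0.01")

print(estimate_cost(90.4, HYPOTHETICAL_RATE))  # 91 billable seconds -> 0.91
```

`Decimal` is used instead of `float` so that currency amounts do not pick up binary rounding error.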

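The launch marks a notable step for Microsoft in AI speech services: developers get a more capable tool, and users get a richer experience. As AI technology matures, avatars are likely to become more realistic and to find wider use.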

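**Potential use cases for Text to Speech Avatar include:**

* **Education:** virtual teachers that give students a personalized learning experience.
* **Entertainment:** virtual hosts that make live-streamed content more engaging.
* **Customer service:** virtual agents that make support more convenient for users.
* **Marketing:** virtual brand ambassadors that make brand promotion more appealing.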

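By adding avatars to Azure AI Speech, Microsoft opens up new possibilities for applied AI. More applications built on AI speech services can be expected to follow, bringing added convenience and enjoyment to everyday life.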

Source: https://www.ithome.com/0/790/521.htm