News Title: “ChatGPT Introduces New Feature: Advanced Voice Mode Officially Launched”
Keywords: Voice Mode, OpenAI, Artificial Intelligence
News Content:
As artificial intelligence technology continues to advance, OpenAI recently announced the official launch of Advanced Voice Mode for ChatGPT, initially available to a small group of ChatGPT Plus users. The feature gives users their first chance to hear GPT-4o's hyper-realistic audio responses, offering a more intuitive and personalized interactive experience.
At this year's spring launch event, OpenAI unveiled its new flagship generative model, GPT-4o, along with a desktop app, demonstrating its latest voice-interaction capabilities. Eligible users will see an alert in the ChatGPT app and receive an email with instructions on how to use the feature. OpenAI says it plans to roll the feature out to all Plus users gradually through the fall of 2024.
Some users have already shared their experiences with Advanced Voice Mode, noting that ChatGPT can not only laugh along with them but also improvise background music while telling stories, and that it supports multiple languages, including French, Spanish, and Urdu. Its Chinese, however, still sounds unidiomatic, at times like a foreigner who is just learning the language.
OpenAI stated that Advanced Voice Mode differs from the voice mode ChatGPT currently offers. GPT-4o is a multimodal model that handles speech and text natively on its own, without separate transcription and synthesis steps, which reduces conversational latency. GPT-4o can also perceive the emotional tone in a user's voice, such as sadness or excitement.
Despite the technical sophistication of Advanced Voice Mode, OpenAI has also faced controversy. Scarlett Johansson said she had declined OpenAI CEO Sam Altman's request to use her voice, and after seeing the GPT-4o demonstration she hired legal counsel to protect it. OpenAI denied using Johansson's voice and removed the voice featured in the demonstration.
In June, OpenAI announced it would postpone the release of Advanced Voice Mode to strengthen its safety measures. After a long wait, the feature has finally debuted. OpenAI stated that this release limits ChatGPT to four preset voices created in collaboration with paid voice actors: Juniper, Breeze, Cove, and Ember. OpenAI spokesperson Lindsay McCallum said that ChatGPT cannot impersonate other people's voices, including those of private individuals and public figures, and will block any output that deviates from these preset voices in order to avoid deepfake controversies.
With the launch of Advanced Voice Mode, ChatGPT is moving step by step toward more natural, lifelike interaction, offering its users a richer and more personalized service.
Source: https://www.jiqizhixin.com/articles/2024-07-31-10