**News Title:** “Amazon Breaks Records: Creating the World’s Largest Text-to-Speech Model, Demonstrating New Heights in Artificial Intelligence”
**Keywords:** Amazon, largest text-to-speech model, emergent abilities
**News Content:**
**Amazon Pioneers a New Era in Text-to-Speech Technology with Unprecedented ‘Emergent Abilities’ in Largest Model to Date**
Global e-commerce giant Amazon has reached another milestone in artificial intelligence: its research team recently announced what it describes as the largest text-to-speech (TTS) model to date. The model, BASE TTS (short for “Big Adaptive Streamable TTS with Emergent abilities”), has 980 million parameters and was trained on roughly 100,000 hours of recorded speech, most of it English, sourced from public websites.
According to IT Home, the work points to “emergent abilities” in speech synthesis: once a model is large enough and has been trained on enough data, it begins to show capabilities that smaller models lack. The researchers have published a paper on the arXiv preprint server describing how the model was developed and trained.
BASE TTS could have far-reaching effects on human-computer interaction, audiobooks, and voice assistants. More natural and fluent synthesized speech would improve the user experience and further blur the line between AI-generated and human speech. The work also underscores Amazon’s continued investment in AI and its position as a leader in technical innovation.
Source: https://www.ithome.com/0/750/680.htm