Title: Amazon unveils the largest text-to-speech model in history
Keywords: Amazon, text to speech, artificial intelligence
News content: Amazon's artificial intelligence research team recently announced BASE TTS (Big Adaptive Streamable TTS with Emergent abilities), a text-to-speech system that is the largest text-to-speech model to date by both parameter count and training data size. The model has 980 million parameters and was trained on 100,000 hours of English speech recordings. The researchers published a paper describing the model on the arXiv preprint server.
The work extends Amazon's efforts in speech synthesis, a technology the company already uses in products such as its Alexa voice assistant. The new model offers higher quality and better scalability, and it is expected to bring more realistic speech to products such as Alexa.
Source: https://www.ithome.com/0/750/680.htm