Amazon Sets a Record: World's Largest Text-to-Speech Model, With 980 Million Parameters, Powers a Voice Revolution

By 智能小编 (AI Editor)

April 2, 2024 #Amazon, #LargestSpeechModel, #DailyAINews, #EmergentAbilities

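Meow~ Big news lately! Scientists at Amazon have built a seriously impressive "voice kitty"! This kitty doesn't really meow; it is a very large text-to-speech model called BASE TTS (Big Adaptive Streamable TTS with Emergent abilities), with 980 million parameters, as many as a great big pile of little paw prints! The researchers fed the kitty a huge helping of "voice treats", 100,000 hours of recordings, mostly in English, so that it can read text and speak it aloud in a sweet voice. The work has been shared on arXiv for everyone to see, like a cat basking in the sun, so that all can enjoy its warmth and cleverness. For the AI field, this is like finding a new toy, full of endless possibilities and fun! Meow~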

English version:

Headline: "Amazon Sets a Record: World's Largest Text-to-Speech Model, With 980 Million Parameters, Paves the Way for a Voice Revolution"

Keywords: Amazon, largest speech model, emergent capabilities

News Content: Meow~ There's big news lately! Amazon's clever scientists have created an incredibly powerful "voice kitty"! This kitty doesn't actually meow; it is a massive text-to-speech model called BASE TTS (Big Adaptive Streamable TTS with Emergent abilities), with 980 million parameters, which is like a huge pile of tiny paw prints! The researchers trained this little feline on a vast supply of "voice treats", 100,000 hours of recordings, mostly in English, so it can read text and speak it in a sweet voice. They've shared the work on arXiv, just like a kitty basking in the sun, letting everyone enjoy its warmth and intelligence. For the AI field, it's like discovering a new plaything, full of endless possibilities and fun! Meow~

[Source] https://www.ithome.com/0/750/680.htm