News Title: “iFlytek’s Spark Speech Model Sets a New Benchmark in Intelligent Speech, Outperforming OpenAI’s Whisper in Multilingual Speech Synthesis”

Keywords: iFlytek, Spark Speech Model, Speech Leadership

News Content: Today, iFlytek unveiled its Spark Speech Model (星火语音大模型) at a launch event. The model is reported to outperform OpenAI’s Whisper-large-v3 in speech synthesis across 37 mainstream languages, reinforcing iFlytek’s lead in intelligent speech.

According to Fast Tech (快科技), the Spark Speech Model covers an initial batch of 37 languages and also improves quality. Compared with OpenAI’s model, Spark achieves an absolute gain of 0.25 in average mean opinion score (MOS) across these languages, for an overall score of 4.5. The report describes this as a new level of naturalness and similarity to human speech, with a human-likeness rating above 83%.
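To make the phrase "absolute gain in average MOS" concrete, here is a minimal Python sketch. It assumes the average is a plain unweighted mean over languages, which the report does not confirm. The per-language scores and the helper names `average_mos` and `absolute_gain` are hypothetical, invented for illustration so that the averages come out to the reported 4.5 and a gap of 0.25. They are not iFlytek's or OpenAI's data.

```python
from statistics import mean


def average_mos(scores_by_language):
    """Unweighted mean of per-language MOS values (each on the 1-5 scale)."""
    return mean(scores_by_language.values())


def absolute_gain(candidate, baseline):
    """Absolute gain: the plain difference of the two averages, not a percentage."""
    return average_mos(candidate) - average_mos(baseline)


# Hypothetical per-language scores, made up for this example only.
baseline_scores = {"en": 4.30, "fr": 4.20, "ja": 4.25}
candidate_scores = {"en": 4.55, "fr": 4.45, "ja": 4.50}

gain = absolute_gain(candidate_scores, baseline_scores)
print(f"average MOS: {average_mos(candidate_scores):.2f}, absolute gain: {gain:.2f}")
```

An absolute gain of 0.25 is a difference on the 1-5 scale itself. Relative to a baseline average of 4.25, it would be a gain of roughly 6%, so the two ways of reporting should not be confused.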

The result underscores iFlytek’s strength in multilingual speech synthesis and points to more natural, fluent cross-language communication for users worldwide. It is also expected to push AI further into language communication applications, opening new possibilities for global information sharing and cultural exchange.

The launch reinforces iFlytek’s standing as an industry leader committed to research and innovation in intelligent speech. Going forward, the Spark Speech Model is expected to make voice interaction worldwide smarter and more human-like.

Source: https://news.mydrivers.com/1/961/961266.htm
