**News Title:** "iFlytek Launches Spark Speech Large Model, Outperforming OpenAI in Multilingual Performance"

**Keywords:** iFlytek, Spark Model, Speech Leadership

**News Content:**

This afternoon, iFlytek (科大讯飞), a leading Chinese company in intelligent speech technology, unveiled its new Spark speech large model at a launch event, which the company presents as another major step forward in speech technology. According to Kuaitech, the model outperforms OpenAI's Whisper-large-v3 across a first batch of 37 mainstream languages, strengthening iFlytek's claim to an internationally leading position in intelligent speech technology.

The model also features multilingual speech synthesis, with gains in both language coverage and output quality. According to official figures, its average mean opinion score (MOS) across a first batch of 40 languages rose by an absolute 0.25 points to an overall 4.5, indicating more natural, higher-quality synthesized speech. Its human-likeness rating exceeded 83%, meaning listeners judged the generated speech to be very close to a real human voice, which should make voice interaction feel more authentic and natural.

The release underscores iFlytek's capacity for innovation in AI speech technology and points to wider room for multilingual speech applications worldwide. The Spark speech large model is expected to promote the adoption of intelligent speech technology in education, entertainment, business, and other fields, bringing users smarter and more convenient voice interaction.

Source: https://news.mydrivers.com/1/961/961266.htm
