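Recently, the Tongyi Qianwen (Qwen) team, a well-known Chinese AI research group, released Qwen1.5-110B, the first open-source model in the Qwen1.5 series with more than 100 billion parameters. The release marks a new breakthrough for China in large-scale pre-trained models. Compared with Meta-Llama3-70B, which sits at the international state of the art, Qwen1.5-110B shows solid strength in both base capabilities and conversational performance, and its results on the chat evaluations MT-Bench and AlpacaEval 2.0 are particularly impressive.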

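Qwen1.5-110B keeps the Transformer decoder architecture used across the Qwen1.5 series, a design that keeps the model efficient on complex language tasks. Notably, the new model adopts grouped query attention (GQA), which significantly improves efficiency at inference time and speeds up the processing of large volumes of text.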
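
As a rough illustration of the grouped query attention mentioned above, the following pure-Python sketch shows the core idea: several query heads share one key/value head, so fewer key/value tensors have to be stored and read during inference. The function names, the nested-list layout, and the toy sizes are assumptions made for this example; this is not Qwen's implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def grouped_query_attention(q, k, v):
    """Causal grouped query attention over nested lists.

    q:    [n_q_heads][seq_len][head_dim]
    k, v: [n_kv_heads][seq_len][head_dim]
    Returns a list shaped like q.
    """
    n_q_heads, n_kv_heads = len(q), len(k)
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    head_dim = len(q[0][0])
    scale = 1.0 / math.sqrt(head_dim)

    out = []
    for h in range(n_q_heads):
        kv = h // group_size  # query heads in one group share this K/V head
        head_out = []
        for t, q_vec in enumerate(q[h]):
            # Causal masking: position t only attends to positions 0..t.
            scores = [
                scale * sum(a * b for a, b in zip(q_vec, k[kv][s]))
                for s in range(t + 1)
            ]
            weights = softmax(scores)
            head_out.append([
                sum(w * v[kv][s][d] for s, w in enumerate(weights))
                for d in range(head_dim)
            ])
        out.append(head_out)
    return out


def random_heads(n_heads, seq_len, head_dim):
    """Toy random input, for demonstration only."""
    return [[[random.gauss(0.0, 1.0) for _ in range(head_dim)]
             for _ in range(seq_len)]
            for _ in range(n_heads)]


random.seed(0)
q = random_heads(4, 3, 2)  # 4 query heads
k = random_heads(2, 3, 2)  # only 2 key heads
v = random_heads(2, 3, 2)  # only 2 value heads
out = grouped_query_attention(q, k, v)
print(len(out), len(out[0]), len(out[0][0]))  # 4 3 2
```

With 4 query heads and 2 key/value heads, each key/value head serves two query heads, so the key/value cache is half the size it would be with one key/value head per query head.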

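Multilingual support is another highlight: Qwen1.5-110B covers English, Chinese, French, Spanish, German, Russian, Japanese, Korean, Vietnamese, Arabic, and other languages, which greatly broadens its range of applications and serves users worldwide. The model also supports a context length of up to 32K tokens, so it can handle long texts comfortably.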
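
To give a sense of why GQA matters at a 32K-token context, the sketch below estimates the size of the key/value cache for one sequence. The layer count, head counts, head dimension, and 16-bit storage are hypothetical values chosen for illustration, not the published Qwen1.5-110B configuration, and "32K" is taken to mean 32 × 1024 tokens.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence.

    The leading factor 2 counts the two cached tensors (K and V) per layer.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, for illustration only.
N_LAYERS = 80
N_QUERY_HEADS = 64
N_KV_HEADS = 8           # GQA: 8 query heads share each K/V head
HEAD_DIM = 128
CONTEXT_LEN = 32 * 1024  # assumed meaning of "32K tokens"

# Without GQA, every query head would have its own K/V head.
full_cache = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, CONTEXT_LEN)
gqa_cache = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT_LEN)

print(f"one K/V head per query head: {full_cache / 2**30:.1f} GiB")  # 80.0 GiB
print(f"grouped K/V heads:           {gqa_cache / 2**30:.1f} GiB")   # 10.0 GiB
```

Under these assumed numbers the cache shrinks by the ratio of query heads to key/value heads (8x here), which is the main way GQA lowers memory use and bandwidth for long-context inference.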

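The release of this open-source model gives researchers and developers a powerful tool and should further drive innovation and adoption in natural language processing, with potentially far-reaching effects in education, media, enterprise services, and other fields. The move once again demonstrates China’s R&D strength in artificial intelligence and its spirit of openness and sharing.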

English version:

**News Title:** “Qwen1.5-110B: Tongyi Qianwen Releases Its First Open-Source 100-Billion-Parameter Model, Setting New Records in Multilingual Capabilities”

**Keywords:** Tongyi Qianwen, Qwen1.5-110B, 100 Billion Parameters

**News Content:** Recently, the well-known Chinese AI research team Tongyi Qianwen (Qwen) unveiled Qwen1.5-110B, the first open-source model in the Qwen1.5 series with more than 100 billion parameters. The launch marks a new milestone for China in large-scale pre-trained models. Compared with Meta’s state-of-the-art Llama3-70B, Qwen1.5-110B performs strongly in both base capabilities and conversational interaction, particularly on the chat evaluations MT-Bench and AlpacaEval 2.0.

Qwen1.5-110B retains the Transformer decoder architecture of the Qwen1.5 series, which keeps it efficient on complex language tasks. A notable addition is grouped query attention (GQA), which significantly improves the model’s efficiency at inference time and speeds up the processing of large volumes of text.

Furthermore, multilingual support is a standout feature: the model covers English, Chinese, French, Spanish, German, Russian, Japanese, Korean, Vietnamese, and Arabic, among other languages, broadening its application scope and serving users worldwide. With support for a context length of up to 32K tokens, Qwen1.5-110B handles long texts with ease.

The release of this open-source model gives researchers and developers a powerful tool and should further drive innovation and adoption in natural language processing. It is expected to have a profound impact on sectors such as education, media, and enterprise services. This move by the Tongyi Qianwen team underscores China’s strength in AI research and development and its commitment to openness and sharing.

Source: https://qwenlm.github.io/zh/blog/qwen1.5-110b/
