News Title: **"Qwen1.5-110B: Qwen1.5's First 100-Billion-Parameter Model Goes Open Source, Rivaling Meta-Llama3-70B"**
Keywords: 110-billion-parameter model, Qwen1.5-110B, multilingual support
News Content: The Tongyi Qianwen (Qwen) team recently announced the open-source release of Qwen1.5-110B, the first model in the Qwen1.5 series to reach the 100-billion-parameter scale, with 110 billion parameters. In base-capability evaluations it is comparable to Meta-Llama3-70B, and it also performs strongly in chat evaluations, including MT-Bench and AlpacaEval 2.0.
Like the other Qwen1.5 models, Qwen1.5-110B uses the same Transformer decoder architecture. What sets it apart is grouped-query attention (GQA), which makes inference more efficient. The model also supports a context length of 32K tokens, allowing it to process very long texts and giving it stronger long-range understanding.
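As a rough illustration of these two points, the sketch below inspects the published model configuration with Hugging Face `transformers` to confirm GQA (fewer key/value heads than query heads) and the 32K-token context window. The Hub ID `Qwen/Qwen1.5-110B` and the specific config fields are assumptions based on the standard Qwen1.5 release, not details stated in this article.

```python
# Minimal sketch: inspect the Qwen1.5-110B configuration for GQA and the
# 32K-token context window. Assumes transformers >= 4.37 and that the model
# is published on the Hugging Face Hub as "Qwen/Qwen1.5-110B" (an assumption).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/Qwen1.5-110B")

# With grouped-query attention, several query heads share one key/value head,
# so num_key_value_heads is smaller than num_attention_heads.
print("query heads:    ", config.num_attention_heads)
print("key/value heads:", config.num_key_value_heads)
print("uses GQA:       ", config.num_key_value_heads < config.num_attention_heads)

# The maximum position embeddings should correspond to the 32K-token context.
print("max context:    ", config.max_position_embeddings)
```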
Notably, Qwen1.5-110B is not a single-language model: it is multilingual, supporting English, Chinese, French, Spanish, German, Russian, Japanese, Korean, Vietnamese, Arabic, and more. This gives it a much broader range of applications worldwide; for example, it can power multilingual search engines, translation tools, and intelligent customer-service systems.
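To make the multilingual point concrete, here is a minimal generation sketch that sends prompts in several languages to the instruction-tuned variant. The Hub ID `Qwen/Qwen1.5-110B-Chat`, the prompts, and the generation settings are illustrative assumptions; actually running a 110-billion-parameter model requires multi-GPU hardware.

```python
# Minimal sketch of multilingual chat generation with the instruction-tuned
# variant. The model ID and settings are assumptions, not taken from the article.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-110B-Chat"  # assumed Hub ID for the chat model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The same model handles prompts in different supported languages.
prompts = [
    "Summarize the benefits of grouped-query attention in one sentence.",  # English
    "用一句话介绍Qwen1.5-110B。",                                          # Chinese
    "Résume en une phrase ce qu'est un modèle de langage multilingue.",    # French
]

for prompt in prompts:
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=128)
    # Strip the prompt tokens and decode only the newly generated text.
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```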
The Qwen team's open-source release will undoubtedly provide strong support to both academia and industry. On one hand, researchers can use the model for a wide range of natural-language-processing studies; on the other, developers can build more innovative applications on top of it.
The launch of Qwen1.5-110B demonstrates China's technical strength in natural language processing, and further breakthroughs in this field can be expected in the near future.
[Source] https://qwenlm.github.io/zh/blog/qwen1.5-110b/