Headline: NVIDIA's ChatQA 2.0 Expands Context Window to 128K, Taking On Proprietary Large Models
Keywords: NVIDIA, Dialogue Model, Context Window
News Content: NVIDIA, a global leader in computing platforms and artificial intelligence, recently announced that its dialogue model ChatQA has evolved to version 2.0. The core improvement in this new version is the expansion of the context window to 128K tokens, marking NVIDIA's continued advancement in the field of open large language models (LLMs). ChatQA 2.0 not only enhances the model's ability to handle long text sequences but also opens up more possibilities for research and applications in the AI community.
In the current AI landscape, open large language models such as Llama-3-70B-Instruct, Qwen2-72B-Instruct, Nemotron-4-340B-Instruct, and Mixtral-8x22B-Instruct-v0.1 demonstrate excellent performance, but they still lag noticeably behind proprietary large models such as GPT-4-Turbo in several domains. Proprietary models, with their vast context windows (such as 128K or 200K tokens), excel at handling longer and more complex text sequences, providing users with richer and more accurate responses.
NVIDIA's ChatQA series of models, as a contribution to the AI community, has already surpassed GPT-4 in conversational question answering by integrating retrieval-augmented generation (RAG). With the release of ChatQA 2.0, the context window has been expanded to 128K tokens, further strengthening the model's ability to process long text sequences and positioning it to compete with proprietary large models such as GPT-4-Turbo. This improvement not only reflects NVIDIA's innovation in large-model technology but also provides a new reference point for the development of open models.
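The RAG recipe referenced here follows a standard pattern: embed the user question, retrieve the most relevant document chunks, and prepend them to the model's prompt. Below is a minimal Python sketch of that pattern, where embed_fn and generate_fn are caller-supplied stand-ins for an embedding model and a dialogue model; it illustrates the general technique, not NVIDIA's actual ChatQA pipeline.

import numpy as np

def retrieve_top_k(query_vec, chunk_vecs, chunks, k=5):
    # Rank document chunks by cosine similarity to the query embedding.
    sims = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-8
    )
    top = np.argsort(-sims)[:k]
    return [chunks[i] for i in top]

def answer_with_rag(question, chunks, embed_fn, generate_fn, k=5):
    # embed_fn and generate_fn are hypothetical placeholders, not a real
    # NVIDIA API: embed_fn maps text to a vector, generate_fn completes a prompt.
    query_vec = embed_fn(question)
    chunk_vecs = np.stack([embed_fn(c) for c in chunks])
    selected = retrieve_top_k(query_vec, chunk_vecs, chunks, k)
    prompt = "Context:\n" + "\n\n".join(selected) + "\n\nQuestion: " + question + "\nAnswer:"
    return generate_fn(prompt)

In practice the retriever and the generator are separate models, and the number of retrieved chunks k trades retrieval recall against prompt length.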
Notably, extending the context window is one of the hot topics in current large-model research and development. Models such as Google's Infini-Transformer can support effectively unbounded context windows, while Gemini 1.5 Pro offers a context window of up to 10M tokens, showcasing the rapid advancement of AI technology. However, open models trying to catch up with proprietary ones still face the problem that training data and technical details are often undisclosed, which restricts reproduction and optimization.
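For readers curious how such context extensions work mechanically, one widely used community technique is raising the base frequency of the rotary position embeddings (RoPE) before continued training on longer sequences, so that positions far beyond the original window remain distinguishable. The NumPy sketch below illustrates that general idea; the base values are illustrative examples (500,000 is the base reported for Llama 3), not a statement of ChatQA 2.0's actual configuration.

import numpy as np

def rope_frequencies(head_dim, base=10_000.0):
    # One rotation frequency per pair of embedding dimensions.
    return base ** (-np.arange(0, head_dim, 2) / head_dim)

def apply_rope(x, positions, freqs):
    # x: (seq_len, head_dim); positions: (seq_len,). GPT-NeoX-style RoPE:
    # rotate each half of the vector by a position-dependent angle.
    angles = np.outer(positions, freqs)                        # (seq, dim/2)
    cos = np.concatenate([np.cos(angles), np.cos(angles)], axis=-1)
    sin = np.concatenate([np.sin(angles), np.sin(angles)], axis=-1)
    x1, x2 = np.split(x, 2, axis=-1)
    rotated = np.concatenate([-x2, x1], axis=-1)
    return x * cos + rotated * sin

# A larger base slows the rotations, stretching the usable position range.
freqs_short = rope_frequencies(128, base=10_000.0)
freqs_long = rope_frequencies(128, base=500_000.0)

q = np.random.randn(4096, 128)
q_rotated = apply_rope(q, np.arange(4096), freqs_long)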
The release of ChatQA 2.0 not only gives AI researchers a new tool but also offers developers and practitioners more choices. By pairing the model with a state-of-the-art long-context retriever, NVIDIA's new version improves performance on long-text processing and demonstrates how open and proprietary models can converge and complement each other, paving new paths for exploration in the AI field.
Source: https://www.jiqizhixin.com/articles/2024-07-25-10