
Grok-2, the AI chatbot developed by xAI, has undergone a significant speed boost, delighting its users with faster response times. The enhanced performance is a direct result of a marathon coding session by xAI’s development team, consisting of Igor Babuschkin, Lianmin Zheng, and Saeed Maleki, who spent three consecutive days rewriting the chatbot’s reasoning stack using SGLang. The outcome? Grok-2 mini, a lightweight variant of the chatbot, now operates at twice its original speed.

xAI recently introduced Grok-2 to the market, offering the service on the X platform for a monthly fee of $8. Users have indeed noticed the improvement, and it is not imagined: both Grok-2 and its streamlined counterpart, Grok-2 mini, analyze information and generate replies markedly faster than before.

In an update to the Lmsys Chatbot Arena, an independent third-party platform that benchmarks AI model performance, Grok-2’s main model scored an impressive 1293 based on 6686 votes. This achievement has propelled Grok-2 to the second spot in the global rankings, tying with Google’s Gemini-1.5 Pro and trailing only the latest version of OpenAI’s ChatGPT-4o. Notably, Grok-2 surpassed the GPT-4o build released in May 2024.

Grok-2 mini also benefited from the optimization, climbing to the fifth position with a score of 1268 in the Arena rankings, behind GPT-4o mini and Claude 3.5 Sonnet. The team’s dedication was acknowledged by none other than the company’s boss, Elon Musk, who sent a congratulatory message.

According to Babuschkin’s response on platform X, the primary advantage of using Grok-2 mini over the full Grok-2 model lies in its enhanced speed. Babuschkin further assured users that xAI plans to increase Grok-2-mini’s processing speed, making it an even more appealing option for those seeking high performance with low computational overheads. He also hinted at upcoming API improvements.

The key to this remarkable acceleration lies in SGLang, an open-source (Apache 2.0 licensed) system designed to execute complex language model programs efficiently. Developed by researchers from the University of California, Berkeley, the University of California, San Diego, and Carnegie Mellon University, SGLang enhances interaction with large language models (LLMs) by unifying the backend runtime system and frontend language, thereby improving speed and control.

SGLang currently supports open models like Llama, Mistral, and LLaVA, and is also compatible with API-based models, including OpenAI’s GPT-4. Its ability to optimize execution through automatic caching and parallel processing makes it a powerful tool for developers working with large-scale language models. The recent release of SGLang Runtime v0.2, a universal serving engine for LLMs and VLMs, has shown superior throughput and latency compared to vLLM and TensorRT-LLM, particularly in scenarios involving Llama-series models.
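The automatic caching mentioned above rests on a simple idea: when many requests share a common prompt prefix (for example, the same system prompt), the work done for that prefix can be computed once and reused. The following is a minimal, self-contained sketch of that concept — a toy prefix cache over token sequences, intended only as an illustration; SGLang’s actual implementation (RadixAttention) caches KV tensors on the GPU in a radix tree, which this toy does not attempt to reproduce:

```python
class PrefixCache:
    """Toy prefix cache: maps token prefixes to precomputed 'state'.

    Illustrates the reuse idea behind automatic prefix caching;
    a real serving engine would cache per-token KV tensors instead.
    """

    def __init__(self):
        self._store = {}  # tuple(prefix tokens) -> cached state
        self.hits = 0

    def longest_cached_prefix(self, tokens):
        # Walk from the longest candidate prefix down to the shortest.
        for end in range(len(tokens), 0, -1):
            key = tuple(tokens[:end])
            if key in self._store:
                self.hits += 1
                return key, self._store[key]
        return (), None

    def insert(self, tokens, state):
        self._store[tuple(tokens)] = state


def process(cache, tokens):
    """Compute 'state' for tokens, reusing any cached prefix.

    Returns (state, number of tokens actually computed fresh).
    """
    prefix, state = cache.longest_cached_prefix(tokens)
    # Only the uncached suffix needs fresh computation.
    suffix = tokens[len(prefix):]
    state = (state or 0) + sum(suffix)  # stand-in for real model work
    cache.insert(tokens, state)
    return state, len(suffix)
```

In use, a second request that extends an already-seen prompt only pays for its new tokens: after `process(cache, [1, 2, 3, 4])`, a call to `process(cache, [1, 2, 3, 4, 5, 6])` finds the four-token prefix in the cache and computes fresh state for just the two new tokens. Combined with batching parallel branches of a program, this kind of reuse is where the throughput gains come from.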

In conclusion, the dedication and innovative work of xAI’s development team have led to a significant leap forward for Grok-2 and Grok-2 mini. By harnessing the power of SGLang, these chatbots are not only providing faster service but also demonstrating the potential for further advancements in AI technology, pushing the boundaries of speed and efficiency in the realm of conversational AI.

Source: https://www.jiqizhixin.com/articles/2024-08-26-7
