

**News Title:** "Groq Sets New AI Chip Speed Record: 500 Tokens Per Second at One-Tenth the Cost"

**Keywords:** Groq, inference chip, 500 tokens per second, inference acceleration, cost-effectiveness

**News Content:**

### Groq Unveils Fastest Inference Chip for Large Models: Generates 500 Tokens Per Second

AI chip startup Groq, founded by members of the original Google TPU team, has recently introduced an inference acceleration solution built on its self-developed chip. The company reports a generation speed approaching 500 tokens per second, with inference roughly 10 times faster than NVIDIA GPUs at about one-tenth the cost, making it straightforward to deploy large models.
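A quick back-of-envelope calculation shows what these figures imply for response latency. The function, the GPU baseline derived from the "10x" claim, and the response length below are illustrative assumptions for this sketch, not Groq's published benchmarks:

```python
# Rough latency comparison using the throughput figures from the article.
# All numbers here are illustrative, derived from the reported claims.

def generation_time(num_tokens: float, tokens_per_second: float) -> float:
    """Seconds needed to generate num_tokens at a given throughput."""
    return num_tokens / tokens_per_second

groq_tps = 500           # ~500 tokens/s reported for Groq's solution
gpu_tps = groq_tps / 10  # baseline implied by the "10x faster" claim

response_tokens = 1000   # an assumed long model response

t_groq = generation_time(response_tokens, groq_tps)  # 2.0 s
t_gpu = generation_time(response_tokens, gpu_tps)    # 20.0 s

print(f"Groq: {t_groq:.1f}s, GPU baseline: {t_gpu:.1f}s")
```

At these rates, a 1,000-token answer would stream out in about two seconds instead of twenty, which is why the throughput claim has attracted so much attention.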

This breakthrough by Groq opens up considerable possibilities for the AI field. The solution reportedly supports three models so far: Mixtral 8x7B SMoE and Llama 2 in both its 7B and 70B sizes, and the company also offers a demo that can be tried directly. This move will likely further advance the development and application of AI technology.

**Source:** QbitAI (量子位)

[Source] https://mp.weixin.qq.com/s/tMDJP234MksYeUu_RUPzBA
