AI chip startup Groq has recently unveiled a breakthrough AI inference chip capable of generating 500 tokens per second, a new record for large-model inference speed. Groq's core team members were formerly key engineers on Google's TPU team, and they have built a highly efficient inference-acceleration stack on top of their in-house chip. According to Groq, the chip delivers ten times the inference speed of NVIDIA's GPUs at one-tenth the cost, which the company says means any large model can be deployed efficiently on its hardware.
Groq's chip currently supports three models, Mixtral 8x7B SMoE and Llama 2 in its 7B and 70B variants, and a demo is available to try directly. This is a significant milestone for the AI industry: faster inference at lower cost should accelerate the adoption and application of AI technology.
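As a back-of-the-envelope illustration of what the quoted figures imply, the two claims compound: ten times the speed at one-tenth the cost works out to roughly 100x the tokens generated per dollar. The sketch below is hypothetical arithmetic built on the article's numbers, not Groq code, and the GPU baseline rate is a made-up placeholder (only the ratios matter):

```python
# Hypothetical throughput/cost comparison based on the figures quoted above.
# "10x faster at one-tenth the cost" compounds into ~100x tokens per dollar,
# all else being equal.

def tokens_per_dollar(tokens_per_second: float, dollars_per_second: float) -> float:
    """Throughput normalized by spend."""
    return tokens_per_second / dollars_per_second

# Illustrative, made-up GPU baseline; chosen so that 10x -> 500 tokens/s.
gpu_tps = 50.0        # tokens per second (placeholder baseline)
gpu_rate = 1.0        # dollars per second (placeholder baseline)

groq_tps = 10 * gpu_tps      # claimed 10x inference speed -> 500 tokens/s
groq_rate = gpu_rate / 10    # claimed one-tenth the cost

ratio = tokens_per_dollar(groq_tps, groq_rate) / tokens_per_dollar(gpu_tps, gpu_rate)
print(ratio)  # 100.0 -> the two claims compound to ~100x tokens per dollar
```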
Title: Groq Unveils Ultra-Fast AI Inference Chip
Keywords: Groq, AI Chip, Inference Acceleration
Source: https://mp.weixin.qq.com/s/tMDJP234MksYeUu_RUPzBA