News Title: "Groq's AI Inference Chip: 500 Tokens per Second, 10x Faster at One-Tenth the Cost"
Keywords: Groq chip, 500 tokens/second, AI acceleration
News Content:
Groq Unveils High-Speed Inference Chip for Large Model Deployment
Groq, a Silicon Valley AI chip startup founded by core members of Google's TPU team, has launched an inference chip for large models. According to the report, the chip processes up to 500 tokens per second, a substantial gain in AI inference efficiency.
Groq's self-developed chip is reported to run inference 10 times faster than mainstream NVIDIA GPUs at one-tenth of the cost, which would make even compute-intensive large models more economical to deploy. The chip currently supports three models: Mixtral 8x7B SMoE, Llama 2 7B, and Llama 2 70B.
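To put the reported figures in perspective, here is a back-of-the-envelope sketch in Python. It uses only the numbers from the article (up to 500 tokens per second, a 10x speedup) and assumes that both figures describe the same workload, which the article does not confirm. The 1,000-token response length and the function names are hypothetical, chosen purely for illustration.

```python
# Back-of-the-envelope sketch using only the figures reported in the article.
# Assumption: the "10x" speedup and "500 tokens per second" refer to the same
# workload, so the implied GPU baseline is 500 / 10 = 50 tokens per second.

REPORTED_TOKENS_PER_SECOND = 500  # peak rate reported for the Groq chip
REPORTED_SPEEDUP = 10             # reported speedup over mainstream NVIDIA GPUs


def generation_time_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to produce num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def implied_baseline_rate(rate: float, speedup: float) -> float:
    """Return the baseline rate implied by a reported rate and speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return rate / speedup


if __name__ == "__main__":
    response_length = 1000  # hypothetical response length, in tokens
    baseline = implied_baseline_rate(REPORTED_TOKENS_PER_SECOND, REPORTED_SPEEDUP)
    fast = generation_time_seconds(response_length, REPORTED_TOKENS_PER_SECOND)
    slow = generation_time_seconds(response_length, baseline)
    print(f"Implied GPU baseline: {baseline:.0f} tokens/s")
    print(f"{response_length} tokens at the reported rate: {fast:.1f} s")
    print(f"{response_length} tokens at the implied baseline: {slow:.1f} s")
```

Under these assumptions, a 1,000-token response would take about 2 seconds at the reported rate, compared with about 20 seconds at the implied baseline of 50 tokens per second.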
The launch gives AI developers a more efficient tool and opens up new possibilities for future AI applications. A demo is now available for users who want to try the chip's speed for themselves. The release is expected to accelerate the adoption of AI technologies across industries.
Source: QbitAI (量子位)
Original article: https://mp.weixin.qq.com/s/tMDJP234MksYeUu_RUPzBA