Title: Groq Sets New AI Chip Benchmark: 500 Tokens per Second Inference Speed
Keywords: AI Chip, Inference Speed, Groq

A new wave has hit the field of artificial intelligence chips: Groq, a startup founded by former members of Google's TPU team, recently released a new inference acceleration solution. The company claims the solution achieves an inference speed of 500 tokens per second, roughly 10 times faster than Nvidia GPUs at one-tenth the cost, which in theory enables rapid deployment of any large model.

Groq's solution supports three models, Mixtral 8x7B SMoE and Llama 2 in its 7B and 70B variants, and a demo is already available to try. For enterprises and research institutions pursuing high-performance computing, especially those needing large-scale parallel processing, Groq's solution offers a genuinely new option.

With artificial intelligence developing rapidly, demand for AI chips keeps growing. This breakthrough not only showcases Groq's technical strength but may also have a profound impact on the future market landscape.
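To put the claimed figures in perspective, a minimal back-of-the-envelope sketch is below. The 500 tokens/s rate is Groq's own claim from the article; the 50 tokens/s GPU baseline is an assumption inferred from the "10 times faster" figure, not a measured benchmark.

```python
# Back-of-the-envelope comparison using the article's claimed numbers.
# GROQ_RATE comes from the article; GPU_RATE is an assumed baseline
# implied by the claimed 10x speedup, not a measured result.

def generation_time(n_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to decode n_tokens at a given generation rate."""
    return n_tokens / tokens_per_second

GROQ_RATE = 500.0  # tokens/s, as claimed by Groq
GPU_RATE = 50.0    # tokens/s, assumed baseline (10x slower per the claim)

n = 1000  # a hypothetical 1000-token completion
print(f"Groq (claimed): {generation_time(n, GROQ_RATE):.1f} s")
print(f"GPU baseline (assumed): {generation_time(n, GPU_RATE):.1f} s")
```

At the claimed rate, a 1000-token completion would take 2 seconds versus 20 seconds on the assumed baseline, which is the kind of latency gap that makes interactive use of large models feel qualitatively different.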
Source: https://mp.weixin.qq.com/s/tMDJP234MksYeUu_RUPzBA