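San Jose, California – At Nvidia's closely watched GTC conference, CEO Jensen Huang once again electrified the tech world by unveiling Blackwell Ultra (GB300), the company's next-generation flagship AI chip. Beyond a large jump in performance, the chip signals a new phase of AI competition in which inference cost and efficiency become the deciding factors.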

Blackwell Ultra: A Performance Monster Arrives

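Blackwell Ultra (GB300) succeeds B200, billed last year as "the world's most powerful AI chip," and again raises the performance bar. It is quoted with 20 TB of HBM3 memory, 40 TB of fast memory, and 14.4 TB/s of CX8 bandwidth, providing compute for AI inference, robot training, autonomous driving, and similar workloads. Figures of this size are on the scale of a multi-GPU rack system rather than a single chip, and they appear to correspond to Nvidia's GB300 NVL72 configuration. The headline specifications are: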

  • FP4 inference performance: 1.1 ExaFLOPS (1.1 × 10^18 floating-point operations per second)
  • FP8 training performance: 0.36 ExaFLOPS
  • HBM3 memory: 20 TB
  • Fast memory: 40 TB
  • CX8 bandwidth: 14.4 TB/s
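
The ExaFLOPS figures in the list above are easier to compare with the petaflops figures quoted elsewhere in this article once they share a unit. The following is a minimal sketch of that conversion, using only the two throughput numbers from the list. The helper name `exa_to_peta` is illustrative and does not come from any Nvidia tool.

```python
# Decimal SI prefixes as powers of ten.
PETA = 10**15
EXA = 10**18


def exa_to_peta(value_in_exaflops: float) -> float:
    """Convert a throughput figure from ExaFLOPS to petaFLOPS (hypothetical helper)."""
    return value_in_exaflops * EXA / PETA


# Figures exactly as quoted in the list above.
fp4_inference_ef = 1.1
fp8_training_ef = 0.36

fp4_inference_pf = exa_to_peta(fp4_inference_ef)
fp8_training_pf = exa_to_peta(fp8_training_ef)

# How much higher the quoted FP4 inference figure is than the FP8 training figure.
fp4_to_fp8_ratio = fp4_inference_ef / fp8_training_ef

print(f"FP4 inference: {fp4_inference_pf:,.0f} petaFLOPS")
print(f"FP8 training:  {fp8_training_pf:,.0f} petaFLOPS")
print(f"FP4/FP8 ratio: {fp4_to_fp8_ratio:.2f}x")
```

The two figures use different numeric precisions (FP4 versus FP8), so the ratio describes the quoted marketing numbers and not a like-for-like speedup.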

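Nvidia also introduced two Blackwell-based AI PCs, DGX Station and DGX Spark. DGX Station comes with 784 GB of system memory and an 800 Gbps ConnectX-8 SuperNIC, and delivers 20 petaflops of AI performance. DGX Spark is built around the desktop-optimized GB10 Grace Blackwell superchip, which delivers up to 1,000 trillion AI operations per second (1 peta-operation per second) and is aimed at fine-tuning and running inference on the latest AI reasoning models.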

DeepSeek: The Hidden Winner?

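Beyond the powerful hardware, the more notable theme of the keynote was Nvidia's emphasis on the cost efficiency of AI inference. As agentic AI and reasoning capabilities advance, the amount of computation required is growing exponentially. Nvidia is recasting itself as an AI factory company, with the goal of systems that learn and reason faster than humans can.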

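Against this backdrop, companies such as DeepSeek that focus on optimizing model inference efficiency may turn out to be the biggest beneficiaries. Future AI competition will be less about who has the largest model and more about whose model offers the lowest inference cost and the highest inference efficiency.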
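
To make "inference cost" concrete, one common way to express it is the cost of generating one million tokens on a given machine. The sketch below shows that arithmetic. Every number in it (the hourly price and both throughput values) is a made-up placeholder chosen for illustration, not a figure from Nvidia, DeepSeek, or the article, and the function name is hypothetical.

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Cost in USD to generate one million tokens at a steady throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# ASSUMPTION: placeholder values, chosen only to show the relationship.
hourly_cost_usd = 10.0          # hypothetical price of one accelerator-hour
baseline_throughput = 1_000.0   # hypothetical tokens per second
optimized_throughput = 2_000.0  # same hardware after a hypothetical 2x efficiency gain

baseline_cost = cost_per_million_tokens(hourly_cost_usd, baseline_throughput)
optimized_cost = cost_per_million_tokens(hourly_cost_usd, optimized_throughput)

print(f"Baseline:  ${baseline_cost:.2f} per million tokens")
print(f"Optimized: ${optimized_cost:.2f} per million tokens")
```

With hardware cost held fixed, cost per token is inversely proportional to throughput, so doubling a model's inference efficiency halves the cost of serving it. That relationship is why efficiency-focused model developers stand to benefit.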

Rubin: The Next-Generation AI Chip Is Already on the Way

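Nvidia did not stop at Blackwell Ultra. It also announced its next-generation AI chip, Rubin, ahead of time, with launch expected in the second half of 2026. Rubin will use HBM4 memory and introduce Vera, the successor to the Grace CPU, which contains 88 custom Arm cores. Nvidia says the custom Vera design will be twice as fast as the CPU used in last year's Grace Blackwell chips.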

Rubin is rated at up to 50 petaflops on inference workloads, more than double Blackwell. It also supports up to 288 GB of HBM4 memory, one of the core specifications AI developers watch.
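
As a quick check on the "more than double" claim, the short sketch below computes the ratio. The article does not state the Blackwell baseline used for this comparison, so the sketch assumes the 20 petaflops figure quoted earlier for the Blackwell-based DGX Station. Treat that baseline as an assumption, not a confirmed number.

```python
# Quoted in the article for Rubin on inference workloads.
rubin_inference_pf = 50.0

# ASSUMPTION: the article gives no explicit Blackwell baseline for this comparison;
# 20 petaflops is the figure it quotes for the Blackwell-based DGX Station.
blackwell_baseline_pf = 20.0

speedup = rubin_inference_pf / blackwell_baseline_pf
print(f"Rubin vs. assumed Blackwell baseline: {speedup:.1f}x")
```

Under that assumption the ratio is consistent with the article's "more than double" wording.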

A New Era of AI Computing: Inference Cost and Efficiency Are Critical

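Jensen Huang's GTC keynote not only showcased Nvidia's strength in AI chips but also pointed to a new trend in AI computing. As AI applications become more widespread, inference cost and efficiency will be key determinants of a company's competitiveness. Through continued technical innovation, Nvidia is pushing AI computing toward a more efficient and more economical era.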

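References:

  • APPSO. (2024). 刚刚,黄仁勋甩出三代核弹AI芯片!个人超算每秒运算1000万亿次,DeepSeek成最大赢家 [Just now, Jensen Huang unveiled three generations of blockbuster AI chips! Personal supercomputers perform 1,000 trillion operations per second, and DeepSeek is the biggest winner]. Retrieved from [please add the original article link here]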

