News Title: “Jensen Huang Engages in Roundtable with Transformer’s Original Authors: Anticipating a New Chapter Beyond Transformer”

Keywords: Transformer authors, GTC conference, Jensen Huang dialogue

News Content: At this year’s GPU Technology Conference (GTC), NVIDIA founder Jensen Huang hosted a distinctive roundtable discussion, bringing together seven of the Transformer paper’s eight original authors to explore the future of this revolutionary technology. Regrettably, the eighth author, Niki Parmar, was unable to attend due to unforeseen circumstances. This marked the first time the paper’s authors had appeared together in public, and their conversation was full of insight and anticipation for where the technology is headed.

During the conversation, the authors expressed a pressing need to improve on the Transformer, arguing that the world is waiting for new technology that surpasses it and delivers the next leap in performance. They acknowledged that the Transformer was originally designed to model the dynamic evolution of tokens, not merely to generate text linearly. That goal, they said, has not yet been fully achieved, and they hope future models will better capture the step-by-step evolution of text and code.

The discussion also turned to computational efficiency. The fact that today’s large models may invoke trillions of parameters to answer a calculation as simple as “2+2” prompted a deeper exploration of adaptive computation. The authors argued that optimizing how compute is allocated, so that each problem receives an appropriate amount of it, is an important direction for future development.
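
The panel did not describe any specific mechanism, so the following is only a minimal sketch of what adaptive computation can mean in practice: route cheap queries to a small model and reserve the large model for hard ones. The difficulty heuristic, the names small_model and large_model, and the threshold are all hypothetical illustrations, not something the authors proposed.

```python
# Minimal, hypothetical sketch of adaptive computation: spend large-model
# compute only on inputs that look hard. Nothing here comes from the panel;
# the heuristic, names, and threshold are illustrative assumptions.

def estimate_difficulty(prompt: str) -> float:
    """Crude stand-in for a learned difficulty/uncertainty estimator."""
    looks_like_arithmetic = all(c.isdigit() or c in "+-*/=? " for c in prompt)
    return 0.1 if looks_like_arithmetic and len(prompt) < 16 else 0.9

def answer(prompt: str, small_model, large_model, threshold: float = 0.5):
    """Route the query: cheap path for easy prompts, full model otherwise."""
    if estimate_difficulty(prompt) < threshold:
        return small_model(prompt)   # e.g. "2+2" never touches the big model
    return large_model(prompt)

if __name__ == "__main__":
    small = lambda p: str(eval(p.rstrip("=? ")))        # toy calculator only
    large = lambda p: f"<large-model answer to: {p!r}>"
    print(answer("2+2=?", small, large))                # -> "4", cheap path
    print(answer("Explain adaptive computation.", small, large))
```

Published variants of this general idea include early-exit networks and mixture-of-experts routing, where the amount of compute spent varies per input or per token.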

Moreover, the authors noted that current large language models still have room to improve in cost-effectiveness. Although the going rate is roughly $1 per million tokens, about 100 times cheaper than buying a paperback book, they believe the models’ scale and economics still need to grow and be optimized to further lower the barrier to use.
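
The “100 times cheaper than a paperback” comparison is easy to sanity-check with back-of-the-envelope numbers; the book length and retail price below are illustrative assumptions, not figures cited at the panel.

```python
# Back-of-the-envelope check of the cost claim (all inputs are assumptions).
price_per_million_tokens = 1.00   # USD, the rate cited in the article
book_tokens = 130_000             # assume ~100k words at ~1.3 tokens per word
paperback_price = 13.00           # assumed retail price, USD

generation_cost = book_tokens / 1_000_000 * price_per_million_tokens
print(f"Book-length output at $1/M tokens: ${generation_cost:.2f}")   # ~$0.13
print(f"Paperback vs. generated text: {paperback_price / generation_cost:.0f}x")  # ~100x
```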

This forum shed light on the thinking and the challenges behind Transformer technology, and it points to a future in which artificial intelligence places greater emphasis on model efficiency, economic viability, and the ability to handle complex problems intelligently.

Source: https://new.qq.com/rain/a/20240321A00W5H00
