Title: USC Study Reveals True Parameter Count of ChatGPT
Keywords: AI, Model Size, GPT-3.5-turbo, Performance

A research team at the University of Southern California (USC) recently published a study suggesting that the actual parameter count of ChatGPT may be only about 7 billion, far below the widely assumed 175 billion. The study's three authors managed to recover the embedding vector dimension of GPT-3.5-turbo, finding it to be either 4096 or 4608. The finding has drawn broad attention in the industry, as it challenges conventional assumptions about the scale of large language models.
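
The article does not detail the recovery method, but one known way a hidden dimension can leak through an API is the low-rank structure of output logits: every logit vector is a linear projection of a hidden state of size d, so logit vectors stacked across many prompts span a subspace of rank at most d. Below is a minimal NumPy sketch of that idea, simulated locally with small illustrative sizes (assumptions, not real GPT-3.5-turbo values or API responses):

```python
import numpy as np

# Toy sizes chosen so the SVD runs in seconds; in a real model the
# hidden dimension would be far larger (e.g. 4096).
vocab_size, hidden_dim, n_queries = 8192, 512, 2048

rng = np.random.default_rng(0)

# The unembedding matrix W projects a hidden state (hidden_dim,)
# onto logits over the whole vocabulary (vocab_size,).
W = rng.standard_normal((vocab_size, hidden_dim))

# Simulate collecting one full logit vector per query/prompt.
H = rng.standard_normal((n_queries, hidden_dim))  # final hidden states
logits = H @ W.T                                  # (n_queries, vocab_size)

# Every logit vector lies in the column space of W, so the stacked
# logit matrix has rank at most hidden_dim. Its numerical rank
# (count of non-negligible singular values) leaks the model width.
s = np.linalg.svd(logits, compute_uv=False)
estimated_dim = int((s > s[0] * 1e-10).sum())
print(estimated_dim)  # prints 512, the hidden dimension
```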

The team notes that when the embedding dimension is 4096, nearly all known open-source large models, such as Llama and Mistral, have roughly 7 billion parameters. Deviating from this width-to-size ratio would make a model disproportionately wide or narrow, which tends to hurt performance. The USC researchers therefore speculate that GPT-3.5-turbo has around 7 billion parameters, unless it uses a Mixture-of-Experts (MoE) architecture; a back-of-the-envelope check of this reasoning follows below.
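
The 4096-to-7B correspondence can be sanity-checked with a rough parameter count for a Llama-style decoder, using Llama-7B's published shapes (hidden size 4096, 32 layers, SwiGLU FFN size 11008, 32000-token vocabulary). A minimal sketch:

```python
# Rough parameter count for a Llama-style decoder, using Llama-7B's
# published shapes; layer norms and biases are negligible and ignored.
d, n_layers, d_ffn, vocab = 4096, 32, 11008, 32000

embed = vocab * d                # token embedding matrix
attn_per_layer = 4 * d * d       # Q, K, V and output projections
ffn_per_layer = 3 * d * d_ffn    # SwiGLU: gate, up and down projections
unembed = vocab * d              # output (LM head) projection

total = embed + n_layers * (attn_per_layer + ffn_per_layer) + unembed
print(f"{total / 1e9:.2f}B")  # ~6.74B, i.e. the familiar "7B" class
```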

These findings could have significant implications for AI research and applications, as they may prompt a rethinking of how large language models are designed and optimized. The discovery may also have far-reaching effects on the future development and deployment of ChatGPT.

[Source] https://mp.weixin.qq.com/s/y0RQ0aOrHGLzLJKxbyGxMw
