

**News Title:** “USC Study Suggests ChatGPT’s Parameter Count May Be Just 7 Billion, Breaking the Large Model Mold?”

**Keywords:** USC study, ChatGPT parameters, model performance

**News Content:** **USC Research Reveals New Insights into ChatGPT’s Parameter Scale** According to QbitAI, a recent study from the University of Southern California (USC) indicates that the much-discussed ChatGPT might have a parameter count as low as 7 billion. This conclusion was reached by three researchers at the university who analyzed the embedding vector dimensions of the unreleased gpt-3.5-turbo model.

They posit that the embedding vector dimensions of gpt-3.5-turbo could be 4096 or 4608, a figure consistent with the typical parameter scale of around 7 billion (7B) for known open-source large models, like Llama and Mistral, operating at similar dimensions.

The research team underscores the delicate balance between a model’s parameter scale and its performance. With the embedding vector dimension fixed at 4096, they argue that an overly large or small parameter count could lead to a network that is either too wide or too narrow, negatively impacting the model’s effectiveness. As a result, they speculate that gpt-3.5-turbo’s parameter count is likely around 7B. However, they also propose that if the model employs an MoE (Mixture of Experts) architecture, the parameter scale could differ, as MoE allows for increased model complexity without escalating the parameter count of a single model.
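The inference described above — from a fixed embedding dimension to a likely parameter count — can be sketched with the common rule of thumb that a decoder-only transformer has roughly 12·L·d² weights (about 4d² for the attention projections and 8d² for the MLP). This is a minimal illustration under assumed, Llama-7B-like hyperparameters (d_model = 4096, 32 layers), not a figure taken from the USC paper:

```python
# Hedged sketch: rough parameter-count estimate for a decoder-only
# transformer from its embedding (hidden) dimension. The 12 * L * d^2
# approximation and the layer count are assumptions for illustration,
# not values reported by the study.

def estimate_params(d_model: int, n_layers: int) -> int:
    attention = 4 * d_model ** 2   # Q, K, V, and output projections
    mlp = 8 * d_model ** 2         # up/down projections (~4x hidden width)
    return n_layers * (attention + mlp)

# With d_model = 4096 and 32 layers, the estimate lands near 6.4B,
# consistent with the ~7B scale the researchers infer.
print(estimate_params(4096, 32))  # 6442450944
```

The approximation ignores embedding tables and layer norms, which is why real models at this dimension (e.g. Llama at ~6.7B) sit close to, but not exactly at, the formula's output.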

This study offers a fresh perspective on the inner workings of large language models and has sparked in-depth industry discussion of the relationship between ChatGPT's scale and efficiency. While OpenAI has not yet commented on the study's findings, they provide a valuable reference point for AI researchers.

Source: https://mp.weixin.qq.com/s/y0RQ0aOrHGLzLJKxbyGxMw
