
**USC Study Suggests ChatGPT's Parameter Count May Be Only 7 Billion**

According to QbitAI, a team of researchers at the University of Southern California (USC) recently carried out an in-depth study to infer the undisclosed parameter count of OpenAI's GPT-3.5-turbo model. The study's three authors successfully worked out the model's embedding vector dimension, determining it to be either 4096 or 4608. The finding is significant for understanding the internal structure of large language models.

Among known open-source large models such as Llama and Mistral, an embedding dimension of 4096 usually corresponds to a parameter count of roughly 7 billion. Given the relationship between model performance and parameter count, a network that is too wide or too narrow hurts both efficiency and accuracy. On that basis, the USC team infers that GPT-3.5-turbo's parameter count is probably also close to 7 billion, unless it uses a special architecture such as Mixture of Experts (MoE), which could change the usual relationship between parameter count and performance.

The study offers a new perspective on GPT-3.5-turbo's performance and provides a theoretical basis for optimizing and designing large language models. Although OpenAI has not officially released this figure, USC's findings give the industry a valuable point of reference.

The English version follows:

**News Title:** “USC Study Uncovers ChatGPT Parameter Mystery: Possibly Just 7 Billion, Challenging Large Model Norms”

**Keywords:** USC study, ChatGPT parameters, model performance

**News Content:**

_A recent study from the University of Southern California (USC) suggests ChatGPT’s parameters might amount to a mere 7 billion_

According to QbitAI, a team of researchers from USC conducted an in-depth analysis to estimate the undisclosed parameter count of OpenAI’s GPT-3.5-turbo model. The trio of authors successfully deciphered the model’s embedding vector dimension, determining it to be either 4096 or 4608. This finding holds significant implications for understanding the inner workings of large language models.
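
The article does not describe how the embedding dimension was recovered, so the following is only a hedged illustration of one known way such a dimension can leak through an API: every next-token logit vector is a hidden state multiplied by the output-projection matrix, so any collection of logit vectors lies in a subspace of dimension at most the embedding size, and the numerical rank of many stacked logit vectors estimates that size. The sketch below uses a made-up toy model; `estimate_embedding_dim`, `W`, and `H` are hypothetical names, not anything from the USC paper.

```python
import numpy as np

def estimate_embedding_dim(logit_rows: np.ndarray, tol: float = 1e-6) -> int:
    """Return the numerical rank of an (n_prompts x vocab_size) logit matrix.

    Because each row equals W @ h for the same (vocab_size x d_embed) output
    matrix W, the rank of the stacked rows is at most d_embed.
    """
    s = np.linalg.svd(np.asarray(logit_rows, dtype=np.float64), compute_uv=False)
    return int(np.sum(s > tol * s[0]))

# Toy demonstration with a small simulated model (real values would be on the
# order of d_embed = 4096 and a vocabulary of tens of thousands of tokens).
rng = np.random.default_rng(0)
d_embed, vocab, n_prompts = 64, 1000, 200
W = rng.standard_normal((vocab, d_embed))      # stand-in output projection
H = rng.standard_normal((n_prompts, d_embed))  # hidden states from many prompts
logits = H @ W.T                               # observed (n_prompts, vocab) logits
print(estimate_embedding_dim(logits))          # -> 64
```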

Judging from large open-source models such as Llama and Mistral, an embedding vector dimension of 4096 generally corresponds to a parameter count of around 7 billion. Given the relationship between model performance and parameter size, an overly wide or narrow network can negatively impact efficiency and accuracy. Consequently, the USC team hypothesizes that GPT-3.5-turbo might also have a similar parameter count, unless it employs an unconventional architecture such as Mixture of Experts (MoE), which could alter the typical ratio between parameter scale and performance.
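
To make the 4096-to-7B inference concrete, here is a rough back-of-the-envelope estimate for a Llama-style dense decoder. This is my own sketch, not from the article: the layer count, FFN width, and vocabulary below are taken from the public Llama-7B configuration, while GPT-3.5-turbo's actual architecture is unknown.

```python
def dense_transformer_params(d_model: int, n_layers: int, d_ffn: int,
                             vocab_size: int, tied_embeddings: bool = False) -> int:
    """Approximate parameter count of a Llama-style decoder (no biases)."""
    attention = 4 * d_model * d_model   # Q, K, V and output projections
    ffn = 3 * d_model * d_ffn           # SwiGLU: gate, up and down projections
    norms = 2 * d_model                 # two RMSNorm weight vectors per layer
    per_layer = attention + ffn + norms
    embeddings = vocab_size * d_model * (1 if tied_embeddings else 2)
    return n_layers * per_layer + embeddings + d_model  # + final norm

# Llama-7B settings: d_model=4096, 32 layers, FFN width 11008, 32k vocabulary.
print(dense_transformer_params(4096, 32, 11008, 32000))  # ~6.7e9, i.e. "about 7B"
```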

This study offers a fresh perspective on GPT-3.5-turbo’s performance and provides theoretical grounds for optimizing and designing large language models. While OpenAI has yet to officially disclose these figures, USC’s research undoubtedly furnishes the industry with valuable reference information.

Source: https://mp.weixin.qq.com/s/y0RQ0aOrHGLzLJKxbyGxMw
