USC Study Estimates the Parameter Count of gpt-3.5-turbo

Keywords: Research Findings, Parameter Scale, Performance Prediction

A new study from the University of Southern California suggests that gpt-3.5-turbo, the model behind ChatGPT, may have only about 7 billion parameters. The three-author USC team inferred the model's undisclosed embedding dimension and found it to be 4096 or 4608. Nearly all known open-source large models with an embedding dimension of 4096, such as Llama and Mistral, have roughly 7 billion parameters, and deviating significantly from this combination of network width and depth has been shown to hurt model performance. The team therefore speculates that gpt-3.5-turbo also has around 7 billion parameters, unless it uses a Mixture of Experts (MoE) architecture, in which case the count could differ. The finding offers a new clue for reasoning about the relationship between the scale of an AI model and its performance.
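The link between an embedding dimension of 4096 and roughly 7 billion parameters can be checked with simple arithmetic. The sketch below counts the weights of a Llama-style decoder-only transformer. The function name and the hyperparameters (32 layers, feed-forward width 11008, vocabulary 32000) are the published Llama-7B configuration, used here as an illustrative assumption; they are not figures from the USC study, and the corresponding values for gpt-3.5-turbo are not public.

```python
# Rough parameter count for a Llama-style decoder-only transformer.
# Assumption: no bias terms, a gated (three-matrix) feed-forward block,
# two RMSNorm layers per block, and untied input/output embeddings.

def llama_style_param_count(d_model, n_layers, d_ffn, vocab_size):
    attention = 4 * d_model * d_model      # Q, K, V and output projections
    feed_forward = 3 * d_model * d_ffn     # gate, up and down projections
    norms = 2 * d_model                    # two norm weight vectors per block
    per_layer = attention + feed_forward + norms
    embeddings = 2 * vocab_size * d_model  # token embedding + output head
    final_norm = d_model                   # norm applied before the output head
    return n_layers * per_layer + embeddings + final_norm

# Llama-7B hyperparameters (illustrative, not gpt-3.5-turbo values).
total = llama_style_param_count(d_model=4096, n_layers=32, d_ffn=11008, vocab_size=32000)
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Because the attention term grows with the square of the embedding dimension, the width largely fixes the size of each layer. The layer count then determines the total, which is why the estimate depends on the model keeping a typical width-to-depth combination.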

Source: https://mp.weixin.qq.com/s/y0RQ0aOrHGLzLJKxbyGxMw
