**News Title:** “USC Study Uncovers ChatGPT Parameter Mystery: Possibly Just 7 Billion, Challenging Large Model Norms”
**Keywords:** USC study, ChatGPT parameters, model performance
**News Content:**
_A recent study from the University of Southern California (USC) suggests that ChatGPT's parameter count may be as low as roughly 7 billion_
According to QbitAI, a team of researchers from USC conducted an in-depth analysis to estimate the undisclosed parameter count of OpenAI's GPT-3.5-turbo model. The three authors recovered the model's embedding vector dimension, determining it to be either 4096 or 4608. This finding carries significant implications for understanding the inner workings of large language models.
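The article does not describe how that dimension was recovered. As an illustration only, and not a claim about the USC team's exact procedure, one way such a number can in principle leak is that every full logit vector an LLM returns is a linear image of its final hidden state, so a sufficiently large stack of logit vectors has numerical rank equal to the embedding dimension. A minimal sketch on synthetic data (the vocabulary size and hidden size below are arbitrary toy values, not GPT-3.5 figures):

```python
import numpy as np

# Toy setup: logits = hidden_states @ W_unembed.T, so the logit matrix
# can have rank at most hidden_size. Estimating that rank from observed
# logit vectors would reveal the hidden (embedding) dimension.
rng = np.random.default_rng(0)
vocab_size, hidden_size, n_samples = 8_000, 512, 1_000  # illustrative sizes

W = rng.standard_normal((vocab_size, hidden_size))   # stand-in unembedding matrix
H = rng.standard_normal((n_samples, hidden_size))    # stand-in hidden states
logits = H @ W.T                                     # shape: (n_samples, vocab_size)

# Numerical rank via singular values; recovers hidden_size (512 here).
print(np.linalg.matrix_rank(logits))
```

In a real setting one would need many complete logit vectors from the API rather than synthetic data, and numerical tolerances matter, but the rank-based intuition is the same.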
Judging from open-source models such as Llama and Mistral, an embedding dimension of 4096 generally corresponds to a parameter count of around 7 billion. Because a network that is too wide or too narrow for its size tends to hurt both efficiency and accuracy, the USC team hypothesizes that GPT-3.5-turbo likely has a similar parameter count, unless it employs an unconventional architecture such as Mixture of Experts (MoE), which could alter the usual relationship between parameter scale and performance.
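To make the 4096-to-7-billion link concrete, here is a rough back-of-envelope count for a dense Llama-style decoder. The layer count, vocabulary size, and feed-forward multiplier are taken from the public Llama-7B configuration purely for illustration; nothing here is a claim about GPT-3.5-turbo's actual architecture.

```python
def approx_dense_params(d_model: int, n_layers: int, vocab_size: int,
                        ffn_mult: float = 8 / 3) -> float:
    """Rough parameter count for a dense decoder-only transformer.

    Per layer: ~4*d^2 for attention (Q, K, V, O projections) plus
    ~3*ffn_mult*d^2 for a SwiGLU feed-forward block; embeddings add
    ~2*vocab_size*d for the token embedding and the output head.
    """
    attn = 4 * d_model ** 2
    ffn = 3 * ffn_mult * d_model ** 2        # gate, up and down projections
    embeddings = 2 * vocab_size * d_model    # token embedding + LM head
    return n_layers * (attn + ffn) + embeddings

# Llama-7B-like configuration: d_model=4096, 32 layers, 32k vocabulary
print(f"{approx_dense_params(4096, 32, 32_000) / 1e9:.1f}B")  # -> 6.7B
```

With a 4096-wide hidden state, 32 layers, and a 32k vocabulary this lands near 6.7 billion, which is why a 4096-dimensional dense model is commonly read as "about 7B parameters."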
This study offers a fresh perspective on GPT-3.5-turbo’s performance and provides theoretical grounds for optimizing and designing large language models. While OpenAI has yet to officially disclose these figures, USC’s research undoubtedly furnishes the industry with valuable reference information.
【来源】https://mp.weixin.qq.com/s/y0RQ0aOrHGLzLJKxbyGxMw