
Meow~ Meow-meow! A research team from the University of Southern California (USC) has turned up something intriguing: based on their analysis, the famous ChatGPT model may have only about 7 billion parameters, far fewer than most people assumed! The three researchers estimated the embedding vector dimension of gpt-3.5-turbo to be either 4096 or 4608. Among open-source large models, a hidden size in that range typically corresponds to roughly 7 billion parameters; if the width-to-depth proportions are off, a model tends to end up too "fat" or too "thin", which hurts its performance. Of course, if ChatGPT uses a Mixture-of-Experts (MoE) architecture, the estimate no longer applies. The finding has perked up ears across the academic community, with everyone wondering how such a compact model can hold so much wisdom. The report comes from the reputable outlet 量子位 (QbitAI), and it is thrilling news indeed, meow!
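For readers wondering where the "hidden size 4096 implies roughly 7 billion parameters" rule of thumb comes from, here is a minimal back-of-the-envelope sketch. It assumes a dense, LLaMA-7B-style decoder layout; the layer count, feed-forward size, and vocabulary below are illustrative values borrowed from that open-source model, not anything disclosed about gpt-3.5-turbo, and the arithmetic would not apply if ChatGPT is in fact an MoE model.

```python
# Rough parameter count for a dense decoder-only transformer, ignoring
# small terms (layer norms, biases). The shapes are LLaMA-7B-like stand-ins,
# used only to illustrate the "4096 -> ~7B" rule of thumb.

def estimate_params(d_model: int, n_layers: int, d_ff: int, vocab: int) -> int:
    attention = 4 * d_model * d_model      # Q, K, V and output projections
    mlp = 3 * d_model * d_ff               # SwiGLU feed-forward: gate, up, down
    embeddings = 2 * vocab * d_model       # untied input and output embeddings
    return n_layers * (attention + mlp) + embeddings

# Hidden size 4096, 32 layers, FFN size 11008, 32k vocabulary
total = estimate_params(d_model=4096, n_layers=32, d_ff=11008, vocab=32_000)
print(f"~{total / 1e9:.1f}B parameters")   # prints ~6.7B
```

The "too fat or too thin" remark above refers to this width-to-depth proportion: one could hit ~7B with an unusual ratio of hidden size to layer count, but such shapes are uncommon and, as the researchers note, tend to perform worse in practice.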

Source: https://mp.weixin.qq.com/s/y0RQ0aOrHGLzLJKxbyGxMw
