Chinese Startup Makes Waves: Step-2 Achieves Top 5 Ranking on Rigorous LLM Benchmark

A Chinese AI startup, Jieyue Xingchen, has shaken up the global large language model (LLM) landscape. Its trillion-parameter model, Step-2, secured fifth place on LiveBench AI, a notoriously difficult benchmark considered the gold standard in the field. This achievement marks a significant milestone for Chinese AI development, as Step-2 is the only Chinese-developed model to crack the top ten.

The LiveBench AI benchmark, launched in June 2024, is a collaborative effort spearheaded by Turing Award winner and Meta Chief AI Scientist Yann LeCun, along with Abacus.AI and New York University. Many existing LLM benchmarks are susceptible to manipulation; LiveBench uses a methodology designed to resist gaming. The benchmark comprises six categories and eighteen tasks, each updated monthly with new problems based on recently published datasets, arXiv papers, news articles, and IMDb movie summaries. This dynamic approach limits data contamination, because the questions postdate most models’ training data, and keeps the assessment valid over time. Crucially, each problem has a verifiable, objective ground truth, eliminating the need for LLM or human judges. This rigor has earned it a reputation as the world’s first unmanipulable large language model benchmark. https://livebench.ai/
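To make the idea of objective ground-truth scoring concrete, here is a minimal Python sketch. It is an illustration only, not LiveBench's actual scoring code: the exact-match rule, the averaging scheme (tasks averaged within a category, then categories averaged overall), and the category and task names in the sample data are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def score_answers(records):
    """Score model answers against fixed ground truths.

    records: iterable of (category, task, model_answer, ground_truth).
    Exact match after trimming and lowercasing is an assumed rule,
    chosen only to show that no LLM or human judge is needed.
    """
    per_task = defaultdict(list)
    for category, task, answer, truth in records:
        correct = answer.strip().lower() == truth.strip().lower()
        per_task[(category, task)].append(1.0 if correct else 0.0)

    # Average question marks within each task, then tasks within each category.
    per_category = defaultdict(list)
    for (category, _task), marks in per_task.items():
        per_category[category].append(mean(marks))

    category_scores = {c: mean(v) for c, v in per_category.items()}
    overall = mean(list(category_scores.values()))
    return category_scores, overall


# Hypothetical sample data; category and task names are made up.
records = [
    ("math", "arithmetic", "4", "4"),
    ("math", "arithmetic", "5", "6"),
    ("math", "algebra", "x", "x"),
    ("reasoning", "logic", "Yes", "yes"),
]
category_scores, overall = score_answers(records)
print(category_scores, overall)
```

Because every mark comes from a comparison with a stored answer, rerunning the scorer on the same responses always yields the same result, which is the property the benchmark's design relies on.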

Jieyue Xingchen’s success with Step-2 is particularly noteworthy given the competition. Only OpenAI and Anthropic models rank above Step-2 on the LiveBench leaderboard. This accomplishment highlights the rapid advancements in Chinese AI technology and challenges the previously dominant position of American companies in the field.

The implications of this breakthrough are significant. It underscores the growing competitiveness of Chinese AI companies on the global stage and suggests a potential shift in the power dynamics of the LLM market. The success of Step-2 also validates Jieyue Xingchen’s approach to LLM development, emphasizing the importance of both scale (a trillion-parameter model) and robust evaluation methodologies.

The future trajectory of LLM development remains uncertain, but Jieyue Xingchen’s achievement with Step-2 signals a new era of competition and innovation. Further research and development in this area are crucial to fully understand the capabilities and limitations of LLMs, and to ensure their responsible and ethical deployment. The continued evolution of benchmarks like LiveBench will be instrumental in driving this progress.

References:

  • LiveBench AI. (n.d.). LiveBench AI. Retrieved from https://livebench.ai/
  • Machine Intelligence. (2024, November 19). Chinese Startup Makes Waves: Step-2 Achieves Top 5 Ranking on Rigorous LLM Benchmark. [Hypothetical news source – replace with actual source if available]

Note: This article is based on the provided Chinese text. Specific details regarding Jieyue Xingchen, Step-2, and the exact ranking positions might need verification from official sources. The reference to a hypothetical news source should be replaced with the actual source of the original news report if available.

