Beijing, China – A collaborative research team from Renmin University of China's STILL project, the Beijing Academy of Artificial Intelligence (BAAI), and the data science platform DataCanvas has achieved a significant breakthrough in large language model (LLM) reasoning. The team has replicated and improved upon the R1-like reasoning model, open-sourcing the complete code for training and deployment, and surpassing the performance of DeepSeek-R1 on a key mathematical reasoning benchmark.

The team’s work, detailed in the paper An Empirical Study on Eliciting and Improving R1-like Reasoning Models available on arXiv, focuses on slow-thinking reasoning techniques in LLMs. The researchers have not only replicated the R1 model, providing valuable insights into its implementation and training, but have also enhanced its performance by incorporating code tools into the reasoning process.
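"Incorporating code tools into the reasoning process" generally means the model interleaves natural-language reasoning with executable code blocks: each block is run, and its output is fed back into the context so the model can verify intermediate results. The sketch below illustrates that loop; all names (`extract_code_blocks`, `run_tool`, `reasoning_step`) are hypothetical and not the STILL-3 implementation, and real systems would sandbox the execution.

```python
import contextlib
import io
import re


def extract_code_blocks(text: str) -> list[str]:
    """Pull fenced ```python ...``` blocks out of a model response."""
    return re.findall(r"```python\n(.*?)```", text, flags=re.DOTALL)


def run_tool(code: str) -> str:
    """Execute one code block and capture what it prints (sandboxing omitted)."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue().strip()


def reasoning_step(model_output: str) -> str:
    """Append tool results to the reasoning trace for the next model call."""
    trace = model_output
    for block in extract_code_blocks(model_output):
        trace += f"\n[tool output]\n{run_tool(block)}\n"
    return trace


# Example: the "model" checks an arithmetic claim by running code.
step = "Check 17 * 23:\n```python\nprint(17 * 23)\n```"
print(reasoning_step(step))
```

The key design point is that the tool output becomes part of the context, so subsequent reasoning can condition on verified results rather than on the model's own (possibly wrong) mental arithmetic.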

This approach has yielded impressive results. The team’s model, STILL-3-Tool-32B, achieved an accuracy of 81.70% on the AIME 2024 benchmark (with sampling), exceeding the performance of the full-scale DeepSeek-R1 model.

“Our goal was not only to replicate the R1 model but also to provide the research community and industry with a readily deployable, industrial-grade large model training framework,” said a spokesperson for the DataCanvas team. “By open-sourcing the complete code, including our validated technical experience and tuning strategies, we hope to accelerate the development and application of advanced LLMs.”

The open-source solution includes the entire chain from model training to inference deployment, offering developers a practical and accessible platform for building and refining their own LLMs. This comprehensive approach is expected to significantly lower the barrier to entry for researchers and developers looking to explore and leverage the power of large language models.

The team’s achievement is particularly noteworthy for its emphasis on practical application and knowledge sharing. By openly sharing their code, techniques, and tuning strategies, they are fostering a collaborative environment that will undoubtedly accelerate progress in the field of LLM research and development.

The open-source code and further details are available on GitHub: [GitHub link not provided in the source].

This breakthrough underscores China’s growing prominence in the field of artificial intelligence and its commitment to open-source collaboration. The team’s work provides a valuable resource for the global AI community and paves the way for further advancements in LLM reasoning and application.

References:

  • An Empirical Study on Eliciting and Improving R1-like Reasoning Models. (2025). arXiv. [arXiv link not provided in the source]

