Beijing, China – A collaborative research team from Renmin University of China's STILL project, the Beijing Academy of Artificial Intelligence (BAAI), and leading data science platform DataCanvas has achieved a significant breakthrough in large language model (LLM) reasoning. The team successfully replicated and improved upon the R1-like reasoning model, open-sourced the complete code for training and deployment, and surpassed the performance of DeepSeek-R1 on a key mathematical reasoning benchmark.
The team's work, detailed in the paper "An Empirical Study on Eliciting and Improving R1-like Reasoning Models" (available on arXiv), focuses on slow-thinking reasoning techniques in LLMs. The team not only replicated the R1 model, providing valuable insights into its implementation and training, but also enhanced its performance by incorporating code tools into the reasoning process.
This approach has yielded impressive results: the team's model, STILL-3-Tool-32B, achieved 81.70% accuracy on the AIME 2024 benchmark (with sampling), exceeding the performance of the full version of DeepSeek-R1.
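The article does not describe the team's actual implementation, but tool-augmented reasoning of this general kind can be illustrated with a minimal sketch: the model interleaves free-form reasoning with fenced Python snippets, an executor runs each snippet, and the printed result is fed back into the context before the model continues. All names here (`tool_augmented_answer`, `toy_model`, the transcript format) are hypothetical, chosen only for illustration.

```python
import re
import contextlib
import io

def run_python_block(code: str) -> str:
    """Execute a Python snippet and capture its printed output."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue().strip()

def tool_augmented_answer(model_step, question: str, max_rounds: int = 4) -> str:
    """Alternate between model generation and code execution.

    model_step(transcript) returns the model's next chunk of text; any
    fenced ```python block it emits is executed, and the printed output
    is appended to the transcript before the next round.
    """
    transcript = question
    for _ in range(max_rounds):
        chunk = model_step(transcript)
        transcript += "\n" + chunk
        match = re.search(r"```python\n(.*?)```", chunk, re.DOTALL)
        if match is None:  # no tool call -> treat this chunk as the final answer
            return chunk
        result = run_python_block(match.group(1))
        transcript += f"\nExecution result: {result}"
    return transcript

# Toy stand-in for the LLM: it first emits a code tool call,
# then reads the execution result back out of the transcript.
def toy_model(transcript: str) -> str:
    if "Execution result:" not in transcript:
        return "Let me compute this.\n```python\nprint(sum(range(1, 101)))\n```"
    return "The answer is " + transcript.rsplit("Execution result: ", 1)[1].split()[0]

print(tool_augmented_answer(toy_model, "What is 1 + 2 + ... + 100?"))
# prints "The answer is 5050"
```

In a real system the toy model would be replaced by an actual LLM call and the `exec` sandboxed, but the control loop (generate, detect tool call, execute, append result, continue) is the core idea behind letting a reasoning model offload exact computation to code.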
"Our goal was not only to replicate the R1 model but also to provide the research community and industry with a readily deployable, industrial-grade large-model training framework," said a spokesperson for the DataCanvas team. "By open-sourcing the complete code, including our validated technical experience and tuning strategies, we hope to accelerate the development and application of advanced LLMs."
The open-source solution includes the entire chain from model training to inference deployment, offering developers a practical and accessible platform for building and refining their own LLMs. This comprehensive approach is expected to significantly lower the barrier to entry for researchers and developers looking to explore and leverage the power of large language models.
The team’s achievement is particularly noteworthy for its emphasis on practical application and knowledge sharing. By openly sharing their code, techniques, and tuning strategies, they are fostering a collaborative environment that will undoubtedly accelerate progress in the field of LLM research and development.
The open-source code and further details are available on GitHub: [Insert GitHub Link Here – This would be populated with the actual link from the source document if available].
This breakthrough underscores China’s growing prominence in the field of artificial intelligence and its commitment to open-source collaboration. The team’s work provides a valuable resource for the global AI community and paves the way for further advancements in LLM reasoning and application.
References:
- An Empirical Study on Eliciting and Improving R1-like Reasoning Models. (2025). arXiv. [Insert arXiv Link Here – This would be populated with the actual link from the source document]