news studio

Beijing – The ripples from DeepSeek’s impact on the global AI landscape continue to spread. Chinese large language models have made inroads on Silicon Valley’s home turf, and the Chinese AI community, long perceived as lagging behind, is now exporting techniques in the other direction: teams around the world are racing to replicate DeepSeek.

Although DeepSeek-R1 is open-source, its training data and training scripts remain largely undisclosed. The published technical report nevertheless provides a blueprint for replication, and teams working with smaller models have reported "aha moments" of their own.

At the forefront of this replication movement is Open R1, a project led by Hugging Face. Open R1 aims for a complete and open replication of DeepSeek-R1, filling in the technical details that were not disclosed. In just a few weeks, the project has reached several milestones, including:

  • GRPO implementation
  • Training and evaluation code
  • A generator for synthetic data
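To make the first item concrete: GRPO (Group Relative Policy Optimization) samples several completions per prompt and scores each one relative to the others in its group, rather than relying on a separate value model. The sketch below shows only that group-relative advantage step, in plain Python. It is an illustration and not Open R1's code; the function name, the `eps` constant, and the use of the population standard deviation are assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group's mean and spread.

    `rewards` holds the scalar rewards of all completions sampled for
    one prompt. `eps` (an assumed constant) avoids division by zero
    when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions of a single prompt:
# 1.0 for a correct final answer, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's average receive a positive advantage and are reinforced, while those below average receive a negative one. A full GRPO implementation also needs the clipped policy-ratio objective and a KL penalty against a reference model, which are omitted here.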

The project’s GitHub repository can be found at https://github.com/huggingface/open-r1.

Backed by the open-source community, Open R1 has made rapid progress. Today, the project released the OpenR1-Math-220k dataset, adding another piece to the DeepSeek-R1 puzzle: synthetic data. The dataset comprises 220,000 high-quality data points, giving researchers and developers more of what they need to replicate DeepSeek’s capabilities.

The release of this dataset marks a significant step toward democratizing access to advanced AI technology. By sharing data, code, and know-how, the open-source community is enabling a broader range of individuals and organizations to take part in developing and refining large language models. This collaborative approach promises to accelerate innovation across the field of artificial intelligence.
