ByteDance Unveils GR-2: A Robot AI Model That Learns Like a Human
Beijing, China – ByteDance, the tech giant behind popular apps like TikTok and Douyin, has announced the release of its second-generation robot AI model, GR-2 (Generative Robot 2.0). The model distinguishes itself by introducing a "robot infancy" learning phase that mimics the way humans learn complex tasks during their development. This approach gives GR-2 exceptional generalization capabilities and versatility across multiple tasks.
GR-2, like many other AI models, undergoes both pre-training and fine-tuning phases. During pre-training, GR-2 was trained on a massive dataset of 38 million internet videos, amounting to 50 billion tokens, drawn from various public sources. These videos cover diverse everyday settings, including homes, outdoor spaces, and offices, which enables GR-2 to generalize across a wide range of robotic tasks and environments in subsequent policy learning.
In the fine-tuning phase, the team fine-tuned the model on robot trajectory data for both video generation and action prediction, demonstrating GR-2’s strong multi-task learning capabilities. Across more than 100 tasks, GR-2 achieved an average success rate of 97.7%. Notably, GR-2 also generalized well to previously unseen scenarios, including new backgrounds, environments, objects, and tasks.
This innovative approach to robot learning holds significant promise for the future of robotics. By mimicking human development, GR-2 demonstrates the potential for AI models to learn and adapt to complex, real-world situations with greater flexibility and efficiency. This advancement could pave the way for robots that are more versatile, adaptable, and capable of performing a wider range of tasks in diverse environments.
References:
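- GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation
- ByteDance Launches Robot Foundation Model GR-2, Demonstrating a New Level of Intelligent Autonomous Operation
- GR-2 Arrives! ByteDance Research Proposes a Robot Foundation Model with World Modeling and Strong Generalization Capabilities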