
Peking University Unveils Lift3D: Empowering 2D Large Language Models with Robust 3D Manipulation Capabilities

A groundbreaking new model from Peking University and the Beijing Academy of Artificial Intelligence (BAAI) enhances 2D large-scale pretrained models, enabling them to perform robust 3D robotic manipulation tasks.

The ability of artificial intelligence to interact effectively with the physical world remains a significant challenge. While 2D large language models (LLMs) have achieved remarkable success in various domains, their application to complex 3D tasks, such as robotic manipulation, has been limited. This limitation stems from the mismatch between the 2D nature of the data these models are trained on and the three-dimensional reality of robotic interaction. To address this, a team from Peking University, led by Shanghang Zhang, has developed Lift3D, a novel system that systematically enhances the 3D robotic representation capabilities of 2D LLMs.

Lift3D achieves this enhancement through a two-pronged approach. First, it systematically strengthens both the implicit and explicit 3D robotic representations within the 2D pre-trained model. This involves incorporating 3D spatial understanding and reasoning capabilities directly into the model’s architecture. Second, Lift3D directly encodes point cloud data, enabling the model to learn from and interact with the rich, detailed information provided by 3D sensor inputs. This direct encoding allows for more accurate and nuanced 3D understanding, surpassing the limitations of relying solely on 2D image data. The model then employs 3D imitation learning, allowing it to learn complex manipulation skills by observing and mimicking expert demonstrations.
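To make this pipeline concrete, below is a minimal, hypothetical PyTorch sketch of the general idea: point cloud patches are embedded as tokens, passed through a 2D pretrained transformer encoder, and an action head is trained by behavior cloning on expert demonstrations. All module names, shapes, and the stand-in encoder are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: 3D point cloud tokens reuse a 2D pretrained encoder,
# and the policy is trained by behavior cloning (imitation learning).
import torch
import torch.nn as nn

class PointPatchEmbed(nn.Module):
    """Groups a point cloud into fixed-size patches and embeds each as a token."""
    def __init__(self, points_per_patch=32, embed_dim=768):
        super().__init__()
        self.points_per_patch = points_per_patch
        self.mlp = nn.Sequential(
            nn.Linear(points_per_patch * 3, embed_dim),
            nn.GELU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, pts):                          # pts: (B, N, 3)
        B, N, _ = pts.shape
        patches = pts.view(B, N // self.points_per_patch, -1)
        return self.mlp(patches)                     # (B, num_patches, embed_dim)

class Lift3DSketch(nn.Module):
    """Hypothetical policy: 3D tokens -> 2D pretrained encoder -> action head."""
    def __init__(self, encoder_2d, embed_dim=768, action_dim=7):
        super().__init__()
        self.patch_embed = PointPatchEmbed(embed_dim=embed_dim)
        self.encoder_2d = encoder_2d                 # e.g. a ViT pretrained on images
        self.action_head = nn.Linear(embed_dim, action_dim)

    def forward(self, pts):
        tokens = self.patch_embed(pts)
        feats = self.encoder_2d(tokens)              # reuse 2D pretrained weights
        return self.action_head(feats.mean(dim=1))   # pooled features -> action

# One behavior-cloning step on a batch of (point cloud, expert action) pairs.
encoder_2d = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True),
    num_layers=2)                                    # stand-in for a pretrained ViT
policy = Lift3DSketch(encoder_2d)
optimizer = torch.optim.AdamW(policy.parameters(), lr=1e-4)

point_clouds = torch.randn(4, 1024, 3)               # (batch, points, xyz)
expert_actions = torch.randn(4, 7)                   # e.g. 6-DoF pose + gripper
optimizer.zero_grad()
loss = nn.functional.mse_loss(policy(point_clouds), expert_actions)
loss.backward()
optimizer.step()
```

The design choice this sketch highlights is that the 3D input only needs to be mapped into the token space the 2D backbone already consumes, so the large-scale pretrained weights can be reused with little architectural change.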

The researchers have rigorously tested Lift3D in diverse simulated and real-world environments. Results demonstrate state-of-the-art (SOTA) manipulation performance, showcasing the model’s strong generalization and scalability. The team’s findings, published on arXiv (https://arxiv.org/pdf/2411.18623), highlight the potential of Lift3D to bridge the gap between the capabilities of 2D LLMs and the demands of real-world 3D robotic applications. The authors of the paper include Jiaming Liu, Yueru Jia, Sixiang Chen, Chenyang Gu, Zhilue Wang, and Longzan Luo, all PhD students or researchers at Peking University. The research was conducted by the HMI Lab at Peking University, a leading research group in embodied intelligence and multimodal learning.

This advancement holds significant implications for various fields, including robotics, automation, and manufacturing. The ability to seamlessly integrate the power of 2D LLMs with the dexterity of 3D robotic manipulation opens doors to more sophisticated and adaptable robotic systems capable of performing a wider range of complex tasks. Future research directions may focus on improving the model’s robustness in even more challenging and unpredictable environments, as well as exploring its applications in collaborative robotics and human-robot interaction.

Conclusion:

Lift3D represents a significant step forward in the field of embodied AI. By effectively leveraging the strengths of 2D LLMs and extending their capabilities to the 3D realm, this innovative model paves the way for more advanced and versatile robotic systems. The research highlights the potential of combining different AI paradigms to achieve breakthroughs in complex real-world applications. The robust performance and scalability of Lift3D suggest a promising future for AI-driven robotic manipulation.

References:

  • Liu, J., Jia, Y., Chen, S., Gu, C., Wang, Z., Luo, L., & Zhang, S. (2024). Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation. arXiv preprint arXiv:2411.18623. Retrieved from https://arxiv.org/pdf/2411.18623
  • Machine Intelligence. (2024, December 9). 3D Embodied Foundation Model! Peking University Proposes Lift3D to Endow 2D Large Models with Robust 3D Manipulation Capabilities [Blog post]. (Original article URL not available.)

