San Francisco, CA (March 19, 2025) – Imagine turning a simple video into a vast, virtual training ground for robots. This vision is closer to reality thanks to SpatialLM, a groundbreaking spatial understanding model open-sourced by Coohom (群核科技) at the GTC 2025 Global Conference. This innovative framework, built upon large language models (LLMs), promises to revolutionize embodied AI by enabling robots to better understand and interact with the physical world.

Traditional LLMs often struggle with the complexities of spatial relationships and geometric understanding. SpatialLM overcomes these limitations by providing machines with human-like spatial awareness and analytical capabilities. This represents a significant leap forward, offering a foundational training framework for embodied intelligence. Companies can now fine-tune the SpatialLM model for specific applications, significantly reducing the barrier to entry for training robots in diverse environments.
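The article does not describe a fine-tuning recipe, but if the released checkpoints expose a standard Hugging Face causal-LM interface, a parameter-efficient adaptation with LoRA might look roughly like the sketch below. The checkpoint path and target module names are assumptions to verify against the official repository.

```python
# Hedged sketch: parameter-efficient fine-tuning of a SpatialLM-style
# checkpoint with LoRA. Assumes a standard Hugging Face causal-LM
# interface; the checkpoint path and target_modules are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

checkpoint = "path/to/spatiallm-checkpoint"  # hypothetical local path
base = AutoModelForCausalLM.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

lora = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # adapter scaling factor
    target_modules=["q_proj", "v_proj"],   # typical attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only the small adapter is trainable
```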

According to Coohom, the SpatialLM model can generate physically accurate 3D scene layouts from a single video. By leveraging point cloud data extracted from the video, the model can accurately recognize and understand the structured information within the scene. This capability opens up a wealth of possibilities for training robots in simulated environments that closely mirror real-world conditions.
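As a rough illustration of that pipeline, the sketch below loads a point cloud reconstructed from a video and hands it to the model. The Open3D calls are real; `SpatialLMModel` and `predict_layout` are hypothetical stand-ins, since the article does not document SpatialLM's actual inference API.

```python
# Hypothetical sketch of the video -> point cloud -> structured-layout
# pipeline described above. The Open3D loading code is real; the model
# interface (SpatialLMModel, predict_layout) is an assumed placeholder.
import numpy as np
import open3d as o3d

# Step 1: load a point cloud previously reconstructed from a monocular
# video (e.g., with an off-the-shelf SLAM or dense-reconstruction tool).
pcd = o3d.io.read_point_cloud("scene_from_video.ply")
points = np.asarray(pcd.points)  # (N, 3) xyz coordinates
colors = np.asarray(pcd.colors)  # (N, 3) rgb values in [0, 1]

# Step 2 (assumed interface): feed the point cloud to the model and get
# back a structured layout -- walls, doors, windows, and oriented 3D
# bounding boxes -- that a simulator can turn into a robot training scene.
# model = SpatialLMModel.from_pretrained("path/to/spatiallm-checkpoint")
# layout = model.predict_layout(points, colors)
```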

The SpatialLM model is now available to developers worldwide on platforms such as HuggingFace, GitHub, and ModelScope (魔搭社区). This open-source approach encourages collaboration and accelerates innovation in the field of embodied AI.
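For example, fetching the weights from the Hugging Face Hub could be as simple as the snippet below; the repository id is an assumption, so check Coohom's official HuggingFace, GitHub, or ModelScope pages for the actual identifier.

```python
# Minimal sketch: download SpatialLM weights from the Hugging Face Hub.
# The repo_id below is an assumed placeholder -- confirm the real
# identifier on the official release pages before running.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="manycore-research/SpatialLM")  # hypothetical id
print(f"Model files downloaded to: {local_dir}")
```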

"We aim to create a closed-loop embodied-intelligence training platform, from spatial cognitive understanding to spatial action interaction," said a technical lead at Coohom. The open-sourced SpatialLM model is designed to help embodied robots complete basic training in spatial cognitive understanding, while SpatialVerse, the spatial intelligence solution Coohom released last year, aims to advance spatial intelligence further through industry collaboration.

The company plans to continue iterating on the SpatialLM model, adding features such as natural language interaction and scene interaction. This ongoing development will further enhance the model’s capabilities and make it an even more valuable tool for researchers and developers working on embodied AI applications.

The potential impact of SpatialLM is immense:

  • Accelerated Robot Training: By creating realistic virtual environments from video, SpatialLM significantly reduces the time and cost associated with training robots in the real world.
  • Enhanced Spatial Understanding: The model’s ability to accurately interpret spatial relationships and geometric information enables robots to navigate and interact with their environment more effectively.
  • Democratization of Embodied AI: The open-source nature of SpatialLM lowers the barrier to entry for researchers and developers, fostering innovation and collaboration in the field.

SpatialLM represents a significant step towards a future where robots can seamlessly interact with the physical world. By providing a powerful and accessible tool for spatial understanding, Coohom is empowering the next generation of embodied AI applications.

