HumanVid: A High-Quality Dataset for Human Image Animation
Hong Kong, China – A new dataset designed specifically for training human image animation models has been released by researchers at the Chinese University of Hong Kong and the Shanghai Artificial Intelligence Laboratory. Called HumanVid, this dataset aims to improve the controllability and stability of video generation by providing high-quality data and detailed annotations.
HumanVid combines real-world videos and synthetic data, carefully curated through a rule-based filtering process to ensure high quality. The dataset includes annotations for both human body and camera motion, leveraging 2D pose estimation and SLAM (Simultaneous Localization and Mapping) techniques. This meticulous annotation process provides valuable information for training models that can generate videos with precise control over character poses and camera movements.
"HumanVid is a significant step forward in the field of human image animation," says Dr. Zhenzhi Wang, lead researcher on the project. "By combining real-world and synthetic data, we’ve created a dataset that is both diverse and high quality. The detailed annotations allow for more precise control over the generated videos, leading to more realistic and expressive animations."
Key Features of HumanVid:
- High-Quality Data Integration: Combines real-world and synthetic data for a rich and diverse dataset.
- Copyright-Free: All videos and 3D avatar assets are copyright-free, facilitating research and usage.
- Rule-Based Filtering: Ensures high-quality videos through a rigorous filtering process.
- Human and Camera Motion Annotations: Provides precise annotations for both human body and camera movements using 2D pose estimation and SLAM techniques.
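To make the rule-based filtering idea concrete, here is a minimal Python sketch. The article does not list the actual rules or thresholds, so the ClipStats fields, the passes_rules and filter_clips helpers, and every threshold below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClipStats:
    # Hypothetical per-clip statistics; the article does not name the real fields.
    height: int                 # frame height in pixels
    duration_s: float           # clip length in seconds
    num_people: int             # people detected in the clip
    mean_keypoint_conf: float   # average 2D pose keypoint confidence, 0..1


def passes_rules(clip, min_height=720, min_duration_s=2.0, min_conf=0.5):
    """Return True only if the clip satisfies every illustrative rule."""
    return (
        clip.height >= min_height
        and clip.duration_s >= min_duration_s
        and clip.num_people == 1
        and clip.mean_keypoint_conf >= min_conf
    )


def filter_clips(clips):
    """Keep the clips that pass all rules, preserving their order."""
    return [clip for clip in clips if passes_rules(clip)]
```

Each rule is a plain threshold check, so a rejected clip can always be traced back to the specific condition it failed.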
Technical Principles of HumanVid:
- Dataset Construction: HumanVid builds its dataset by collecting copyright-free real-world videos from the internet and supplementing them with synthetic data. The videos undergo a rigorous rule-based filtering process to ensure high quality.
- Annotation Techniques: 2D pose estimators are used to annotate human actions in the videos, while a SLAM-based approach is employed to annotate camera movements.
- Synthetic Data Generation: To enhance dataset diversity, HumanVid collects copyright-free 3D avatar assets and introduces rule-based camera trajectory generation methods to simulate various camera movements.
- Model Training: HumanVid establishes a baseline model called CamAnimate, which considers human body and camera motion as conditions. Training on the HumanVid dataset enables CamAnimate to generate videos with controlled character poses and camera movements.
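As an illustration of what rule-based camera trajectory generation for synthetic data could look like, the Python sketch below moves a camera along a horizontal arc around a subject placed at the origin and keeps it aimed at the subject. This is an assumed example, not the HumanVid implementation: the orbit_trajectory function, its parameters, and the choice of an arc-shaped path are all hypothetical.

```python
import math


def orbit_trajectory(num_frames, radius=3.0, height=1.5, sweep_deg=30.0):
    """Generate camera poses on a horizontal arc around the origin.

    Returns a list of (position, forward) pairs, one per frame, where
    position is an (x, y, z) tuple and forward is a unit vector pointing
    from the camera to the subject at the origin. All parameters are
    illustrative assumptions.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    poses = []
    for i in range(num_frames):
        # Normalized progress along the arc, from 0.0 to 1.0.
        t = i / (num_frames - 1) if num_frames > 1 else 0.0
        angle = math.radians(-sweep_deg / 2.0 + t * sweep_deg)
        x = radius * math.sin(angle)
        z = radius * math.cos(angle)
        position = (x, height, z)
        # Aim the camera at the origin and normalize the direction.
        dist = math.sqrt(x * x + height * height + z * z)
        forward = (-x / dist, -height / dist, -z / dist)
        poses.append((position, forward))
    return poses
```

Other rules, such as a forward dolly or a vertical crane move, could be written the same way by changing how the position depends on t.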
Applications of HumanVid:
- Video Production: Provides high-quality animation generation for film, television, and other video content, empowering directors and producers to create more vivid and realistic scenes by controlling character poses and camera movements.
- Game Development: Generates realistic NPC (Non-Player Character) animations in video games, enhancing immersion and interactivity.
- VR and AR: Creates virtual characters that interact with users in VR and AR applications, offering more natural and seamless experiences.
- Education and Training: Facilitates the creation of instructional videos that simulate character actions and scenarios, helping students better understand and learn complex concepts.
The HumanVid project is expected to be publicly available in September 2024, with both the code and dataset released on GitHub. This release will provide researchers and developers with a valuable tool for advancing the field of human image animation, leading to more realistic and expressive video content across various applications.
Conclusion:
HumanVid represents a significant advancement in the field of human image animation, offering a high-quality dataset with detailed annotations that enable more precise control over video generation. This dataset has the potential to revolutionize the creation of realistic and expressive animations, with applications ranging from video production and game development to VR/AR and education. The public release of HumanVid in September 2024 is anticipated to spark significant innovation and progress in this rapidly evolving field.
【source】https://ai-bot.cn/humanvid/