
News Title: Fudan University Team Develops MouSi Model to Help Visually Impaired Individuals Independently Perceive the World

Keywords: Fudan MouSi, visually impaired, hearing the world

News Content: A Fudan University team recently developed a large model called MouSi (眸思), designed to help visually impaired people "see" the world. According to Fudan University's official WeChat account, the system is a "Hearing the World" app tailored for visually impaired users and built on the multimodal large model "Fudan MouSi". With nothing more than a camera and a pair of headphones, it converts what the camera captures into spoken language and supports functions such as scene description and risk alerts.

The MouSi model was developed through the joint efforts of faculty and students at Fudan University's Natural Language Processing Laboratory (FudanNLP). The team applied deep learning and natural language processing techniques to visual information, giving visually impaired people a new way to perceive their surroundings. The system converts images captured by the camera into speech and delivers it to the user through the headphones. This not only helps visually impaired users better understand their environment, but also provides scene descriptions and risk alerts, offering them greater safety.

It is understood that the system is very simple to use. Users attach the camera to a hat or glasses, put on the headphones, and can start right away. When the user faces an object or scene, the camera sends the image to the MouSi model for analysis; the system then delivers the resulting speech to the user through the headphones, letting them perceive their surroundings by ear. This capability is significant for improving the quality of life and independence of visually impaired people.
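The report gives no implementation details, but as a rough illustration, the following Python sketch shows the kind of capture-describe-speak loop the article describes. The model endpoint (DESCRIBE_URL), its request and response format, and the choice of OpenCV and pyttsx3 are assumptions made purely for illustration; they are not part of the actual MouSi / "Hearing the World" system.

```python
# Illustrative sketch only: a minimal "camera -> multimodal model -> speech" loop.
# The endpoint URL and response schema below are hypothetical placeholders, not
# the published MouSi / "Hearing the World" API.
import base64

import cv2        # pip install opencv-python
import pyttsx3    # pip install pyttsx3
import requests

DESCRIBE_URL = "http://localhost:8000/describe"  # hypothetical model endpoint


def capture_frame() -> bytes:
    """Grab one JPEG-encoded frame from the default camera."""
    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("camera capture failed")
    ok, jpeg = cv2.imencode(".jpg", frame)
    if not ok:
        raise RuntimeError("JPEG encoding failed")
    return jpeg.tobytes()


def describe(image_bytes: bytes) -> str:
    """Send the frame to a (hypothetical) multimodal captioning service."""
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    resp = requests.post(DESCRIBE_URL, json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()["description"]  # assumed response schema


def speak(text: str) -> None:
    """Read the description aloud through the headphones."""
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


if __name__ == "__main__":
    speak(describe(capture_frame()))
```

A real assistive device would run this loop continuously and prioritize low latency and risk-related prompts, which the sketch does not attempt to model.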

The launch of the system has attracted widespread attention and praise. Many visually impaired users say the "Hearing the World" app has brought them great convenience and change, allowing them to better integrate into social life. The system has also been recognized by professionals: some experts believe the technology is an important step in advancing artificial intelligence and assistive technology for people with disabilities, and that it offers a useful reference for related research.

Currently, the launch of the MouSi model is just the beginning. The Fudan University team stated that they will continue to work on improving the functionality and performance of the system to provide visually impaired individuals with more convenient and intelligent services. They also hope to benefit more visually impaired individuals through the promotion and application of this technology, allowing them to better enjoy the beauty of life.

In conclusion, the MouSi model developed by the Fudan University team is an innovative technological achievement that gives visually impaired people a new way to perceive the world. By converting images into speech, the system helps them better understand their surroundings and provides functions such as scene description and risk alerts, bringing significant convenience and change to their lives. The technology matters not only for individual users but also for advancing artificial intelligence and assistive technology for people with disabilities. As it continues to improve and spread, this innovation is expected to benefit many more visually impaired people and help them better enjoy life.

[Source] https://www.ithome.com/0/753/295.htm
