News Title: Fudan University Develops AI Large Model to Help the Visually Impaired See the World
Keywords: AI large model, visually impaired, Fudan University
News Content: Faculty and students at Fudan University's Natural Language Processing Laboratory (FudanNLP) have recently developed a multimodal large model named Fudan·MouSi (MouSi). Using a camera and a pair of headphones, the model converts visual scenes into spoken language, offering visually impaired people a way to "see" the world.
The model was reportedly developed to make travel safer and daily life more convenient for visually impaired people. From images captured by the camera, it generates spoken descriptions of the scene and flags potential risks, helping users understand their surroundings and move about more safely.
The Fudan·MouSi model has a broad range of applications. It can assist visually impaired people with indoor and outdoor navigation, and it can identify and describe objects, people, and environments. In daily life, it gives visually impaired users access to more information, improving their quality of life.
The FudanNLP team put substantial effort into the development. They trained the multimodal model using deep learning and natural language processing techniques, and paired its image understanding with speech synthesis so that the generated descriptions sound natural and fluent.
This achievement is a significant breakthrough for visually impaired people, who have long faced obstacles in travel and information access; the Fudan·MouSi model promises to markedly improve their daily lives.
Going forward, the FudanNLP team will continue to refine Fudan·MouSi, improving its accuracy and stability to better serve the visually impaired community. They also plan to extend the model to other domains, bringing convenience and assistance to a wider range of people.
In short, the successful development of Fudan·MouSi gives visually impaired people new opportunities, helping them integrate into society and enjoy life more fully. With continued technological progress and innovation, such work can bring benefits and hope to more groups with special needs.
Source: http://www.chinanews.com/life/2024/03-02/10173195.shtml