News Title: NASA and IBM Collaborate to Create INDUS Language Models for Specialized Fields: A Rising Star of AI in Science

Keywords: NASA and IBM collaboration, large language model INDUS, five scientific fields, corpus training

News Content: NASA and IBM Collaborate to Launch INDUS, a Large Language Model Customized for Five Scientific Fields

Recently, the National Aeronautics and Space Administration (NASA) and IBM announced that they have jointly developed INDUS, a comprehensive suite of large language models (LLMs). The models are tailored to natural language understanding and generation tasks in Earth science, biological and physical sciences, heliophysics, planetary science, and astrophysics.

LLMs trained on general-purpose corpora often underperform in specialized domains. NASA and IBM therefore developed INDUS, which is trained on curated scientific corpora drawn from a variety of data sources, to meet the needs of these professional fields.

INDUS reportedly includes two types of models: encoders and sentence Transformers. The encoders, which convert natural language text into numeric encodings that an LLM can process, were trained on a corpus of 60 billion tokens spanning multiple scientific domains.
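To make the idea of turning text into numeric encodings concrete, here is a minimal toy sketch in Python. It is not INDUS code: the whitespace tokenizer, the hash-derived token vectors, the vector width `DIM`, and the function names are all assumptions made for illustration. A trained encoder learns its vectors from a corpus, whereas the vectors here are arbitrary and carry no meaning.

```python
import hashlib
import math

DIM = 8  # hypothetical vector width, chosen only for illustration


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for a learned vocabulary)."""
    return text.lower().split()


def token_vector(token):
    """Map a token to a fixed, arbitrary vector derived from its SHA-256 hash."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    # Scale each of the first DIM bytes from [0, 255] to [-1.0, 1.0].
    return [(b - 127.5) / 127.5 for b in digest[:DIM]]


def encode(text):
    """Mean-pool the token vectors into a single sentence-level vector."""
    tokens = tokenize(text)
    if not tokens:
        return [0.0] * DIM
    vectors = [token_vector(t) for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    query = encode("solar wind measurements")
    document = encode("measurements of the solar wind")
    print(len(query))               # number of values in the encoding
    print(cosine(query, document))  # similarity score between the two texts
```

In this sketch, `encode` plays the role of an encoder by mapping text to a fixed-length list of numbers, and `cosine` shows how sentence-level vectors might be compared, for example when ranking documents against a query. Because the toy vectors are hash-derived, the score only reflects shared tokens; a trained model is meant to place related scientific texts close together even when they use different words.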

The research has been published on the arXiv preprint platform: https://arxiv.org/abs/2405.10725. The launch of INDUS is expected to help scientists obtain and understand scientific information more efficiently, promoting the progress of scientific research.

This collaboration is a typical example of NASA's partnerships with private technology companies and shows the potential of public-private collaboration in technological innovation. As the technology advances, INDUS is expected to play an increasingly important role in scientific research.

Source: https://www.jiqizhixin.com/articles/2024-06-27-3
