China Telecom Open-Sources Its Xingchen Semantic Large Model TeleChat-7B

By 智能小编 (AI Editor)

February 19, 2024 #China Telecom, #Open Source, #Xingchen Semantic Large Model, #Daily AI Briefing

Latest News

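China Telecom announced on January 10 that it is open-sourcing the TeleChat-7B version of its Xingchen semantic large model and releasing a 1 TB cleaned dataset for researchers and developers. The move is intended to advance research and applications in artificial intelligence and to further the development of the open-source large-model ecosystem.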

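The Xingchen semantic large model is a large language model developed and trained by China Telecom Artificial Intelligence Technology Co., Ltd., a China Telecom subsidiary. It was trained on 1.5 trillion tokens of Chinese and English text and, as a result, performs well in natural language understanding and generation. The open-source TeleChat-7B release gives researchers and developers the chance to study and apply the model in depth, and it also provides a rich data resource for subsequent research and development.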

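China Telecom reportedly also plans to open-source a 12B version of the model on January 20, bringing more developers and researchers into the effort to build an open-source large-model ecosystem together. This series of open-source releases reflects China Telecom's openness and innovation in artificial intelligence and contributes to the development of AI in China.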

Title: China Telecom Open-Sources Its Xingchen Semantic Large Model TeleChat-7B
Keywords: China Telecom, open source, Xingchen semantic large model

News content:
China Telecom announced on January 10 that it is open-sourcing the TeleChat-7B version of its Xingchen semantic large model, along with a 1 TB cleaned dataset for researchers and developers. The move aims to promote research and applications in artificial intelligence and to foster the development of the open-source large-model ecosystem.

The Xingchen semantic large model is a large language model developed and trained by China Telecom Artificial Intelligence Technology Co., Ltd., a subsidiary of China Telecom. It was trained on 1.5 trillion tokens of Chinese and English text and therefore performs well in natural language understanding and generation. Open-sourcing the TeleChat-7B version lets researchers and developers study and apply the model, and it also provides a rich data resource for subsequent research and development.

China Telecom reportedly also plans to open-source a 12B version of the model on January 20, widening participation by developers and researchers in jointly building an open-source large-model ecosystem. This series of open-source initiatives demonstrates China Telecom's open attitude and innovative spirit in artificial intelligence and contributes to the development of AI in China.

Source: https://www.ithome.com/0/744/969.htm