News Title: China Telecom open-sources TeleChat-7B, a version of its Xingchen semantic large model
Keywords: Open source, TeleChat-7B, Large language model, Dataset

China Telecom recently announced that it has open-sourced the TeleChat-7B version of its Xingchen semantic large model and released a 1T cleaned dataset alongside it. The large language model was developed and trained by China Telecom Artificial Intelligence Technology Co., Ltd. on 1.5 trillion tokens of Chinese and English text. China Telecom also plans to open-source a 12B version of the model on January 20, inviting more developers to help build an open-source large model ecosystem.

Open-sourcing TeleChat-7B is expected to further advance China's artificial intelligence field and let more people take part in large model research and development. The model has broad application prospects in areas such as semantic understanding and natural language processing. The 1T cleaned dataset gives developers a rich data resource to support their own innovation in artificial intelligence.

With the 12B version due to be open-sourced soon, China Telecom will continue developing the Xingchen semantic large model and extending it to more fields. The open-source releases are expected to bring new opportunities to China's artificial intelligence industry and make the benefits of the technology available to more people.

Source: https://www.ithome.com/0/744/969.htm
