China Telecom Open-Sources Its Xingchen Semantic Large Model TeleChat-7B
Keywords: China Telecom open source, Xingchen semantic model, large model
On January 10, China Telecom announced that it had open-sourced the 7B version of its Xingchen (Starry) semantic large model, TeleChat-7B, and released a 1TB cleaned dataset for researchers and developers. The move marks a major step forward for China Telecom in artificial intelligence and shows its commitment to building an open-source large model ecosystem.
The Xingchen semantic large model is a language model developed and trained by China Telecom Artificial Intelligence Technology Co., Ltd. It was trained on 1.5 trillion tokens of Chinese and English text, which is said to give the model high accuracy in understanding and generating natural language.
China Telecom plans to open-source a 12B version of the model on January 20 to broaden developer participation and jointly build the open-source large model ecosystem. This series of open-source releases is expected to bring new opportunities for AI research and applications, and to promote innovation and development in China's AI sector.
Source: https://www.ithome.com/0/744/969.htm