news studio

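Mobvoi (出门问问) recently announced that it will open part of the training data for its large-scale language model Sequence Monkey (序列猴子) to the public under the name "Sequence Monkey Open Source Dataset 1.0" (序列猴子开源数据集1.0). This release contains a Chinese general-purpose text corpus, a corpus of classical poetry with modern Chinese translations, and a text generation corpus. The release is intended to advance the field of artificial intelligence, Chinese natural language processing in particular. Mobvoi hopes it will encourage more researchers and developers to use these data resources to improve and extend language models.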

Title: Mobvoi Opens Part of the Sequence Monkey Training Dataset
Keywords: Mobvoi (出门问问), Sequence Monkey, Open Source Dataset
News content:
Mobvoi (出门问问), an AI technology company, has announced that it will open part of the training data for Sequence Monkey, its large-scale language model, to the public. The open-source release, named "Sequence Monkey Open Source Dataset 1.0," includes a Chinese general-purpose text corpus, a corpus of classical poetry with modern Chinese translations, and a text generation corpus. The release aims to support progress in AI, particularly in Chinese natural language processing. By making these data resources available, Mobvoi hopes to encourage researchers and developers to use them to improve and extend language models.

[Source] https://mp.weixin.qq.com/s/oSQR3gCCDpJ3Wdu-9iTcbA
