
Tsinghua and Zhipu AI Launch LongWriter, a Long-Text Generation Model That Breaks AI Writing Length Limits

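Beijing, August 28, 2024 - LongWriter, a long-text generation model jointly developed by Tsinghua University and Zhipu AI, has been officially released. The model can generate coherent text of more than 10,000 words, exceeding the output-length limits of earlier AI models and opening new possibilities for long-form writing.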

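LongWriter grew out of an in-depth analysis of the output limits of existing large language models (LLMs). The research team found that a model's output-length limit stems mainly from the characteristics of its supervised fine-tuning (SFT) dataset. To address this, the LongWriter team built a dataset called "LongWriter-6k", which contains writing samples ranging from 2,000 to 32,000 words and supplies the long-output data that model training needs.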
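To make the point about SFT data concrete, here is a minimal sketch of how one might inspect the output lengths in an SFT dataset. The field names `instruction` and `output`, the toy samples, and the whitespace-based word count are assumptions for illustration only; they are not the actual LongWriter-6k schema or the team's tooling.

```python
from typing import Dict, List


def output_word_counts(samples: List[Dict[str, str]]) -> List[int]:
    """Count whitespace-separated words in each sample's output field."""
    return [len(sample["output"].split()) for sample in samples]


def length_summary(samples: List[Dict[str, str]],
                   lo: int = 2000, hi: int = 32000) -> Dict[str, int]:
    """Report the longest output and how many outputs fall within [lo, hi] words."""
    counts = output_word_counts(samples)
    in_range = sum(lo <= count <= hi for count in counts)
    return {"max": max(counts), "in_range": in_range, "total": len(counts)}


# Hypothetical toy data: three samples with 500, 2,500 and 12,000 words of output.
toy_samples = [
    {"instruction": "short note", "output": "word " * 500},
    {"instruction": "long essay", "output": "word " * 2500},
    {"instruction": "very long report", "output": "word " * 12000},
]

summary = length_summary(toy_samples)
print(summary)
```

Whitespace splitting only approximates word counts for English text; a dataset in Chinese would need character or tokenizer-based counts instead.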

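Beyond the dataset, LongWriter uses the "AgentWrite" method and direct preference optimization (DPO) to further improve long-text generation. AgentWrite uses existing LLMs to automatically construct SFT data with ultra-long outputs, following a divide-and-conquer strategy. DPO further tunes the model's parameters to improve output quality and the model's ability to follow length constraints.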
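The divide-and-conquer idea can be sketched as a plan-then-write loop: first ask a model for an outline, then write each section in turn with the text so far as context. This is only an illustration under stated assumptions, not the AgentWrite implementation; the prompts, the `agent_write` function, and the offline `toy_llm` stand-in are all hypothetical.

```python
from typing import Callable, List


def agent_write(instruction: str, llm: Callable[[str], str]) -> str:
    """Plan-then-write: request an outline, then write each section in order."""
    # Step 1 (plan): ask the model for one outline item per line.
    plan_prompt = f"Break this writing task into sections, one per line:\n{instruction}"
    outline: List[str] = [
        line.strip() for line in llm(plan_prompt).splitlines() if line.strip()
    ]

    # Step 2 (write): generate each section, passing the text written so far.
    sections: List[str] = []
    for item in outline:
        written = "\n\n".join(sections)
        write_prompt = (
            f"Task: {instruction}\n"
            f"Outline item: {item}\n"
            f"Text so far:\n{written}\n"
            "Write only this section."
        )
        sections.append(llm(write_prompt))
    return "\n\n".join(sections)


def toy_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call so the sketch runs offline."""
    if prompt.startswith("Break this writing task"):
        return "Introduction\nHistory\nConclusion"
    # Echo the outline item so the assembled draft is easy to inspect.
    item = prompt.split("Outline item: ")[1].splitlines()[0]
    return f"[{item}] " + "lorem " * 5


if __name__ == "__main__":
    print(agent_write("Write a long travel guide", toy_llm))
```

Swapping `toy_llm` for a function that calls a real model leaves the loop unchanged. Because each call only has to produce one section, the combined draft can be much longer than any single response.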

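LongWriter builds on long-context LLMs with substantially increased memory capacity, which can handle histories of more than 100,000 tokens. The model can therefore understand and process longer context, which helps it generate more coherent and logically clear long texts.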

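LongWriter has a wide range of potential applications, including:

  • Academic research: helping scholars and researchers draft long papers, research reports, or literature reviews.
  • Content creation: helping writers and content creators produce first drafts of novels, scripts, or other creative writing.
  • Publishing: assisting with editing and proofreading, or generating book content automatically.
  • Education: generating teaching materials, course content, or study guides.
  • News media: quickly producing news reports, in-depth analysis, or feature stories.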

LongWriter open-source project links:

  • GitHub repository: https://github.com/THUDM/LongWriter
  • HuggingFace model hub: https://huggingface.co/THUDM/LongWriter-glm4-9b
  • arXiv technical paper: https://arxiv.org/pdf/2408.07055

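The release of LongWriter marks an important step for AI writing and offers a new approach to long-form text generation. As the technology continues to develop, LongWriter is expected to find use in more fields and create greater value.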

Source: https://ai-bot.cn/longwriter/
