Meta承认使用“盗版”书籍训练AI，拒绝支付版权费：合理使用争

**Meta承认使用未授权书籍训练人工智能，引发版权争议**

近日，科技巨头Meta在一场诉讼中公开承认，其在开发人工智能模型Llama 1和Llama 2的过程中，使用了包含近20万本图书的开源数据集Books3。这一数据集的总容量高达37GB，其中包括了大量受版权保护的作品。然而，Meta表示，它们无需为这一行为获得“同意、许可或付费”，并主张其对受版权保护作品的复制应被视为“合理使用”。

Meta的这一立场引发了版权界的广泛关注和讨论。公司认为，使用这些数据训练人工智能模型是为了推动科技进步，符合公众利益，因此符合“合理使用”的原则。然而，这一论点可能在法律上面临挑战，因为“合理使用”在不同的司法管辖区有着不同的解释和标准，通常需要权衡四个主要因素：使用的目的、作品的性质、使用的数量以及对潜在市场的影响。

随着人工智能技术的快速发展，如何在尊重知识产权和推动技术创新之间找到平衡，已经成为业界和法律界亟待解决的问题。此次Meta的案例无疑将为未来的版权法实践和讨论提供一个鲜活的案例。

英语如下：

**News Title:** “Meta Acknowledges Using ‘Pirated’ Books to Train AI, Sparks Copyright Dispute Over Fair Use”

**Keywords:** Meta, Artificial Intelligence, Copyright Dispute

**News Content:**

Tech giant Meta has recently admitted in a legal suit that it utilized an open-source dataset, Books3, comprising nearly 200,000 books, to train its AI models Llama 1 and Llama 2. The 37GB dataset includes a substantial number of copyrighted works. However, Meta claims it had no obligation to seek “consent, permission, or payment” for this action and argues that its reproduction of copyrighted material should be considered “fair use.”

Meta’s stance has sparked widespread attention and debate within the copyright community. The company asserts that using such data to train AI models advances scientific progress and serves the public interest, thereby aligning with the principles of “fair use.” Nonetheless, this argument may face legal challenges, as “fair use” is interpreted and applied differently across jurisdictions. It typically involves weighing four main factors: the purpose of the use, the nature of the work, the quantity used, and the impact on the potential market.

While the Books3 dataset itself does not violate copyright law, the inclusion of copyrighted works for commercial purposes without the consent of the original authors or copyright holders could potentially infringe on copyright fundamentals. This incident not only exposes Meta to legal risks but could also have far-reaching implications for the tech industry, prompting a reevaluation of copyright issues and the boundaries of “fair use” in AI development.

As AI technology rapidly evolves, striking a balance between respecting intellectual property and fostering technological innovation has emerged as a pressing issue for both the industry and legal sectors. Meta’s case is likely to provide a vivid case study for future copyright law practice and discussions.

【来源】https://www.techspot.com/news/101507-meta-admits-using-pirated-books-train-ai-but.html