Title: Meta’s AI Training Under Copyright Scrutiny
Keywords: Copyright Dispute, Artificial Intelligence, Fair Use
News content:
Recently, social media giant Meta has come under scrutiny for using “pirated” books to train artificial intelligence models. According to reports, the company trained its large language models Llama 1 and Llama 2 on open-source book datasets that include Books3. Books3 is a plain-text collection of nearly 200,000 books totalling nearly 37GB, and the books in it are all protected by copyright.

In the lawsuit, Meta acknowledged using these copyrighted materials but argued that using them to train artificial intelligence falls under the “fair use” doctrine, so it does not need the copyright holders’ consent or a license and owes them no payment. This stance has sparked widespread controversy among copyright holders and the public.

Copyright holders argue that Meta’s unauthorized use of copyrighted works infringes their legitimate rights. They contend that copyright law must be respected even when the purpose is training artificial intelligence, and that such materials should be used only in a lawful and compliant manner.

Meta’s approach has also ignited a debate over the copyright status of training materials for artificial intelligence. On the one hand, developing artificial intelligence requires vast amounts of data, and open-source datasets offer a convenient way to obtain it. On the other hand, copyright law is designed to protect the fruits of creators’ labor and to prevent unauthorized copying and distribution.

The case is still pending, and its outcome is expected to have a profound impact on how copyright is handled in the field of artificial intelligence. If Meta’s position is upheld, it could give technology companies new legal grounds for using copyrighted materials. However, such a ruling could also prompt further legal action by copyright holders, which in turn could affect the development of artificial intelligence technology.

Source: https://www.techspot.com/news/101507-meta-admits-using-pirated-books-train-ai-but.html