Occurred: March 2024
GPU chip provider NVIDIA has been sued by three authors accusing it of training its NeMo AI models on copyrighted books.
Authors Brian Keene, Abdi Nazemian and Stewart O'Nan filed a class action lawsuit against NVIDIA for copyright infringement, saying their works were part of the Books3 dataset and were used to train its NeMo generative AI platform without their permission.
The Books3 dataset, the lawsuit argued, copied "all of Bibliotik", a so-called shadow library of approximately 196,640 pirated books, and had earlier been available through AI community Hugging Face as part of The Pile, a larger dataset.
The Pile was later removed from Hugging Face in the wake of a copyright complaint.
The authors want compensation for their creative labour and the destruction of all copies of the Books3 dataset, and argue that NVIDIA's October 2023 takedown of the NeMo AI platform was an implicit admission of guilt.
The case highlighted ongoing copyright clashes between the AI industry and creative communities, with transparency and infringement claims at the forefront.
October 2023. Nvidia withdrew the NeMo platform and acknowledged the model had been trained on a dataset containing "approximately" 196,640 books. The Books3 dataset contains the same number of books.
Fair use
Fair use is a doctrine in United States law that permits limited use of copyrighted material without having to first acquire permission from the copyright holder.
Source: Wikipedia
Nvidia NeMo
Operator: Nvidia
Developer: Nvidia
Country: USA
Sector: Media/entertainment/sports/arts
Purpose: Train and deploy custom LLMs
Technology: Generative AI; Machine learning; Neural network; Deep learning; NLP/text analysis
Issue: Accountability; Copyright; Ethics/values; Transparency
https://dig.watch/updates/authors-take-legal-action-against-nvidia-for-copyright-infringement-in-ai-training
Page info
Type: Incident
Published: April 2024
Last updated: June 2024