Occurred: October 2023
Former Arkansas Governor Mike Huckabee and a group of religious authors are suing Meta, Microsoft, Bloomberg, and EleutherAI for using their books to train their large language models without their knowledge or consent.
The lawsuit centres on the Books3 AI training dataset of 180,000 works, which as part of EleutherAI's larger dataset The Pile, has been used to train multiple large language models. Books3 was taken offline in August 2023 following a complaint about copyright abuse by Danish anti-piracy group Rights Alliance.
Huckabee's suit argues that Meta, Microsoft and Bloomberg 'were able to incorporate sophisticated datasets, which included the pirated copyright-protected materials in Books3, as part of the LLM’s training process, without having to compensate the authors.'
Operator: Bloomberg; EleutherAI; Meta; Microsoft
Developer: Bloomberg; EleutherAI; Meta; Microsoft
Purpose: Train language models
Transparency: Governance; Marketing
Published: October 2023