AI companies appropriate 139,000 TV, film scripts to train AI systems
Occurred: November 2024
Over 139,000 television and film scripts were used to train artificial intelligence systems without the consent of their creators, a media investigation revealed, sparking outrage among writers and creatives.
What happened
A dataset compiled from subtitles sourced from OpenSubtitles.org has been used by high-profile companies, including Anthropic, Apple, Bloomberg, Meta, Nvidia and Salesforce, to train their AI models, according to The Atlantic writer and programmer Alex Reisner.
The dataset includes scripts from popular shows and films such as The Sopranos, Breaking Bad and The Simpsons, as well as works by writers such as Shonda Rhimes and Ryan Murphy.
Writers expressed shock that their work had been appropriated without acknowledgement or permission and voiced concern about the implications for their intellectual property rights.
Why it happened
The unauthorised use of these scripts to help develop AI systems raises significant legal and ethical questions regarding copyright and fair use.
The legal landscape around the use of copyrighted material to train AI systems remains murky, with many developers claiming that their data is sourced from "open" materials.
However, this justification is increasingly challenged by content creators and owners, who argue that their works are being exploited without compensation or acknowledgement.
What it means
The discovery underscores ongoing tensions between technology companies and the rights of content creators.
Writers fear that such practices may undermine their livelihoods: as AI systems become capable of generating content similar to human-created works, they could compete directly with writers and put them out of business.
The ongoing backlash may lead to stronger copyright protections and clearer regulations governing the use of creative works in AI training.
Fair use
Fair use is a doctrine in United States law that permits limited use of copyrighted material without having to first acquire permission from the copyright holder.
Source: Wikipedia 🔗
System 🤖
Operator:
Developer: Anthropic, Apple, Bloomberg, Meta, Nvidia, Salesforce
Country: USA
Sector: Media/entertainment/sports/arts
Purpose: Generate text
Technology: Chatbot; Generative AI; Machine learning
Issue: Cheating/plagiarism; Copyright; Ethics/values; Transparency
Investigations, assessments, audits 👁️
Page info
Type: Incident
Published: November 2024