DeepSeek accused of using OpenAI models to train AI system
Occurred: January 2025
Chinese AI startup DeepSeek has been accused by OpenAI of improperly using its proprietary models to train a competing AI system, potentially violating OpenAI's terms of service.
OpenAI alleges that DeepSeek used a technique called "distillation", in which the outputs of a larger AI model are used to train a smaller one, to extract knowledge from OpenAI's models through its API.
OpenAI says it has evidence to support its case. In August 2024, OpenAI and Microsoft investigated and blocked accounts for suspected terms of service violations, which they now believe were associated with DeepSeek.
Distillation is a common practice in the AI industry.
The controversy appears to have stemmed from DeepSeek's desire to move quickly and at relatively low cost to develop and release its language models.
The Chinese company is also believed to have used creative methods to circumvent US chip restrictions.
It is unclear whether DeepSeek will be held accountable for its alleged theft of OpenAI data.
More broadly, the controversy highlights the challenges of protecting proprietary AI models and the data used to train them, and raises questions about the sustainability of high-cost, closed, general purpose AI models.
Critics pointed out that OpenAI has also benefited from using others' data, and accused the US company of hypocrisy.
Knowledge distillation
In machine learning, knowledge distillation or model distillation is the process of transferring knowledge from a large model to a smaller one.
Source: Wikipedia
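As an illustration of the definition above, the following is a minimal sketch of the classic "soft label" form of distillation, in which a student model is trained to match a teacher model's temperature-softened output distribution. It is not a description of what DeepSeek is alleged to have done: distillation through an API would typically rely on generated text rather than raw model scores. The function names, the logit values, and the temperature of 2.0 are all hypothetical choices made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's soft distribution to the student's.

    A lower value means the student's outputs are closer to the teacher's.
    """
    p = softmax(teacher_logits, temperature)  # teacher (target) distribution
    q = softmax(student_logits, temperature)  # student (predicted) distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores over three candidate outputs.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different output

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

In a real training loop this loss would be minimized by adjusting the student's parameters, so that the smaller model gradually reproduces the larger model's behavior.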
Operator:
Developer: DeepSeek Artificial Intelligence Co
Country: USA
Sector: Technology
Purpose: Train model
Technology: Generative AI; Large language model
Issue: Accountability; Cheating/plagiarism; Copyright; Transparency
https://www.ft.com/content/a0dfedd1-5255-4fa9-8ccc-1fe01de87ea6
https://www.newsweek.com/openai-warns-deepseek-distilled-ai-models-reports-2022802
https://www.bloomberg.com/news/articles/2025-01-29/microsoft-probing-if-deepseek-linked-group-improperly-obtained-openai-data
https://www.nbcnews.com/tech/tech-news/openai-says-deepseek-may-inapproriately-used-data-rcna189872
https://dig.watch/updates/white-house-expresses-alarm-over-deepseeks-ai-techniques
https://www.computing.co.uk/news/2025/ai/openai-deepseek-stealing-ip
Page info
Type: Issue
Published: February 2025