DeepSeek AI's Low-Cost Models Suspected to Use OpenAI Data, Sparks Online Irony

by Bella, Apr 20, 2025

The emergence of DeepSeek AI, a cost-effective AI model from China, has sparked significant controversy and financial turbulence in the U.S. tech industry. Suspicions have arisen that DeepSeek trained its models on output from OpenAI's models, a practice known as distillation that violates OpenAI's terms of service. This week, President Donald Trump called DeepSeek a "wake-up call" for the U.S. tech sector after Nvidia's stock price plunged 16.86%, wiping roughly $600 billion off its market value, the largest single-day loss for any company in Wall Street history. Other tech giants, including Microsoft, Meta Platforms, and Google's parent company Alphabet, fell between 2.1% and 4.2%, while AI server maker Dell Technologies dropped 8.7%.

DeepSeek's R1 model, built on the open-source DeepSeek-V3, boasts significantly lower computing requirements and a reported training cost of just $6 million, positioning it as a formidable competitor to Western AI offerings like ChatGPT. Although these claims are disputed, DeepSeek's arrival has prompted questions about the massive AI investments U.S. tech companies are making and has unsettled investors. The model's popularity also surged, taking it to the top of the U.S. free app download charts amid debate over its effectiveness.

In response to these developments, OpenAI and Microsoft are investigating whether DeepSeek used OpenAI's API to integrate OpenAI's models into its own, a move that OpenAI considers a violation of its intellectual property. OpenAI has emphasized its commitment to protecting its IP and says it is working with the U.S. government to safeguard its technology from adversarial efforts.

David Sacks, Trump's AI czar, pointed to evidence suggesting that DeepSeek distilled knowledge from OpenAI's models and predicted that leading U.S. AI companies would soon take steps to prevent such practices. The situation is not without irony, however: OpenAI itself has been accused of using copyrighted internet content to train ChatGPT. In January 2024, OpenAI admitted that training AI models like ChatGPT without copyrighted material was "impossible," a stance that has fueled debates over the ethics and legality of AI training data.

The controversy over AI training data has intensified. The New York Times sued OpenAI and Microsoft in December 2023 for the "unlawful use" of its work, a claim OpenAI dismissed as "without merit," asserting that such training falls under "fair use." In addition, a group of 17 authors, including George R. R. Martin, filed a lawsuit in September 2023 alleging "systematic theft on a mass scale." These legal battles underscore the contentious nature of AI development and the ongoing struggle to balance innovation with intellectual property rights.