DeepSeek’s open source, efficient AI model puts the billions spent to build similar technology in question.
Two headlines this week almost perfectly captured the weird moment the AI field is barreling toward. First, news broke that OpenAI, Oracle, SoftBank, and others plan to invest $500 billion in AI infrastructure via a new, Trump-blessed initiative called Stargate. And then, Chinese startup DeepSeek released r1, an inexpensive, open source, and capable reasoning model.
DeepSeek r1 currently sits in a tie for third place on ChatBot Arena, a ranking of the best large language models, ahead of OpenAI’s o1-preview, xAI’s Grok, and anything Anthropic’s ever built. Its ascent defines a moment where efficient, cheaper models are starting to rival those trained with billions of dollars. And now, it’s becoming impossible not to question whether all that spending is worthwhile if a few million dollars can do a good enough job.
“If the training costs for the new DeepSeek models are even close to correct, it feels like Stargate might be getting ready to fight the last war. Like bringing an M1 Abrams MBT to a drone fight,” venture capitalist Jeremy Liew said on X early on Friday.
The DeepSeek story, covered by the New York Times and others this week, is one of the more fascinating AI developments in recent memory. A Chinese quantitative stock trading company built the AI startup after using its profits to purchase thousands of NVIDIA chips in the early 2020s. Now, its technology is rivaling top western models, despite trade restrictions that limit which chips it can use. The Times story also mentioned another new model, Sky-T1, which a Berkeley professor and his students built on top of Alibaba open source technology and which reportedly rivals OpenAI’s o1 model on some benchmarks.
To be sure, there’s some rumbling that DeepSeek’s results may not be all they seem to be. Google DeepMind CEO Demis Hassabis told me in an interview conducted last week that the company may have relied on western systems, or used other open source models as a starting point. Others have suggested the company has more NVIDIA chips than it lets on. So the story will continue to evolve. Still, Hassabis acknowledged DeepSeek’s work is impressive and “China is very, very capable at engineering and scaling.”
Judging by Silicon Valley’s early reaction to DeepSeek r1, the technology seems real enough. Venture capitalist Marc Andreessen called it “one of the most amazing and impressive breakthroughs I’ve ever seen — and as open source, a profound gift to the world.” And his Andreessen Horowitz colleague Anjney Midha said “from Stanford to MIT, Deepseek r1 has become the model of choice for America’s top university researchers basically overnight.”
In a since-deleted tweet, Perplexity CEO Aravind Srinivas strongly praised the DeepSeek model. “More the narrative that China are copycats, more we shoot ourselves in the foot,” he said. “Deepseek is two orders of magnitude more efficient with capital allocation than OpenAI.” Srinivas did not reply to a request for comment.
If DeepSeek is indeed as capable as many are claiming, there will be some serious questions about whether the billions of dollars companies like OpenAI, Anthropic, and xAI spent to get to this point were worthwhile. Jim Fan, a senior research manager at NVIDIA, said on X that DeepSeek’s emergence is “a humbling wake-up call to us all that open science has no boundary.”
Addressing the financial investment, Fan said the existing compute infrastructure should produce even more powerful results with DeepSeek’s development. “We shall get 10x more powerful AI with the compute we have today and are building tomorrow. Simple math! The AI timeline just got compressed.”
Fan’s view is the most optimistic scenario, though, especially if limitations like a potential data wall hinder generative AI’s progress. Ultimately, if companies can replicate for a few million dollars what you spent billions to build, that suggests the money was spent inefficiently, as Srinivas points out.
Still, in pursuit of AGI, today’s leading research houses will push forward with even bigger and more expensive data centers, projects that were already drama-laden without the DeepSeek variable. To get Stargate off the ground, OpenAI had to adjust its deal with Microsoft and then spent the week fighting with Elon Musk. Soon, we’ll see if it’s worth it, or if the DeepSeeks of the world take over.
As VC Sam Lessin put it earlier this week, most investors believe OpenAI is either going to zero or infinity. That’s perhaps never felt more true than this week.
This article was originally published in Big Technology and is republished here with permission.