The affordability of DeepSeek is a myth: The revolutionary AI actually cost $1.6 billion to develop

Author: Caleb, Mar 16, 2025

DeepSeek's new chatbot boasts a surprisingly capable AI, introducing itself with the simple yet intriguing statement: "Hi, I was created so you can ask anything and get an answer that might even surprise you."

This AI has quickly become a major player, and its release even contributed to a sharp drop in Nvidia's stock price. Its success stems from a combination of architecture and training methods that incorporates several notable techniques:

  • Multi-token Prediction (MTP): Rather than predicting only the next token, the model is trained to predict several future tokens at once, which gives a denser training signal and can speed up generation.
  • Mixture of Experts (MoE): This architecture splits parts of the network into many specialized sub-networks ("experts") and routes each token to only a few of them, so only a small share of the parameters does work for any given token. DeepSeek V3 uses 256 routed experts per MoE layer and activates eight of them for each token.
  • Multi-head Latent Attention (MLA): MLA compresses the attention keys and values into a compact latent representation, which shrinks the memory needed to cache earlier context during generation while keeping quality close to that of standard multi-head attention.
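
To make the "eight of 256" routing idea concrete, here is a minimal sketch of generic top-k expert gating in plain Python. It is not DeepSeek's actual router: the function names, the toy scalar "experts", and the softmax over the selected scores are assumptions chosen for illustration. Only the expert count and the number of active experts come from the text above.

```python
import math
import random

NUM_EXPERTS = 256  # routed experts per MoE layer, as stated above
TOP_K = 8          # experts activated per token, as stated above


def route_token(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts and softmax-normalise their weights.

    Generic top-k gating (an assumption), not DeepSeek's exact scoring rule.
    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract max for stability
    exp_scores = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exp_scores)
    return [(i, s / total) for i, s in zip(top, exp_scores)]


def moe_forward(x, experts, router_logits, k=TOP_K):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in route_token(router_logits, k))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy experts: each one just scales a scalar input.
    experts = [lambda x, s=rng.uniform(0.5, 1.5): s * x
               for _ in range(NUM_EXPERTS)]
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    print(len(route_token(logits)), "of", NUM_EXPERTS, "experts active")
    print(moe_forward(1.0, experts, logits))
```

The other 248 experts are never called for this token, which is where the compute saving comes from.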

[Image: DeepSeek Test. Credit: ensigame.com]

DeepSeek, a subsidiary of the Chinese hedge fund High-Flyer, initially claimed to have trained its powerful DeepSeek V3 neural network for about $6 million using 2,048 GPUs. SemiAnalysis, however, describes a far larger infrastructure: approximately 50,000 Nvidia Hopper GPUs, including 10,000 H800s, 10,000 H100s, and additional H20s, distributed across multiple data centers. By that estimate, server capital expenditure comes to roughly $1.6 billion, with operating expenses of about $944 million.
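
A headline training cost of this kind is typically GPU-hours multiplied by an hourly rental price. The sketch below shows that arithmetic and sets it against the infrastructure estimate. The GPU-hour total and the $2 hourly rate are assumptions taken from DeepSeek's V3 technical report rather than from this article, and the function names are made up for illustration.

```python
# Assumptions (from DeepSeek's V3 technical report, not from this article):
GPU_HOURS = 2_788_000           # H800 GPU-hours reported for the V3 training run
RENTAL_USD_PER_GPU_HOUR = 2.0   # assumed rental price per H800 GPU-hour
NUM_GPUS = 2048                 # cluster size quoted in the paragraph above

# SemiAnalysis estimates quoted in the paragraph above:
SERVER_CAPEX_USD = 1.6e9
OPERATING_COST_USD = 944e6


def training_run_cost(gpu_hours, usd_per_gpu_hour):
    """Rental-equivalent cost of a single training run."""
    return gpu_hours * usd_per_gpu_hour


def wall_clock_days(gpu_hours, num_gpus):
    """Days the run would take if all GPUs were busy the whole time."""
    return gpu_hours / num_gpus / 24


if __name__ == "__main__":
    run_cost = training_run_cost(GPU_HOURS, RENTAL_USD_PER_GPU_HOUR)
    print(f"Training run: ${run_cost:,.0f}")
    print(f"Wall-clock:   {wall_clock_days(GPU_HOURS, NUM_GPUS):.1f} days")
    print(f"Capex / run:  {SERVER_CAPEX_USD / run_cost:.0f}x")
```

The two numbers measure different things: one prices a single run on rented hardware, the other prices owning and operating the whole fleet.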

[Image: DeepSeek V3. Credit: ensigame.com]

Unlike many startups that rely on cloud computing, DeepSeek owns its data centers, which gives it greater control and lets it put new ideas into practice faster. Being self-funded also makes it more agile and speeds up decision-making. The company's commitment to talent is notable as well: some researchers earn over $1.3 million annually, and it attracts top graduates from leading Chinese universities. The initial $6 million figure covers only the GPU cost of pre-training and greatly understates the total investment, which exceeds $500 million.

[Image: DeepSeek. Credit: ensigame.com]

While DeepSeek's lean structure allows it to innovate more efficiently than larger, more bureaucratic companies, its success is clearly tied to substantial investment, technological breakthroughs, and a highly skilled team. The "revolutionary budget" claim therefore needs significant qualification. Nevertheless, DeepSeek's costs remain well below those of its competitors. For example, DeepSeek reportedly spent $5 million on R1, compared with the $100 million OpenAI reportedly spent on GPT-4o.

DeepSeek's story highlights the potential of a well-funded, independent AI company to compete effectively with established giants. However, it also underscores the reality that substantial investment, cutting-edge technology, and exceptional talent are key ingredients for success in this rapidly evolving field.