DeepSeek R2 Rumored With 97% Cost Cut vs GPT-4 Using Huawei Chips
The AI world is abuzz with fresh rumors about a potentially game-changing new model from Chinese AI startup DeepSeek.
Industry insiders and social media posts suggest the company's upcoming R2 model could dramatically reshape the economics of enterprise AI deployment while showcasing China's growing technological self-sufficiency.
Let's dive into what we know so far about this potentially disruptive development.
The DeepSeek R2 Rumors Emerge
Over the weekend, speculation intensified about DeepSeek's follow-up to its January 2025 release of the R1 reasoning model.
According to multiple tech news outlets, the upcoming R2 model reportedly features a hybrid Mixture of Experts (MoE) architecture and will be significantly more cost-effective than leading competitors like OpenAI's GPT-4^1,^2.
What's particularly notable is that this new model has allegedly been trained almost entirely on Huawei's Ascend 910B chips rather than the Nvidia GPUs that dominate most AI training setups^1.
This would represent a significant milestone in China's push for technological self-sufficiency amid ongoing US export restrictions.
Key Rumored Specifications
- 1.2 trillion parameters (78B active) using hybrid MoE architecture
- 5.2 petabytes of training data
- 89.7% score on C-Eval 2.0 benchmark
- 92.4% on COCO vision benchmark
- 82% utilization of Huawei Ascend 910B chip cluster
- 512 PetaFLOPS of FP16 precision computing power
The most striking claim?
DeepSeek R2 will reportedly cost a mere 2.7% of what GPT-4 charges for enterprise use, with prices of $0.07 per million input tokens and $0.27 per million output tokens^1,^2,^3.
Cost Revolution in AI
If the rumors prove accurate, the economic implications could be substantial. The current enterprise AI landscape is dominated by models with high usage costs, limiting adoption across industries.
| Model | Cost per 1M Input Tokens | Cost per 1M Output Tokens |
| --- | --- | --- |
| DeepSeek R2 (rumored) | $0.07 | $0.27 |
| GPT-4 | ~$2.58 | ~$10.00 |
| Cost Reduction | 97.3% | 97.3% |
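To put the rumored pricing in concrete terms, here is a minimal sketch comparing what an illustrative monthly workload would cost at the rumored R2 rates versus the GPT-4 rates cited above. The workload size is an assumption chosen purely for illustration, not real usage data.

```python
# Rough cost comparison for an illustrative workload. Prices are the rumored R2
# figures and the GPT-4 rates cited above; the token counts are made-up assumptions.
PRICING = {
    "DeepSeek R2 (rumored)": {"input": 0.07, "output": 0.27},   # $ per 1M tokens
    "GPT-4":                 {"input": 2.58, "output": 10.00},  # $ per 1M tokens
}

def monthly_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Dollar cost for a given number of input and output tokens."""
    return (input_tokens / 1e6) * rates["input"] + (output_tokens / 1e6) * rates["output"]

# Hypothetical enterprise workload: 500M input tokens, 100M output tokens per month.
workload = {"input_tokens": 500_000_000, "output_tokens": 100_000_000}

costs = {name: monthly_cost(**workload, rates=rates) for name, rates in PRICING.items()}
for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f}/month")

reduction = 1 - costs["DeepSeek R2 (rumored)"] / costs["GPT-4"]
print(f"Rumored saving: {reduction:.1%}")  # ~97.3% at these rates
```

On this assumed workload the rumored R2 pricing works out to about $62 per month versus roughly $2,290 for GPT-4, which is where the widely quoted 97.3% figure comes from.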
For perspective, DeepSeek's previous R1 model was already significantly more cost-efficient than competitors, costing approximately $0.14 per million tokens compared to ChatGPT o1's $7.50, a roughly 50x difference^5.
R2 appears poised to push this cost advantage even further.
The Huawei Connection
One of the most intriguing aspects of these rumors is the claim that DeepSeek R2 has been trained predominantly on Huawei's Ascend 910B chips. This represents a major shift from the industry's heavy reliance on Nvidia's GPUs for AI model training^1,^2.
Huawei has been aggressively developing its AI chip capabilities, with plans to produce around 300,000 Ascend 910B processors this year^8.
The company is also reportedly preparing to test its newer Ascend 910D chip, which it hopes will rival Nvidia's H100 series^6,^7.
Huawei's 2025 AI Chip Production Goals:
- 100,000 Ascend 910C chips
- 300,000 Ascend 910B chips
- Testing of new Ascend 910D chips starting in late May
- Recently unveiled Ascend 920 (900+ teraflops BF16 performance)
The Ascend 910B chip, while not matching Nvidia's leading offerings in raw performance, has shown impressive progress.
Analysis suggests that the performance increase from Huawei's first-generation Ascend 910 to the second-generation 910B series was approximately 75% of what was officially claimed, still representing a substantial improvement despite export controls^9.
Technical Innovation: Beyond Parameter Scale
What makes DeepSeek's approach particularly interesting is its apparent focus on architectural efficiency rather than simply scaling up parameters. While many AI labs have pursued larger and larger models, the rumored R2 architecture suggests a different approach.
The hybrid MoE (Mixture of Experts) architecture reportedly used in R2 allows the model to activate only a small portion of its parameters during operation: approximately 6.5% (78 billion out of 1.2 trillion)^1,^12.
This selective activation significantly improves computational efficiency and reduces operating costs.
How MoE Architecture Works
- Traditional transformer models: Activate all parameters for every input
- MoE architecture: Routes inputs to specialized "expert" sub-networks
- Hybrid MoE (rumored in R2): Combines MoE with dense layers and advanced gating mechanisms for optimal resource allocation
This approach is particularly advantageous for processing lengthy documents in sectors like finance, legal, and healthcare, where cost-efficiency in handling large amounts of text is crucial^12.
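To make the routing idea concrete, here is a minimal top-k gating sketch in Python (NumPy). It illustrates the general MoE technique only; DeepSeek has not published R2's architecture, and all dimensions, expert counts, and weights below are arbitrary toy values.

```python
import numpy as np

# Minimal sketch of top-k expert routing, the core idea behind MoE layers.
# Illustrative only: not DeepSeek's (unpublished) implementation.
rng = np.random.default_rng(0)

d_model, n_experts, top_k = 64, 8, 2            # toy sizes
W_gate = rng.standard_normal((d_model, n_experts)) * 0.02
experts = [
    rng.standard_normal((d_model, d_model)) * 0.02  # each "expert" is just a tiny linear map here
    for _ in range(n_experts)
]

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts and mix their outputs by gate weight."""
    probs = softmax(x @ W_gate)                      # (tokens, n_experts)
    top_idx = np.argsort(-probs, axis=-1)[:, :top_k] # chosen experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        gate = probs[t, top_idx[t]]
        gate = gate / gate.sum()                     # renormalize over the selected experts
        for w, e in zip(gate, top_idx[t]):
            out[t] += w * (x[t] @ experts[e])        # only the top-k experts do any work
    return out

tokens = rng.standard_normal((4, d_model))           # a batch of 4 token vectors
print(moe_layer(tokens).shape)                       # (4, 64)
print(f"Active experts per token: {top_k}/{n_experts} "
      f"(~{top_k / n_experts:.0%} of expert parameters; cf. the rumored 78B of 1.2T, about 6.5%)")
```

The key design point is that the gating network decides which few experts see each token, so compute scales with the active parameters rather than the total parameter count, which is how a 1.2-trillion-parameter model could be served far more cheaply than a dense model of the same size.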
Market Implications
The potential impact of DeepSeek R2 extends beyond just technical specifications. If the cost claims prove accurate, this could significantly disrupt the AI service market by:
- Lowering adoption barriers for small and medium-sized businesses
- Expanding the developer ecosystem around affordable AI APIs
- Potentially forcing established providers like OpenAI and Google to reconsider their pricing strategies
For investors, this development suggests a possible shift in the competitive landscape of AI.
The massive cost differential could accelerate AI adoption in markets and use cases previously considered unprofitable, potentially creating new investment opportunities in AI implementation rather than just model development^16.
Reasons for Skepticism
Despite the exciting potential, there are several reasons to approach these rumors with caution.
Some industry observers have noted that the rumors originated primarily from Chinese finance and stock trading forums rather than from technical insiders with verified credentials^4.
There are also questions about whether the claimed timeline is realistic.
Training a 1.2-trillion-parameter model in the time since DeepSeek's last release would be extremely challenging even for well-resourced labs, and the claimed 5.2 petabytes of training data is unusually large compared with what most leading AI labs report using^4.
Red Flags About the DeepSeek R2 Rumors:
- Originated from stock trading forums rather than technical sources
- Potentially unrealistic timeline for training such a large model
- Unusually large claimed training dataset (5.2 petabytes)
- Some reports of DeepSeek officially denying these rumors as fake news
The Bigger Picture: US-China Tech Competition
The DeepSeek R2 rumors emerge against the backdrop of intensifying technological competition between the United States and China, particularly in the semiconductor and AI sectors.
Earlier this month, the US government further tightened restrictions on the export of specific AI chips to China, including Nvidia's H20, a model specifically designed for the Chinese market in compliance with previous export controls^15.
This has accelerated China's push for self-sufficiency in advanced semiconductors.
The rumors around DeepSeek R2 suggest that Chinese companies may be making significant progress in creating domestic alternatives to restricted US technology.
If verified, this would represent an important step in China's pursuit of AI sovereignty^4.
Looking Forward
Whether or not the specific claims about DeepSeek R2 prove accurate, the broader trends they highlight are worth watching.
The intersection of more efficient AI architectures, domestic Chinese chip development, and dramatically lower operating costs could significantly reshape the AI landscape.
For enterprise customers, these developments suggest keeping a close eye on emerging providers and models that may offer comparable performance to established options at a fraction of the cost.
For investors, the potential disruption to current pricing models may create both risks for incumbents and opportunities in new market segments.
As with any unconfirmed rumors, it's important to wait for official announcements and independent verification of performance claims.
Various sources indicate that DeepSeek R2 might be officially released in early May or in the weeks thereafter^3, at which point we'll be able to assess how much of the current speculation proves accurate.
Final Thoughts
DeepSeek R2 Key Points (Rumored):
- Hybrid MoE architecture with 1.2 trillion parameters (78B active)
- 97.3% cost reduction compared to GPT-4
- Trained primarily on Huawei Ascend 910B chips
- Part of China's growing AI self-sufficiency trend
- Potential release in May 2025
- Some skepticism around claims' authenticity
The AI industry continues to evolve at breakneck speed, with potential disruptions coming from unexpected directions.
If the DeepSeek R2 rumors prove even partially accurate, they may signal an important shift toward more cost-efficient AI that could accelerate adoption across industries and geographies.