China Claims Huawei Trained 1.6T AI Model on Ascend


Imagine a server room dense with silicon, each chip chipping away at a mountain of text. That’s the image Huawei’s research group is selling after announcing that it trained DeepSeek V4-Pro, a 1.6 trillion-parameter model, on a cluster built around at least 1,000 Ascend 910C chips.

The story sounds straightforward: domestically produced AI silicon finally handling large-scale model workloads. But the details matter. Huawei says the team performed full-parameter updates, meaning every weight in the model was trained rather than only a thin adapter layer added on top of frozen weights, and that pretraining for V4-Pro processed a corpus reportedly exceeding 32 trillion tokens. Pretraining builds the model’s core capabilities; the later fine-tuning stage shapes behavior through instruction tuning and safety alignment.

Why does that matter? Because full-parameter training is far more demanding than light-touch techniques that tweak only a small portion of a network. It requires sustained throughput, stable interconnects, and tight orchestration across chips. Historically, Chinese groups struggled to migrate heavy training workloads off Nvidia hardware without hitting bottlenecks in performance and connection stability.
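To put rough numbers on that gap, here is a back-of-envelope sketch in Python. It is not drawn from Huawei's announcement: the 4096-wide layer, the rank-8 adapter, and the 16-bytes-per-parameter figure (a common rule of thumb for mixed-precision Adam training) are all assumptions chosen for illustration, and the function names are hypothetical.

```python
# Back-of-envelope sketch. Every constant below is an illustrative assumption,
# not a figure reported by Huawei or DeepSeek.

def full_trainable(d_in: int, d_out: int) -> int:
    """Trainable weights when a whole d_in x d_out matrix is updated."""
    return d_in * d_out


def adapter_trainable(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for a low-rank adapter (two thin matrices of the given rank)."""
    return rank * (d_in + d_out)


# Assumed rule of thumb for mixed-precision Adam: 2 bytes for the half-precision
# weight, 2 for its gradient, and 12 for the fp32 master copy plus two moments.
BYTES_PER_PARAM = 16


def training_state_bytes(n_params: int, bytes_per_param: int = BYTES_PER_PARAM) -> int:
    """Memory for weights, gradients and optimizer state, excluding activations."""
    return n_params * bytes_per_param


if __name__ == "__main__":
    full = full_trainable(4096, 4096)
    adapter = adapter_trainable(4096, 4096, rank=8)
    print(f"one layer, full update: {full:,} trainable weights")
    print(f"one layer, rank-8 adapter: {adapter:,} trainable weights")

    total = training_state_bytes(1_600_000_000_000)  # 1.6T parameters, from the article
    print(f"training state: {total / 1e12:.1f} TB total, "
          f"{total / 1000 / 1e9:.1f} GB per chip across 1,000 chips")
```

Under these assumptions, the state for a 1.6 trillion-parameter model comes to about 25.6 TB, or roughly 25.6 GB per chip if spread evenly over 1,000 chips, before any activations are stored. The real figure depends on the precision, optimizer, and sharding scheme, none of which the announcement specifies.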

Huawei points to the Ascend 910C’s dual-design architecture as a turning point. Independent tests from earlier DeepSeek experiments suggested an Ascend part could deliver roughly 60% of the inference performance of Nvidia’s H100, but that was inference, not large-scale, synchronized training. Training workloads expose different weaknesses: collective communication, memory management, and software maturity all become decisive.

Still, the claim has caveats. The researchers reported completion of full-parameter training, but provided no rigorous benchmarks: no wall-clock time, no throughput metrics, no head-to-head comparison with H100 clusters, and no detailed breakdown of power or efficiency. Without those numbers, the announcement reads precisely like what it is—an encouraging technical milestone but not yet independent proof that Ascend clusters match or surpass established alternatives for leading-edge pretraining.
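For a sense of what the missing numbers would make possible, the sketch below shows how per-chip throughput and hardware utilization are typically derived once wall-clock time and peak chip performance are known. The function names are hypothetical, and the 90-day run length in the usage example is a made-up placeholder, not a reported figure.

```python
# Sketch of the metrics an independent benchmark would let readers compute.
# Inputs marked "placeholder" are invented for illustration only.

def tokens_per_chip_second(total_tokens: float,
                           wall_clock_seconds: float,
                           n_chips: int) -> float:
    """Average training throughput per chip, in tokens per second."""
    if wall_clock_seconds <= 0 or n_chips <= 0:
        raise ValueError("wall_clock_seconds and n_chips must be positive")
    return total_tokens / (wall_clock_seconds * n_chips)


def model_flops_utilization(tokens_per_chip_s: float,
                            flops_per_token: float,
                            peak_flops_per_chip: float) -> float:
    """Fraction of a chip's peak FLOP/s that the training run actually used."""
    if peak_flops_per_chip <= 0:
        raise ValueError("peak_flops_per_chip must be positive")
    return tokens_per_chip_s * flops_per_token / peak_flops_per_chip


if __name__ == "__main__":
    total_tokens = 32e12          # corpus size cited in the article
    n_chips = 1000                # "at least a thousand" chips, per the article
    wall_clock = 90 * 86400       # placeholder: 90 days, NOT a reported figure

    rate = tokens_per_chip_second(total_tokens, wall_clock, n_chips)
    print(f"{rate:,.0f} tokens per chip per second (under the placeholder run length)")
```

Utilization additionally needs the training FLOPs per token and the chip's peak FLOP/s. Neither is given in the announcement, which is why the second function takes them as caller-supplied inputs rather than guessing.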

There’s precedent for caution. Earlier reports said attempts to train a different model, R2, on Huawei silicon ran into instability and slow chip interconnects. Moving from successful demonstrations in inference to reliable, large-scale pretraining is a big leap. Companies can sometimes stitch together enough engineering to complete a single run while still lacking the robustness required for routine model development at scale.

So what’s the takeaway for the wider AI ecosystem? If Huawei’s account holds up under scrutiny, it signals growing competitiveness of Chinese AI hardware and a maturing software stack capable of orchestrating thousand-chip training jobs. If it doesn’t, it underscores that hype still outpaces verifiable progress. Either way, the next step is clear: independent benchmarks and transparent runtime data.

We’ll be watching for those numbers. Independent verification will tell us whether this is a true pivot in global AI infrastructure or simply an ambitious proof-of-concept.
