China’s regulators have quietly delivered the most decisive blow yet in the AI chip war: new data-center procurement rules effectively ban U.S.-made GPUs for inference workloads across government agencies, state-owned enterprises, and any company receiving state cloud contracts.
Starting immediately, new clusters must run on domestic silicon, primarily Huawei’s Ascend 910B/910C series and Cambricon’s MLU series. The directive is not public, but the effect is absolute: any bid containing Nvidia, AMD, or Intel inference hardware is now automatically disqualified.
The pain is real and immediate. ByteDance, which had stockpiled tens of thousands of Nvidia H20 cards (the only inference-grade chip Nvidia can still legally sell into China under current U.S. export controls) in anticipation of a total Trump-era embargo, is now sitting on inventory it cannot legally deploy in new builds.
Alibaba and Tencent have been forced to rip and replace entire inference layers that were already designed around Nvidia’s TensorRT-LLM stack. Engineers privately describe Huawei’s software ecosystem as “two to three years behind” and Cambricon’s drivers as “barely production-ready,” yet public earnings calls are filled with glowing praise for “sovereign computing achievements.”
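To give a sense of why that gap matters, here is a minimal sketch of the surface-level swap, assuming Huawei's torch_npu adapter for PyTorch is installed and using a placeholder Hugging Face model; the eager-mode path moves over almost mechanically, but TensorRT-LLM's compiled engines, paged KV cache, and in-flight batching have no drop-in equivalent on the Ascend side, which is where the "two to three years behind" complaint comes from.

```python
# Sketch only: moving the same PyTorch model from an Nvidia GPU to a Huawei
# Ascend NPU. Model name and prompt are placeholders, not from the article.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

USE_ASCEND = True  # flip to False on an Nvidia/CUDA host

if USE_ASCEND:
    import torch_npu  # Huawei's Ascend extension; registers the "npu" device with PyTorch
    device = torch.device("npu:0")
else:
    device = torch.device("cuda:0")

# Placeholder model; any causal LM from the Hugging Face hub works the same way.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B-Instruct", torch_dtype=torch.float16
).to(device).eval()

inputs = tok("Summarize today's cluster utilization report.", return_tensors="pt").to(device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```

The device swap is the easy part; the production inference layers the article describes were built around engine compilation, quantization, and batching logic specific to Nvidia's stack, and that is what actually has to be rewritten.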
The numbers tell the story. Nvidia’s China revenue (once 20–25% of its total) collapsed 63% year-over-year in its most recent quarter. Sales of the H20, the deliberately crippled “China-special” card that launched at $12,000–$15,000 per unit, have plummeted to near zero as customers cancel orders rather than build on hardware that will soon be illegal to use. AMD’s and Intel’s inference-focused Instinct and Gaudi lines have fared even worse.
Beijing’s calculus is brutally pragmatic. Training frontier models is a prestige race China is unlikely to win in the next 2–3 years; the raw H100/H200/Blackwell density gap is simply too large. But inference is different. Inference accounts for 70–80% of real-world AI compute spend, runs on far less cutting-edge silicon, and is where domestic vendors are already within striking distance.
By ring-fencing the world’s largest inference market (China consumes roughly 40% of global AI inference cycles), regulators lock in hundreds of billions of dollars in demand for Huawei and Cambricon, enough to fund multiple generations of catch-up.
The results are already visible:
- Huawei’s Ascend division reportedly shipped over 1.2 million 910B-equivalent chips in the first three quarters of this year alone.
- Cambricon’s MLU370 and upcoming MLU390 series have secured “preferred vendor” status at every major Chinese cloud provider.
- Domestic cloud prices for inference have paradoxically dropped 15–25% despite lower raw performance, because local vendors are subsidized and desperate to gain share.
For American policymakers, the irony is bitter. The original export controls were designed to starve China of training compute; instead, they have handed Beijing a protected inference monopoly worth far more in economic terms. There is now open discussion in Washington about loosening restrictions on older-generation inference chips (H200, MI300X) to keep at least some market access, an outcome that would have been unthinkable twelve months ago.
Meanwhile, Chinese giants are accelerating the transition whether they like it or not. Alibaba Cloud has already migrated 60% of its Model-as-a-Service inference to Ascend clusters. Tencent claims its mixed Huawei–Cambricon fleet now serves 90% of Hunyuan traffic. Baidu’s Ernie Bot inference layer is reportedly 100% domestic silicon.
The great decoupling of AI infrastructure is no longer theoretical. China has accepted a short-term performance hit in exchange for long-term independence, and the inference market, once Nvidia’s most profitable segment in China, has effectively ceased to exist for American vendors.
The chip war just entered its endgame, and the first major territory has already changed hands.
Author: Slava Vasipenok
Founder and CEO of QUASA (quasa.io) — the world's first remote work platform with payments in cryptocurrency.
Innovative entrepreneur with over 20 years of experience in IT, fintech, and blockchain. Specializes in decentralized solutions for freelancing, helping to overcome the barriers of traditional finance, especially in developing regions.

