Treatise · The History of AI in China
The story of Chinese AI is a journey from catching up, to running alongside, to leading in places. It began in the planned-economy mood of the 1980s, grew up inside the search and advertising businesses of BAT (Baidu, Alibaba, and Tencent), and took off twice: after 2014 with the Four Little Dragons of Computer Vision, and after 2023 with the Six AI Tigers of large models. In January 2025, at the DeepSeek R1 moment, the world for the first time looked seriously at what "Chinese-style AI" could be. Hemmed in on one side by compute restrictions and on the other by institutional differences, it has charted a path that runs parallel to America's but in a distinctly different style.
I. Origins: The 863 Program and the "Senior School" (1986–2000)
In March 1986, four senior scientists (Wang Daheng, Wang Ganchang, Chen Fangyun, and Yang Jiachi) jointly petitioned Beijing to accelerate high-technology development. That same month, Deng Xiaoping approved the petition, and the "863 Program," China's national plan for high-technology research and development, was born. Intelligent computing (Theme 306) was one of its pillars; from then on, AI was part of the national agenda.
That generation, later known as the "Senior School," planted the first seeds. Zhang Bo (张钹) of Tsinghua University led intelligent-robotics research from the early 1980s, advancing his "quotient space" theory; Dai Ruwei and Lu Ruqian of the Chinese Academy of Sciences' Institute of Automation laid foundations in pattern recognition and knowledge representation respectively; Fang Binxing and Li Sheng of Harbin Institute of Technology cultivated Chinese-language information processing and speech recognition; and Zhihua Zhou (周志华) of Nanjing University would, in the 2000s and after, become one of the most internationally recognized Chinese AI scholars through ensemble-learning research and his widely used machine-learning textbook.
These were years of "thin budgets, scarce equipment, and rare opportunities to travel abroad," yet they established a tradition that endures: the cradle of Chinese AI lies in universities and national laboratories, not in companies.
II. The 1990s: Departments Take Root
Between 1996 and 2000, China's leading universities established standalone computer science departments or AI-focused programs. Tsinghua had founded its State Key Laboratory of Intelligent Technology and Systems within the Department of Computer Science back in 1985; the Institute of Computing Technology and the Institute of Automation of CAS continued to lead symbolic AI and pattern recognition; Harbin Institute of Technology — which had founded a computing major as early as 1956 — was the cradle of Chinese-language speech recognition; Peking University, Fudan, Shanghai Jiao Tong, Zhejiang, and Xi'an Jiao Tong each formed their own AI groups.
Zhihua Zhou's Nanjing team broke into the international community in the late 2000s through papers at ICML, NeurIPS, and AAAI; his Machine Learning textbook (popularly the "Watermelon Book," 2016) went on to educate an entire generation of Chinese ML beginners. In the late 1990s, Kai-Fu Lee (1961–) founded Microsoft Research Asia (MSRA, established in Beijing in 1998), and the lab went on to train many of the future core engineers of BAT and the Four Little Dragons of Computer Vision: Tang Xiaoou, Sun Jian, Kaiming He (1984–), Wang Jian, and Lin Bin all came out of MSRA. The industry calls MSRA "China's West Point of AI," and the name fits.
III. Early Industry: iFlytek, Hanwang, Hikvision
China's earliest examples of AI commercialization came from three undervalued companies.
In 1999, Liu Qingfeng led eighteen graduate students from the speech lab at the University of Science and Technology of China to found iFlytek in Hefei, focused on Chinese speech synthesis and recognition. After nearly a decade of polishing in vertical markets such as English-language testing in K-12 schools and government meeting transcription, it listed on the Shenzhen Stock Exchange in 2008, becoming China's first publicly listed company with AI as its core business. Its contemporary Hanwang Technology built handwriting recognition and OCR, briefly topping the A-share market with its e-readers in 2010; Hikvision, starting from analog video surveillance, used the deep-learning wave after 2014 to transform into one of the world's largest video AI vendors.
These three companies lacked the "Silicon Valley narrative" sheen, but they built up China's first batches of engineering experience, customer contracts, and government relationships — three things harder to replicate than the model itself.
IV. The BAT Era: AI Inside Search, Commerce, and Social
If the cradle of America's industrial AI is Google, China's is BAT.
Baidu founded its Institute of Deep Learning (IDL) in 2013 and hired Andrew Ng (1976–) as Chief Scientist in 2014, when it also opened an AI lab in Silicon Valley. Ng left in 2017, but Baidu's labs produced and exported a cohort of leading talent, including Wang Jin, Yu Kai, Lin Yuanqing, and Lin Bin, who later became core figures at Horizon Robotics, Pony.ai, the Beijing Academy of AI (BAAI), and Mobvoi. Baidu itself developed search, maps, the Apollo autonomous-driving program, and the ERNIE/Wenxin large-model series, culminating in the 2023 launch of "Wenxin Yiyan."
Alibaba's main AI forces are DAMO Academy (founded 2017) and the Tongyi Lab. From 2023, the Qwen series rose as the most actively open-sourced model family from a major Chinese cloud provider, with Qwen2, Qwen2.5, and Qwen3 leading Hugging Face's monthly download charts in 2024–2025 alongside Meta's Llama.
Tencent's Youtu Lab focuses on computer vision, while AI Lab covers speech, NLP, and game AI; WeChat's smart customer service and QQ Browser's content understanding were early deployment grounds. Tencent released its Hunyuan large model in 2023 and open-sourced Hunyuan-Large in 2024.
The three share one trait: business is data, and data is fuel. They did not bet on AGI as OpenAI did; they made AI the water and electricity of everyday applications.
V. The Four Little Dragons of Computer Vision: Boom, IPO Wave, and Recession (2014–2024)
The years 2014 to 2018 were China's "vision era."
Investors dubbed four companies the "Four Little Dragons of Computer Vision": SenseTime (founded 2014 in Hong Kong by Tang Xiaoou and Xu Li), Megvii (founded 2011 in Beijing by Yin Qi, Tang Wenbin, and Yang Mu), YITU (founded 2012 in Shanghai by Zhu Long and Lin Chenxi), and CloudWalk (founded 2015 by Zhou Xi, spun out from the CAS Chongqing Institute). Within four years they collectively raised more than ten billion dollars; each peaked at a valuation above ten billion RMB; their products spread across face recognition, security, financial-identity verification, and "city brains."
From 2019 onward, the ceiling of this track became visible. The U.S. Entity List added SenseTime, Megvii, YITU, and others, tightening overseas compute and IPO routes. The 2021–2022 listing wave completed precariously (SenseTime listed on HKEX in December 2021, CloudWalk on Shanghai's STAR Market in 2022), and stock prices and valuations then dropped sharply. From 2023 to 2024, as large models rose, the pure-vision paradigm was further marginalized; "Four Little Dragons" gradually faded as a collective concept, and the four companies diverged: SenseTime bet on the "large device" platform, Megvii held to IoT vision, YITU contracted, and CloudWalk pivoted to industry-specific large models.
This was a complete industry cycle. What it left behind is not merely lessons but the first sizable cohort of Chinese engineers with deep training-pipeline experience.
VI. The Era of Large Models: China's Six AI Tigers
From the second half of 2023, Chinese large-model startups exploded. The "Six AI Tigers," as media named them, were:
- Zhipu AI (founded 2019 from Tsinghua), with the GLM series; valued at roughly twenty billion RMB in 2024 and continuing to push GLM-5 in 2025.
- Moonshot AI (founded 2023 by Yang Zhilin (杨植麟)), with the Kimi series, opening the consumer market with a record-long 2-million-character context; for a time its consumer user growth outpaced every domestic competitor.
- MiniMax (founded 2021 by Yan Junjie), with Hailuo AI and the abab series, running enterprise APIs and the international "Talkie" app in parallel.
- Baichuan AI (founded 2023 by Wang Xiaochuan), whose open-source Baichuan series quickly spread through academia.
- 01.AI / Lingyi Wanwu (founded 2023 by Kai-Fu Lee), with the Yi series briefly matching Llama 2 70B on English benchmarks.
- StepFun (founded 2023 by Jiang Daxin), with the Step series focused on multimodality.
These six companies together raised more than ten billion dollars in 2023–2024. They share a profile: founders mostly with BAT or MSRA backgrounds; valuations between twenty and fifty billion RMB; limited international press coverage. But they formed China's first cohort of startups devoted specifically to large models.
VII. ByteDance, Alibaba, and the Rise of DeepSeek (2024–2025)
From 2024, two new forces upended the large-model landscape.
ByteDance came late but came hard with Doubao. Riding Douyin's traffic, Doubao became the AI assistant with the most monthly active users in China by the second half of 2024; its Seedance and Seed1.5 models were sold bundled with Volcano Engine compute. Alibaba's Tongyi consolidated its developer base through the open-source approach described above. Tencent embedded Hunyuan into WeChat, QQ, and WeCom.
But the company that truly shook the world was DeepSeek.
DeepSeek, a subsidiary of the High-Flyer quant hedge fund, was founded only in 2023. In December 2024 it released DeepSeek-V3 (671B parameters, MoE), claiming a training cost of just $5.58 million, a figure that by the technical report's own account covers only the final training run and that drew widespread skepticism and curiosity overseas. In January 2025, DeepSeek-R1 was released and open-sourced, with reasoning performance matching OpenAI's o1 across multiple benchmarks. The "DeepSeek moment" arrived: NVIDIA lost nearly $600 billion in market value in a single day, the largest one-day loss for any company in U.S. market history; Silicon Valley's leading firms held emergency post-mortems. The significance of R1 was not only performance but paradigm: achieving near-H100 training efficiency on H800s under compute restriction, and releasing the model weights under the MIT license together with a detailed technical report, which put open-source large models back on the world stage.
In the second half of 2025, DeepSeek-V3.2 followed, and Time named DeepSeek among "the most influential AI companies of the year."
VIII. U.S.–China Decoupling: Entity List and Compute Lockdown
Since 2018, U.S. restrictions on Chinese AI have escalated.
In October 2019, the Department of Commerce added 28 Chinese entities to the Entity List, including SenseTime, Megvii, YITU, Hikvision, and iFlytek. In October 2022, BIS issued advanced-semiconductor export controls that barred sales of the A100 and H100 to China. In October 2023, the rules tightened, closing the loophole that had allowed the China-only A800 and H800. In December 2024, the rules expanded to HBM and certain wafer-fab equipment. In January 2025, the outgoing Biden administration issued the AI Diffusion Rule, which placed global compute flows under tiered controls; the second Trump administration rescinded that rule in May 2025 but kept the chip controls aimed at China in place.
China's response is two-track: on one side, domestic chips — Huawei Ascend 910B/910C, Cambricon MLU 590, Moore Threads MTT S4000, Hygon, Enflame, Biren — fill the gap; on the other, companies like DeepSeek prove that with constrained compute one can still approach top-tier model performance. The short-term pain is real; the long-term verdict has not yet been written.
IX. Domestic Compute Infrastructure: From "Choke Point" to "New Stack"
Between 2024 and 2026, the domestic AI compute stack entered a substantive phase.
Huawei Ascend 910C, released in 2024, reportedly reached roughly sixty percent of H100's per-card performance, and its CloudMatrix 384 cluster was marketed on system-level rather than per-card performance. Cambricon turned its first quarterly profit in late 2024, with MLU 590 shipping in volume to Baidu and ByteDance. Moore Threads transitioned from gaming GPUs to large-model training and passed STAR-Market review in 2025. Sugon and Hygon, the "CAS family," took on dual workloads of supercomputing and AI servers.
On the software stack, PaddlePaddle (Baidu), MindSpore (Huawei), and OneFlow (acquired in 2023 by the startup Light Year Beyond) seek to substitute for PyTorch. On the model layer, Qwen, DeepSeek, GLM, InternLM, and Kimi are released in parallel on Hugging Face and ModelScope. An independent, but not yet self-sufficient, Chinese AI stack is taking shape.
X. Twin-Track Drive: Policy and Market
The greatest distinguishing feature of Chinese AI is that it was never purely a market story.
In 2017 the State Council issued the New Generation AI Development Plan, setting a "three-step" strategy with the goal of becoming a leading global AI innovation center by 2030. In 2024, "AI+" entered the Government Work Report for the first time; that same year, Beijing, Shanghai, Shenzhen, Hefei, Hangzhou, and Chengdu all launched local large-model subsidies and "compute vouchers." The local-government "compute voucher" model is, in essence, GPU time issued to startups as an instrument of industrial policy.
This policy-and-market twin track has produced China's speed, but also its distinctive risks: valuations rising and falling with policy cycles, resources concentrating excessively at top firms, and a relative underinvestment in long-term basic research. Yet by 2026, Chinese AI is no longer a "follower behind America" — it is a parallel system with its own technology stack, its own narrative, and its own market.
Historian's Note
I have observed forty years of Chinese AI and feel its weight. When the four elders petitioned Deng in 1986, the country had hardly a usable computer; today, when DeepSeek launches a model, NASDAQ shudders. This is not the work of a single day but the harvest of generations. And the path differs deeply from America's: American AI begins in the academy, grows in capital, and ripens in the market; Chinese AI begins in the state, grows inside business, and ripens on the twin track of market and policy. BAT, through search and commerce, mass-trained AI engineers as if conscripting an army; the Four Little Dragons of Computer Vision pushed vision models to their engineering limits through security and finance; the Six Tigers and DeepSeek, through open-sourcing and ruthless compute optimization, inscribed "Chinese-style efficiency" into the world's published papers. There are also hidden anxieties: valuations swing with policy, research follows product cadence, and long-cycle, low-return, high-risk basic work attracts few. Sima Qian wrote of "tracing the changes through past and present" — that change is now before our eyes. The tighter the blockade, the fiercer the breakthrough; but blockades both forge surprise armies and wear down patience. The watershed between catching up and leading lies in whether China can, beyond the R1 moment forced out by pressure, also produce its own "Dartmouth meeting" — a group of people sitting down without instrumental purpose to ask a question that will only matter in a future they will not see.
References
- State Council of the People's Republic of China. (2017). New Generation Artificial Intelligence Development Plan. State Council Document No. 35 [2017].
- Zhou, Z. (2016). Machine Learning [in Chinese]. Tsinghua University Press.
- China Academy of Information and Communications Technology. (2024). White Paper on AI Development 2024.
- DeepSeek-AI. (2024). DeepSeek-V3 Technical Report. arXiv:2412.19437.
- DeepSeek-AI. (2025). DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. arXiv:2501.12948.
- Alibaba Qwen Team. (2024). Qwen2 Technical Report. arXiv:2407.10671.
- U.S. Department of Commerce, BIS. (2022–2024). Export Controls on Advanced Computing and Semiconductor Items.
- Lee, K.-F. (2018). AI Superpowers: China, Silicon Valley, and the New World Order. Houghton Mifflin Harcourt.