Biography · Ilya Sutskever
He sat at the centre of three eras — AlexNet, Seq2Seq, and GPT — and at the centre, too, of the November 2023 weekend that held the world's breath. He says little; he does much.
From Nizhny Novgorod to Jerusalem to Toronto
Ilya Sutskever (1986–) was born in Nizhny Novgorod (then Gorky), in the Soviet Union. When he was five, with the Soviet Union on the eve of collapse, his family emigrated to Israel.
He finished secondary school in Jerusalem, acquiring along the way the dual rigour of Hebrew and mathematics. Around the age of sixteen, the family moved again — this time to Toronto, Canada.
At the University of Toronto he completed bachelor's and master's degrees, and naturally enough walked into the small lab on campus that was about to change the world: the group of Geoffrey Hinton (1947–).
Through his doctorate from 2008 to 2012 he sat at the centre of those few who still believed deep neural networks would succeed. He has said in interview: "Most people thought neural networks were a dead end. We did not."
2012: An Eight-Layer Network That Woke the World
2012 was the turning point of computer vision. At that year's ImageNet Large Scale Visual Recognition Challenge (ILSVRC), AlexNet won by a margin no one had seen before: a top-5 error of 15.3 percent, against roughly 26 percent for the runner-up. The network was supervised by Hinton and coded and implemented by Alex Krizhevsky (1986–), with Sutskever's help in algorithm design.
It was the first time a deep convolutional network had crushed every traditional method on a general visual benchmark. Faced with that error-rate bar chart, the mood of the entire field changed almost overnight.
Soon afterwards, the three founded a small company, DNNresearch: no product, just a few people and a few GPUs. In March 2013, Google acquired the company, and Sutskever joined Google Brain.
2014: Teaching Machines to Translate
At Google Brain came the second great work. In 2014, with Oriol Vinyals and Quoc V. Le, Sutskever published "Sequence to Sequence Learning with Neural Networks" — soon known as Seq2Seq.
The core idea was plain and sharp: use one LSTM to read a source sentence into a vector, then use another LSTM to write that vector out into a target sentence.
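The data flow of that idea can be sketched in a few lines. This is a minimal illustration under loud assumptions, not the paper's implementation: a plain tanh recurrent cell stands in for the LSTM, the weights are random and untrained, decoding is greedy, and every name here (`init_params`, `encode`, `decode`, the vocabulary sizes, the `bos_id` and `eos_id` tokens) is hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(index, size):
    """Represent a token id as a one-hot vector."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def rnn_step(w_in, w_rec, x, h):
    """One step of a plain tanh recurrent cell (a stand-in for an LSTM cell)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]


def init_params(src_vocab, tgt_vocab, hidden, seed=0):
    """Hypothetical parameter set: separate encoder and decoder weights."""
    rng = random.Random(seed)
    return {
        "enc_in": make_matrix(hidden, src_vocab, rng),
        "enc_rec": make_matrix(hidden, hidden, rng),
        "dec_in": make_matrix(hidden, tgt_vocab, rng),
        "dec_rec": make_matrix(hidden, hidden, rng),
        "out": make_matrix(tgt_vocab, hidden, rng),
    }


def encode(src_ids, params):
    """Read the source sequence token by token into one fixed-size vector."""
    hidden = len(params["enc_rec"])
    src_vocab = len(params["enc_in"][0])
    h = [0.0] * hidden
    for tok in src_ids:
        h = rnn_step(params["enc_in"], params["enc_rec"], one_hot(tok, src_vocab), h)
    return h


def decode(context, params, bos_id, eos_id, max_len):
    """Write the vector out as a target sequence, one greedy token at a time."""
    tgt_vocab = len(params["out"])
    h, prev, output = context, bos_id, []
    for _ in range(max_len):
        h = rnn_step(params["dec_in"], params["dec_rec"], one_hot(prev, tgt_vocab), h)
        scores = matvec(params["out"], h)
        prev = max(range(tgt_vocab), key=scores.__getitem__)
        if prev == eos_id:
            break  # the decoder decided the sentence is finished
        output.append(prev)
    return output


params = init_params(src_vocab=6, tgt_vocab=5, hidden=8)
context = encode([3, 1, 4], params)  # whole source sentence -> one vector
print(len(context))
print(decode(context, params, bos_id=0, eos_id=1, max_len=10))
```

With untrained weights the output tokens are meaningless; the point is only the shape of the computation, in which the sole channel between the two networks is the single vector returned by `encode`. A trained system would learn the weights from sentence pairs, and the original paper additionally reports that reversing the order of the source words helped, which this sketch omits.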
It was one of the foundational papers of end-to-end neural machine translation. It paved the way for the full neural overhaul of Google Translate in 2016, lifting "translation" out of the patchwork of phrase tables into something with the coherence of discourse.
Seq2Seq was later extended to dialogue, summarisation, code generation, and speech synthesis; it taught machines a general capacity to "take in a sequence and produce a sequence." Today's ChatGPT is built on Transformers rather than LSTMs, but its sequence-in, sequence-out framing descends from that old framework.
2015: Chief Scientist of OpenAI
In December 2015, Sam Altman (1985–), Elon Musk, and others co-founded OpenAI, declaring its mission to be "to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return."
Sutskever gave up his salary and platform at Google Brain to join as co-founder and head of research, later officially Chief Scientist.
Inside OpenAI he steered the entire main road toward large models: GPT-1 in 2018 validated the route of unsupervised pre-training on vast text followed by supervised fine-tuning; GPT-2 in 2019 made model scale a public ethical concern for the first time; GPT-3 in 2020 pushed parameters to 175 billion and turned prompting into a new human-machine interface; ChatGPT at the end of 2022 and GPT-4 in 2023 carried "dialogue as interface" into nearly every industry.
At every key technical fork — whether to bet on scaling, whether to embrace RLHF, whether to go multimodal — his colleagues took Sutskever's judgement as anchor.
He is not a celebrity scientist. In public he speaks sparingly, often pausing for long stretches between sentences, as if listening to some inner echo.
In interviews a near-religious cadence sometimes surfaces: "We are creating something more intelligent than ourselves." This is not marketing rhetoric but a concern he has carried for many years.
November 2023: A Five-Day Coup
On the afternoon of 17 November 2023, OpenAI's board, without notifying the great majority of employees in advance, announced that Sam Altman had been removed as chief executive. The board's wording was terse and severe: Altman had been "not consistently candid in his communications with the board." Soon after the statement appeared, public attention fixed on a single director: Sutskever.
The next five days were among the most dramatic in the industry's history. Within days, Microsoft announced it would absorb Altman into a new AI laboratory; over seven hundred OpenAI employees signed an open letter threatening collective resignation in support of him; investors, journalists, and regulators were all dragged into the storm.
On 20 November, Sutskever wrote publicly on X: "I deeply regret my participation in the board's actions. I never intended to harm OpenAI." In the early hours of 22 November, Altman returned as CEO, the board was reconstituted, and Sutskever stepped off it. The true cause of the storm (whether it was a real disagreement over safety, a fundamental governance failure, or a collapse of trust between a few people) has not been fully accounted for by any side.
After the storm, Sutskever almost vanished from view inside OpenAI. For half a year, he did not appear at any major product launch.
2024: To Pursue Safe Superintelligence
On 14 May 2024, Sutskever announced his departure from OpenAI on social media. The post was short and restrained — no accusations, no self-defence, only: "The company's trajectory has been nothing short of miraculous."
A month later, on 19 June 2024, he founded Safe Superintelligence Inc. (SSI) with his former OpenAI colleague Daniel Levy and the former Apple AI executive Daniel Gross. The company has a single product direction, which Sutskever summed up in one sentence when announcing it: "We will pursue safe superintelligence in a straight shot, with one focus, one goal, and one product." In September 2024, SSI raised one billion US dollars before shipping any model; in 2025 a subsequent round took its valuation past thirty billion.
Sutskever still rarely gives interviews. His collaborators describe his daily life: "He is like a monk. What he thinks about each day is not the quarterly OKRs but the thing that may, ten years from now, be smarter than us."
Selected Works
| Year | Work | Significance |
|---|---|---|
| 2012 | Krizhevsky, Sutskever & Hinton, "ImageNet Classification with Deep Convolutional Neural Networks", NeurIPS | AlexNet; the dawn of deep-learning vision |
| 2014 | Sutskever, Vinyals & Le, "Sequence to Sequence Learning with Neural Networks", NeurIPS | Seq2Seq; foundation of modern neural translation and dialogue |
| 2018 | Radford, Narasimhan, Salimans & Sutskever, "Improving Language Understanding by Generative Pre-training" | GPT-1; established the pre-train / fine-tune paradigm |
| 2019 | Radford et al., "Language Models are Unsupervised Multitask Learners" | GPT-2; revealed zero-shot capabilities of scaled pre-training |
| 2020 | Brown, ..., Sutskever, ..., "Language Models are Few-Shot Learners", NeurIPS | GPT-3; turned prompting into a new interface |
Historian's Note
Sutskever is the quietest central figure in the modern history of AI. He is no popular-science star, no celebrity founder; he hardly writes blogs and cares little for how he is cited. Yet at the three pivotal turns of deep learning — vision, sequence, language — he sits at the middle of the photograph: AlexNet was the work of teacher and fellow-students together; Seq2Seq was written down at Google Brain; the GPT line was guarded at OpenAI. Between his silence and his influence stands one of the rarest contrasts of this age. The story of those five days in November 2023 has yet to be told whole by any party — it may have been a real disagreement over safety, a real failure of governance, or simply that ancient, human collapse of judgement and trust. Whichever it was, Sutskever paid a price: he lost the company he had helped to found, and he lost a circle of close colleagues. Yet he did not stop. In a moment when he could still have chosen ease and power, he did the rarer thing: he started another company with a single goal, and named it "Safe Superintelligence." This is a man who writes his belief into the name of his company. Sima Qian would understand such a man — he does not live for today's applause; he lives for an hour that has not yet come.
References
- Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). "ImageNet Classification with Deep Convolutional Neural Networks." NeurIPS, 25.
- Sutskever, I., Vinyals, O., & Le, Q. V. (2014). "Sequence to Sequence Learning with Neural Networks." NeurIPS, 27.
- Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). "Improving Language Understanding by Generative Pre-training." OpenAI Technical Report.
- Brown, T., et al. (2020). "Language Models are Few-Shot Learners." NeurIPS, 33.
- OpenAI Board of Directors (2023, November 17). "OpenAI announces leadership transition." openai.com/blog
- Sutskever, I. (2023, November 20). Post on X (formerly Twitter) regarding the OpenAI board action.
- Sutskever, I. (2024, May 14). Personal announcement on X (formerly Twitter) regarding departure from OpenAI.
- Safe Superintelligence Inc. (2024, June 19). "Safe Superintelligence Inc." Founding statement, ssi.inc.
- Metz, C. (2024). Genius Makers: The Mavericks Who Brought AI to Google, Facebook, and the World. Dutton (revised edition).
- Time (2023). "Person of the Year runners-up: The OpenAI Five Days."