Biography · Geoffrey Hinton
While others chased the tides of the academy, he kept watch over a stretch of marshland no one else valued — and forty years later the sea rose at his feet.

A Child of a Mathematical Family
Geoffrey Hinton (1947–) was born in Wimbledon, England, into a family of names that turn heads. His great-great-grandfather was George Boole, one of the founders of modern logic; every "true / false" in modern code still echoes that Victorian mathematician's name. His great-grandfather was Charles Howard Hinton, the nineteenth-century thinker who coined the word "tesseract" for the four-dimensional hypercube. His father's cousin Joan Hinton was a nuclear physicist who worked on the Manhattan Project before moving to China to raise dairy cows. His father Howard Hinton was an entomologist who specialised in beetles.
In a household like that, scholarship was almost the default. But the path Geoffrey took was not smooth. At King's College, Cambridge he switched from physics to philosophy, then to psychology, finally taking his bachelor's degree in Experimental Psychology in 1970. He later said one question had haunted him from the start: how does the brain actually work? No discipline had an answer — physics was too abstract, philosophy too airy, psychology only described behaviour without explaining the mechanism. So he decided to look for the answer himself.
In 1972 he entered the University of Edinburgh as a doctoral student in Artificial Intelligence under Christopher Longuet-Higgins, a theoretical chemist who had himself worked on neural-network models of memory but had recently turned away from them. It was the high noon of symbolic AI; almost no one in British AI took seriously the idea of building intelligence out of small neuron-like units. His supervisor advised him to pick a more respectable subject. Hinton refused. In 1978 he received his PhD with a thesis titled Relaxation and Its Role in Vision, a textbook piece of connectionism.
The Boltzmann Machine and a Strange Wager
After his doctorate Hinton went to the University of California, San Diego as a postdoc, joining David Rumelhart's Parallel Distributed Processing (PDP) group, the spiritual headquarters of 1980s connectionism. There he met two collaborators who would shape the rest of his life: the psychologist Rumelhart and the neuroscientist Terry Sejnowski.
In 1983, Hinton and Sejnowski proposed the Boltzmann Machine, a network of stochastic binary neurons whose behaviour was described in terms of an energy function borrowed from statistical physics. It was the first time Hinton imported the language of thermodynamics into neural networks: the network had a "temperature" and an "energy"; left to run, it settled into an equilibrium distribution that favoured low-energy states, and learning meant adjusting the connection weights until that distribution matched the data. This bridge between artificial neural networks and statistical physics would, four decades later, become the explicit reason for his Nobel Prize in Physics.
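The energy-and-temperature picture can be made concrete with a minimal sketch. It assumes the usual textbook form of the energy, with symmetric weights, a zero diagonal and one bias per unit; the function names (`energy`, `on_probability`, `gibbs_sweep`) are illustrative choices and are not taken from the original papers.

```python
import math
import random

def energy(state, weights, biases):
    """Energy of a binary state: E(s) = -sum_{i<j} w_ij s_i s_j - sum_i b_i s_i."""
    n = len(state)
    pair_term = sum(
        weights[i][j] * state[i] * state[j]
        for i in range(n) for j in range(i + 1, n)
    )
    bias_term = sum(b * s for b, s in zip(biases, state))
    return -pair_term - bias_term

def on_probability(i, state, weights, biases, temperature=1.0):
    """Probability that unit i switches on, given the current state of the others."""
    # Energy gap between unit i being off and unit i being on.
    gap = biases[i] + sum(
        weights[i][j] * state[j] for j in range(len(state)) if j != i
    )
    return 1.0 / (1.0 + math.exp(-gap / temperature))

def gibbs_sweep(state, weights, biases, temperature=1.0, rng=random):
    """Resample every unit once, in place, from its conditional distribution."""
    for i in range(len(state)):
        p_on = on_probability(i, state, weights, biases, temperature)
        state[i] = 1 if rng.random() < p_on else 0
    return state
```

Repeated sweeps at a fixed temperature visit low-energy states more often than high-energy ones; raising the temperature flattens that preference. The sketch covers only the sampling side and leaves out the weight-update rule.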
In 1986, with Rumelhart and Ronald Williams as co-authors, he published "Learning Representations by Back-Propagating Errors" in Nature. Strictly speaking, the paper was not the first statement of backpropagation: Paul Werbos (1947–) had written down an equivalent algorithm in his 1974 Harvard PhD thesis, and the underlying use of the chain rule went back further still, to 1960s control theory. What this paper did, that no one had done before, was to show with a clean set of experiments that backpropagation could learn meaningful internal representations in multi-layer networks. Family-tree networks, parity, symmetry recognition: one small problem after another fell. The writing was crisp and forceful, and many readers learned for the first time what backpropagation actually was.
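For readers who want to see what the algorithm computes, here is a minimal sketch of backpropagation on a tiny network with one hidden layer. The setup is an assumption made for illustration (sigmoid units, squared error, no bias terms, a single scalar output) and does not reproduce the paper's experiments; all function names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Return the hidden activations and the scalar output for one input vector."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def loss(x, target, w_hidden, w_out):
    """Squared error for a single training example."""
    _, output = forward(x, w_hidden, w_out)
    return 0.5 * (output - target) ** 2

def backprop(x, target, w_hidden, w_out):
    """Gradients of the loss with respect to both weight layers, via the chain rule."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output unit (derivative w.r.t. its pre-activation).
    delta_out = (output - target) * output * (1.0 - output)
    grad_out = [delta_out * h for h in hidden]
    # Send the error signal backwards through the output weights.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    grad_hidden = [[d * xi for xi in x] for d in delta_hidden]
    return grad_hidden, grad_out
```

A simple way to check such code is to compare each returned gradient with a finite-difference estimate of `loss`; the two should agree to several decimal places. Training then consists of repeatedly subtracting a small multiple of these gradients from the weights.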
He was thirty-eight, already a professor at Carnegie Mellon University. But he disliked the heavy dependence of CMU's AI research on Pentagon funding. In 1987 he moved his family north to the University of Toronto, somewhere quieter, where he would not have to serve the military. He later joked that he had refused to help build "killing machines" then, only to be worrying forty years on about whether machines should be allowed to kill at all.
A Class No One Came To
The next dozen years in Toronto were not easy. In the 1990s the support vector machine and the statistical learning theory of Vladimir Vapnik (1936–) marched in triumph through industry and academia. Neural networks were dismissed as a black box, an art of parameter twiddling, theoretically unserious. Reviewers rejected papers on sight if "neural network" appeared in the title; Hinton's own students gave their networks other names just to get published.
He kept teaching, kept supervising, kept pursuing the research everyone laughed at. He worked on Restricted Boltzmann Machines (RBMs), Contrastive Divergence (CD) and Deep Belief Networks (DBNs), directions on which only he and a few stubborn allies still spent their time. In 2004 he became the director of the Canadian Institute for Advanced Research's Neural Computation and Adaptive Perception (NCAP) programme, with about ten million Canadian dollars a year, gathering the world's few remaining believers, among them Yann LeCun (1960–), Yoshua Bengio (1964–) and Andrew Ng, for regular meetings in small Canadian mountain towns. They jokingly called themselves a club for the unfashionable.
There is a detail strangers seldom know. From his thirties Hinton has suffered from a severe disc problem in his back, and cannot sit for long. He often listens to talks standing up, and writes papers with his laptop perched on a bookshelf. There is a tall standing desk built specifically for him in his Toronto office. A marginalised school of thought, a scholar who could not sit — together they endured twenty Canadian winters.
2006 and 2012: Two Ignitions
In 2006, Hinton and his doctoral student Ruslan Salakhutdinov published "Reducing the Dimensionality of Data with Neural Networks" in Science; the same year, with Simon Osindero and Yee-Whye Teh, he published "A Fast Learning Algorithm for Deep Belief Nets" in Neural Computation. The two papers introduced greedy layer-wise pre-training followed by fine-tuning, one of the first practical recipes for training deep neural networks effectively: each layer is first trained on its own, without labels, to model the output of the layer below, and the whole stack is then fine-tuned together. The very phrase "deep learning" began to be deliberately propagated by Hinton and his circle in this period, a conscious effort to escape the prejudices that had clung to "connectionism" and "neural networks" through the long winter.
The first ignition did not immediately become a wildfire. The true tipping point came in 2012. That summer two of his PhD students, Alex Krizhevsky (1986–) and Ilya Sutskever (1986–), used two NVIDIA GTX 580 graphics cards to train an eight-layer convolutional network, later known as AlexNet. On 13 October the ImageNet results were announced: AlexNet had reached a top-5 error rate of 15.3%, against 26.2% for the second-best entry. Overnight, the entire computer-vision community changed its religion. Late in 2012 Hinton, Sutskever and Krizhevsky founded a tiny company called DNNresearch; in March 2013 Google announced it had bought it for roughly 44 million US dollars, a company of three people with no product. From then on, Hinton spent one quarter of his time at Google and three quarters at the University of Toronto.
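The two error rates quoted above, 26.2% and 15.3%, are easier to appreciate with the metric spelled out. The sketch below assumes the common definition of top-5 error (a prediction counts as correct if the true label appears among the model's five highest-ranked guesses); the helper names are illustrative.

```python
def top5_error(ranked_predictions, true_labels):
    """Fraction of examples whose true label is missing from the five top-ranked guesses."""
    misses = sum(
        1
        for guesses, label in zip(ranked_predictions, true_labels)
        if label not in guesses[:5]
    )
    return misses / len(true_labels)

def relative_reduction(baseline_error, new_error):
    """Share of the baseline's errors that the new system eliminates."""
    return (baseline_error - new_error) / baseline_error

# Going from 26.2% to 15.3% removes roughly two fifths of the errors.
print(round(relative_reduction(26.2, 15.3), 3))  # 0.416
```

An absolute gap of about eleven percentage points therefore corresponds to a relative reduction of roughly 42%.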
He later said the success of deep learning had not surprised him: "I always knew it would work. I just didn't expect to wait thirty years."
The Spring He Left Google
On 1 May 2023, Hinton announced in The New York Times that he was leaving Google. The reason was not friction with the company — he repeatedly emphasised that Google had behaved "very responsibly" on AI safety — but his wish to "speak freely about the dangers of AI."
For decades he had believed that digital neural networks and biological brains were two implementations of the same kind of intelligence and would eventually converge. The arrival of GPT-4 changed his judgement. Digital intelligence might not only catch up with biological intelligence but might overtake it far faster, because it could be replicated without limit, learn in parallel, and synchronise its knowledge instantaneously. In interview after interview he repeated a single sentence: "I now believe digital intelligence will surpass biological intelligence; I am not sure we can pass through that transition safely."
Many wondered why a lifelong believer would turn at the height of his fame. His explanation was plain: as a scientist, when the evidence changes, the conclusion must change. His age had stripped him of professional self-interest, and he could speak the truth. He was seventy-five. There was no time left to defer it.
A Nobel Prize and the Echo of an Apple
On 8 October 2024, the Royal Swedish Academy of Sciences announced the Nobel Prize in Physics for that year, awarded jointly to John Hopfield and Hinton, "for foundational discoveries and inventions that enable machine learning with artificial neural networks." On the live announcement call, Hinton's first sentence was, "I had absolutely no expectation of this." Asked what he most wanted to say, he replied: "I wish I could tell my supervisor Christopher Longuet-Higgins, who told me back then to stop working on neural networks."
It was the first time the Nobel Prize had formally counted artificial neural networks as part of physics. The citation singled out the Boltzmann Machine — his work with Sejnowski in 1983. What earned him the prize was not the AlexNet that astonished the world in 2012, but the lonely path he had walked in the 1980s, on a cold bench almost no one else was willing to share.
His Nobel lecture in Stockholm was titled Boltzmann Machines. Among the audience were physicists, chemists, doctors and diplomats, many of whom knew nothing of his subject. He began with free-energy functions and energy landscapes, and turned forty years of solitary work by a child of a mathematical family into a story about stitching statistical physics into the study of the mind.
Selected Works
| Year | Work | Significance |
|---|---|---|
| 1983 | Hinton & Sejnowski, "Optimal Perceptual Inference" | The Boltzmann Machine; introducing a statistical-physics view |
| 1986 | Rumelhart, Hinton & Williams, "Learning Representations by Back-Propagating Errors", Nature | Turned a forgotten algorithm into the engine of deep learning |
| 2006 | Hinton, Osindero & Teh, "A Fast Learning Algorithm for Deep Belief Nets", Neural Computation | Layer-wise pre-training; the kindling of the deep-learning revival |
| 2006 | Hinton & Salakhutdinov, "Reducing the Dimensionality of Data with Neural Networks", Science | Brought neural networks back into mainstream journals |
| 2012 | Krizhevsky, Sutskever & Hinton, "ImageNet Classification with Deep Convolutional Neural Networks", NeurIPS | AlexNet; the dawn of the deep-learning era |
| 2015 | Hinton, Vinyals & Dean, "Distilling the Knowledge in a Neural Network" | Knowledge distillation; one paradigm of model compression |
| 2017 | Sabour, Frosst & Hinton, "Dynamic Routing Between Capsules" | Capsule networks; a reflection on the limits of CNNs |
Historian's Note
Common opinion ascribes the rise of deep learning to the bounty of the GPU, the fortune of vast data, and the largesse of Google. These are but outward causes. The inward cause is that Hinton stood alone on one line for thirty years, sitting on a cold bench while the mainstream laughed. From the backpropagation paper of 1986 to the deep belief networks of 2006 and the AlexNet of 2012, the intervals — twenty years, then six — are not the waiting of technology but the waiting of a scholar's will. The hardship of learning lies not in some momentary cleverness but in long years of stubbornness. In half a century of research, Hinton failed far more often than he succeeded; his papers were rejected, his work ignored, his colleagues told him to change fields, and he refused. By his seventieth year, the tide had reversed and drowned all those who had once mocked him. Yet at the moment of his greatest vindication he did not grow complacent; he stepped forth in 2023 to warn that the fire he himself had lit might burn the forest down. This is a rare scholarly virtue: to bear two decades of obscurity and to keep one's head clear at the very pinnacle. Boole bequeathed logic to his descendants; Hinton has bequeathed learning. Twice a single bloodline has set boundary stones in the history of human thought. The coincidence is no coincidence: in both generations, the impulse was the same — to treat the mind as something that can be precisely described, and to spend a lifetime on a wager that everyone else thinks unserious.
References
- Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). "Learning Representations by Back-Propagating Errors." Nature, 323(6088), 533–536.
- Ackley, D. H., Hinton, G. E., & Sejnowski, T. J. (1985). "A Learning Algorithm for Boltzmann Machines." Cognitive Science, 9(1), 147–169.
- Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). "A Fast Learning Algorithm for Deep Belief Nets." Neural Computation, 18(7), 1527–1554.
- Hinton, G. E., & Salakhutdinov, R. R. (2006). "Reducing the Dimensionality of Data with Neural Networks." Science, 313(5786), 504–507.
- Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). "ImageNet Classification with Deep Convolutional Neural Networks." NeurIPS 2012.
- Metz, C. (2021). Genius Makers: The Mavericks Who Brought AI to Google, Facebook, and the World. Dutton.
- Sejnowski, T. J. (2018). The Deep Learning Revolution. MIT Press.
- Metz, C. (2023, May 1). "The Godfather of A.I. Leaves Google and Warns of Danger Ahead." The New York Times.
- The Royal Swedish Academy of Sciences (2024). "The Nobel Prize in Physics 2024 – Press Release & Scientific Background." https://www.nobelprize.org/prizes/physics/2024/
- Hinton, G. E. (2024). "Boltzmann Machines." Nobel Lecture, Stockholm.