House · University of Toronto / Vector Institute

Hinton kept the lone island of backpropagation alive here for twenty years, and here he lit the bonfire of 2012 that set the world ablaze. Toronto is, in the truest sense, the holy city of the deep-learning revolution.

Hinton Heads North

In 1987, Geoffrey Hinton (1947–) left Carnegie Mellon University in the United States and moved north to Canada to take a professorship in the Department of Computer Science at the University of Toronto. The immediate reason carried the political flavor of the Cold War: CMU's research funding leaned heavily on the Pentagon, and Hinton did not want his work pulled toward military projects. Toronto promised academic freedom and a stable post, and he packed up his family and went.

He was less than a year past co-authoring the landmark backpropagation paper with David Rumelhart (1942–2011) and Ronald Williams (1948–) in 1986. Only later would the true significance of the move become clear: across the next two decades, mainstream North American AI (expert systems, SVMs, Bayesian methods, statistical NLP) gave neural networks little serious thought, and Toronto was one of the few remaining strongholds for that line of work in the English-speaking world. Yann LeCun (1960–) spent 1987–1988 in Toronto as a postdoc, half student and half friend; Yoshua Bengio (1964–) also visited briefly. The friendship and intellectual lineage of the "Canadian three" were rooted here.

A Spark in the Winter

In the 1990s, neural networks entered their second winter. The rise of SVMs all but cleared the application space; reviewers were known to reject papers simply because the title contained "neural network." Hinton's student Brendan Frey would later recall those years as "being shut out of the mainstream"—"we had to call neural networks something else just to get the paper through."

Hinton did not give up in Toronto. He kept working on Restricted Boltzmann Machines (RBMs), Contrastive Divergence (CD), and Deep Belief Networks (DBNs), research that the field of the time had largely dismissed. He trained a generation of students and postdocs who would later shape all of deep learning:

  • Brendan Frey (graphical models and biomedical AI; later founded Deep Genomics)
  • Radford Neal (Bayesian neural networks and MCMC)
  • Yann LeCun (postdoc, 1987–1988 in Toronto)
  • Yee-Whye Teh (deep learning and Bayesian non-parametrics)
  • Ilya Sutskever (later co-founder and chief scientist of OpenAI)
  • Ruslan Salakhutdinov (later CMU professor and Apple's first AI director)
  • Alex Krizhevsky (first author of AlexNet)
  • George Dahl (co-author of the speech-recognition breakthrough; later at Google Brain)
  • Graham Taylor (University of Guelph; Vector Institute Canadian Scientific Director)
  • Ian Goodfellow (a brief visitor; closer to Bengio, but Hinton was a major influence)

Beginning in 2004, Hinton led the Canadian Institute for Advanced Research (CIFAR) program on Neural Computation and Adaptive Perception (NCAP). The program provided about 10 million Canadian dollars a year and gathered, in small mountain towns in Canada, the few researchers worldwide who still believed in neural networks, among them LeCun, Bengio, Andrew Ng, and Yann Dauphin. CIFAR's small "shelter" is now widely credited with keeping the deep-learning flame from being blown out in the winter.

2006: The Fuse of the Renaissance

On July 28, 2006, Science published "Reducing the Dimensionality of Data with Neural Networks," a paper by Hinton and Salakhutdinov. Earlier the same year, Hinton, Simon Osindero, and Teh had published "A Fast Learning Algorithm for Deep Belief Nets" in Neural Computation, introducing a two-stage method, greedy layer-wise pre-training followed by fine-tuning, that for the first time made deep neural networks practical to train.
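
The pre-training half of that two-stage recipe can be illustrated with a toy sketch. The code below is not the authors' implementation: it assumes binary units, full-batch updates, one-step contrastive divergence (CD-1) with mean-field probabilities in the negative phase, and hypothetical names (`RBM`, `cd1_step`, `pretrain_stack`) chosen only for illustration. Each RBM is trained on the activations of the layer below it; the supervised fine-tuning stage is left out.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Toy binary restricted Boltzmann machine trained with CD-1."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)  # visible biases
        self.b_h = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        """One full-batch CD-1 update; returns the mean squared reconstruction error."""
        # Positive phase: hidden probabilities given the data, then a binary sample.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: a single Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        # Approximate gradient: data correlations minus reconstruction correlations.
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))


def pretrain_stack(data, layer_sizes, epochs=50, lr=0.1, seed=0):
    """Greedy layer-wise pre-training: train one RBM, then feed its
    hidden activations to the next RBM as if they were data."""
    rng = np.random.default_rng(seed)
    rbms, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x, lr)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)  # input for the layer above
    return rbms
```

A call such as `pretrain_stack(batch, [256, 64])`, where `batch` is a hypothetical array of values in [0, 1] with one example per row, returns one trained RBM per layer. In the full method the stacked weights would then initialize a deep network that is fine-tuned end to end.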

These two papers are now widely treated as the fuse of the deep-learning renaissance. They were not just a technical breakthrough but a public stance: on the pages of mainstream journals, Hinton told the world that neural networks still had a future. He and his collaborators deliberately favored the phrase "deep learning" over "connectionism" or "neural networks" to sidestep the prejudices of the winter.

In 2009 Hinton, with PhD students George Dahl and Abdel-rahman Mohamed, applied deep neural networks to speech recognition and dropped the error rate dramatically on the TIMIT benchmark. In 2010 Li Deng of Microsoft Research invited Hinton to apply the work to industrial-scale data; the result was the historic 2011 drop in Microsoft's speech-recognition error rate—the first sign of industry's turn toward deep learning.

AlexNet, and That Summer

In the summer of 2012, in an ordinary graduate office in the University of Toronto's computer science department, Hinton's two PhD students Alex Krizhevsky (1986–) and Ilya Sutskever (1986–) were training a neural network on two NVIDIA GTX 580 GPUs. Krizhevsky himself wrote the CUDA kernels; the target was the ImageNet Large Scale Visual Recognition Challenge (ILSVRC-2012).

On October 13 the results came in: AlexNet's top-5 error rate on ImageNet was 15.3%, against 26.2% for the second-place team (built on traditional computer-vision feature engineering), a gap of 10.9 points. The vision community went into uproar: no winner of the challenge, which had run only since 2010, had pulled away by such a margin.

At NeurIPS 2012 in December at Lake Tahoe, Hinton appeared in his trademark wool vest as the three co-presented AlexNet. Researchers who walked out of that hall remembered, "By the end of the meeting, we all knew the rules of the game had changed."

In late 2012 Hinton, Sutskever, and Krizhevsky set up a company called DNNresearch and put it up for auction. Google won with a bid of about 44 million dollars for a company of three people with no product, and the acquisition was announced in March 2013. From then on Hinton split his time "one-quarter at Google, three-quarters in Toronto." The deal also kicked off Silicon Valley's frenzied competition for deep-learning talent: Facebook brought in LeCun, Baidu poached Andrew Ng with an extraordinary salary, and Google bought DeepMind for a reported 600 million dollars.

The Vector Institute Stands Up

On March 30, 2017, the Vector Institute for Artificial Intelligence opened in Toronto's MaRS Discovery District. The Ontario provincial government contributed 50 million Canadian dollars, the federal government's "Pan-Canadian AI Strategy" contributed 40 million, and more than thirty founding companies (including Google, NVIDIA, Canada's five largest banks, Shopify, Thomson Reuters, and Magna) contributed 80 million—a total of about 170 million Canadian dollars.

Hinton became Chief Scientific Advisor; Richard Zemel of Toronto's CS department was Director of Research; faculty from other universities, including Pascal Poupart of Waterloo and researchers from McMaster, joined in. Vector is not part of any single university but an independent non-profit research institute, with faculty cross-appointed at Toronto, Waterloo, Guelph, and others.

The aim of founding Vector was to keep Canadian-trained AI talent from migrating en masse to Silicon Valley. Raquel Urtasun (autonomous driving; later founded Waabi), Jimmy Ba (a co-author of the Adam optimizer), Sanja Fidler (computer vision; jointly appointed with NVIDIA), David Duvenaud (Bayesian deep learning), and Roger Grosse (neural-network optimization) stayed in Toronto. By 2025, Vector had about 130 faculty researchers and nearly 1,000 students and postdocs.

Vector is one corner of Canada's "AI iron triangle," along with Mila in Montreal, led by Yoshua Bengio, and Amii (the Alberta Machine Intelligence Institute) in Edmonton, led by Richard Sutton and Michael Bowling. The three institutes work in concert under the Pan-Canadian AI Strategy.

The Nobel Prize and a Late-Career Stance

In May 2023, Hinton announced his resignation from Google. In quick succession he spoke to The New York Times, the BBC, and CBS's 60 Minutes, warning of risks from generative AI that ranged from disinformation and labor disruption to his conviction that "digital intelligence will, in the end, surpass biological intelligence." That stance set him visibly apart from a few peers (notably LeCun) and made him, in academia, the most credentialed voice of the "AI risk" school.

On October 8, 2024, the Royal Swedish Academy of Sciences announced that the year's Nobel Prize in Physics would go to John Hopfield and Geoffrey Hinton, "for foundational discoveries and inventions that enable machine learning with artificial neural networks." On the call connecting him to the announcement, Hinton said he had "not at all expected this." The AI community took it as a formal coronation of decades of difficult work.

Hinton's late-career stance—Nobel laureate and AI risk whistle-blower at the same time—completes the arc of Toronto's storyline: from a marginalized connectionist in the 1980s, to chief architect of the 2010s deep-learning revolution, to a public reflection in the 2020s on the possible consequences of his own work. The University of Toronto thus stands among the very few "one city, one master" legends in AI history—a single university, a single person, a single line carrying a whole stretch of global tech history.

Toronto AI as an Institution

Today the University of Toronto's Department of Computer Science and the Vector Institute together form a rare two-track structure: the university provides long-term undergraduate and graduate training and basic research; Vector contributes cross-institutional postdocs, industry partnerships, and compute infrastructure. Across King's College Circle and MaRS, less than 500 meters apart, they look out at each other.

Toronto's student lineage continues to send out top AI researchers. Beyond Sutskever, Krizhevsky, and Salakhutdinov, recent names include Nitish Srivastava (first author of Dropout; Apple), Mohammad Norouzi (Google Brain; key contributor to diffusion models), Tijmen Tieleman (proposed RMSprop), Jamie Kiros (skip-thought vectors), Aidan Gomez (co-founder and CEO of Cohere), and Ivan Zhang (co-founder of Cohere). Cohere itself, founded in 2019 by Toronto students, was valued at over 5 billion dollars in 2024—the most visible example of the startup ecosystem around Vector.

By 2026, with the resources of a mid-sized public university, Toronto has contributed to AI history on the scale of MIT, Stanford, and CMU. This is not a victory of budgets or scale; it is a victory of academic taste—betting on a neglected direction and waiting twenty years for it to bloom.

Historian's Note

Toronto AI's story is the story of a person and a city completing each other. Hinton was already 39 when he moved north in 1987, a mid-career scholar pushed to the margins of the mainstream. Toronto took him in and gave him twenty years of academic freedom; in return, AlexNet in 2012 and the Nobel Prize in Physics in 2024 wrote the city into AI history for good. Inside that story is a plain truth that gets ignored again and again: real scientific breakthroughs rarely come from quick, short-term projects. They come from a researcher and two or three students walking, for twenty years, down a road no one else respects. Today, governments and large companies set up "AI centers" and "AI institutes" with budgets in the billions; whether any of them will reproduce the Toronto model depends not on money but on whether some researcher is still allowed, under the ridicule of the mainstream, to sit on a cold bench for twenty years. Toronto's gift to the world is not just AlexNet and the Vector Institute; it is the academic patience to bet long.

Eyewitness Accounts

Call for contributions

If you have worked or studied at the University of Toronto's AI groups, the CIFAR NCAP program, or the Vector Institute, please contribute on GitHub.

References

  1. Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). "A Fast Learning Algorithm for Deep Belief Nets." Neural Computation, 18(7): 1527–1554.
  2. Hinton, G. E., & Salakhutdinov, R. R. (2006). "Reducing the Dimensionality of Data with Neural Networks." Science, 313(5786): 504–507.
  3. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). "ImageNet Classification with Deep Convolutional Neural Networks." NeurIPS 2012.
  4. Hinton, G. E., Deng, L., Yu, D., et al. (2012). "Deep Neural Networks for Acoustic Modeling in Speech Recognition." IEEE Signal Processing Magazine, 29(6).
  5. Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). "Learning Representations by Back-Propagating Errors." Nature, 323(6088): 533–536.
  6. CIFAR. Pan-Canadian Artificial Intelligence Strategy and NCAP Program Reports (2004–2017).
  7. Vector Institute. Annual Reports 2017–2024. https://vectorinstitute.ai/
  8. The Royal Swedish Academy of Sciences (2024). "The Nobel Prize in Physics 2024 — Press Release." https://www.nobelprize.org/prizes/physics/2024/press-release/
  9. Metz, C. (2021). Genius Makers: The Mavericks Who Brought AI to Google, Facebook, and the World. Dutton.
  10. Markoff, J. (2013, March 12). "Google Adds to Its Menagerie of Machines." The New York Times.
  11. Sejnowski, T. J. (2018). The Deep Learning Revolution. MIT Press.
  12. Hinton, G. E. (2023, May 1). Interviews with The New York Times and the BBC on leaving Google.

Released under CC-BY-SA 4.0