In the past 10 years, the best-performing artificial-intelligence systems — such as the speech recognisers on smartphones or Google’s latest automatic translator — have resulted from a technique called “deep learning.”
Deep learning is in fact a new name for an approach to artificial intelligence called neural networks, which have been going in and out of fashion for more than 70 years. Modelled loosely on the human brain, a neural net consists of thousands or even millions of simple processing nodes that are densely interconnected. Most of today’s neural nets are organised into layers of nodes, and they’re “feed-forward,” meaning that data moves through them in only one direction. An individual node might be connected to several nodes in the layer beneath it, from which it receives data, and several nodes in the layer above it, to which it sends data.
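To make the one-directional flow of data concrete, here is a minimal sketch of a feed-forward pass in plain Python. It is an illustration under stated assumptions, not a description of any particular system: the weights, biases, layer sizes, and the sigmoid squashing function are all hypothetical choices made for the example, and the function names `layer_forward` and `feed_forward` are invented here.

```python
import math

def sigmoid(x):
    # Squashes any real number into the range (0, 1). Assumed for
    # illustration; real networks use a variety of such functions.
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    # Each node computes a weighted sum of the values it receives from
    # the layer beneath it, adds a bias, and passes the result upward.
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def feed_forward(inputs, layers):
    # Data moves through the layers in one direction only.
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical tiny network: 3 inputs -> 2 hidden nodes -> 1 output node.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                              # output layer
]
output = feed_forward([1.0, 0.0, 1.0], layers)
```

Each inner list of weights belongs to one node, so the number of inner lists sets the width of a layer and the number of entries in `layers` sets the depth of the network.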
Neural networks were first proposed in 1944 by Warren McCulloch and Walter Pitts, two University of Chicago researchers who moved to MIT in 1952 as founding members of what’s sometimes called the first cognitive science department. Neural nets were a major area of research in both neuroscience and computer science until 1969. The technique then enjoyed a resurgence in the 1980s, fell into eclipse again in the first decade of the new century, and has returned like gangbusters in the second, fuelled largely by the increased processing power of graphics chips.
Neural nets are a means of doing machine learning, in which a computer learns to perform some task by analysing training examples. Usually, the examples have been hand-labelled in advance. An object recognition system, for instance, might be fed thousands of labelled images of cars, houses, coffee cups, and so on, and it would find visual patterns in the images that consistently correlate with particular labels.
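The idea of learning from labelled examples can be sketched with a single artificial node trained by the classic perceptron rule. This is a deliberately simplified stand-in, not the training method used by modern deep networks, and everything specific in it is an assumption: the two made-up features, the six hand-labelled examples, the learning rate, and the function names `train_perceptron` and `predict`.

```python
def train_perceptron(examples, epochs=20, lr=0.1):
    # Start with no knowledge: all weights and the bias are zero.
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            prediction = 1 if activation > 0 else 0
            # error is 0 when the guess matches the label, otherwise +1 or -1.
            error = label - prediction
            # Nudge the weights toward the correct answer.
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias

def predict(weights, bias, features):
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0

# Hypothetical hand-labelled data: two invented features per image,
# label 1 = "coffee cup", label 0 = "car".
examples = [
    ([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.7, 0.7], 1),
    ([0.2, 0.1], 0), ([0.1, 0.3], 0), ([0.3, 0.2], 0),
]
weights, bias = train_perceptron(examples)
```

After training, `predict(weights, bias, [0.85, 0.75])` can be used to classify a new, unlabelled example. A real object recognition system works on raw pixels and many layers of nodes, but the loop has the same shape: guess, compare with the label, adjust.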
The recent resurgence in neural networks — the deep-learning revolution — comes courtesy of the computer-game industry. The complex imagery and rapid pace of today’s video games require hardware that can keep up, and the result has been the graphics processing unit (GPU), which packs thousands of relatively simple processing cores on a single chip. It didn’t take long for researchers to realise that the architecture of a GPU is remarkably like that of a neural net.
Modern GPUs enabled the one-layer networks of the 1960s and the two- to three-layer networks of the 1980s to blossom into the 10-, 15-, even 50-layer networks of today. That’s what the “deep” in “deep learning” refers to — the depth of the network’s layers. And currently, deep learning is responsible for the best-performing systems in almost every area of artificial-intelligence research.
The networks’ opacity is still unsettling to theorists, but there’s headway on that front, too. In addition to directing the Centre for Brains, Minds, and Machines (CBMM), Tomaso Poggio leads the centre’s research program in Theoretical Frameworks for Intelligence. Recently, Poggio and his CBMM colleagues have released a three-part theoretical study of neural networks.
The first part, which was published last month in the International Journal of Automation and Computing, addresses the range of computations that deep-learning networks can execute and when deep networks offer advantages over shallower ones. Parts two and three, which have been released as CBMM technical reports, address the problems of global optimisation, or guaranteeing that a network has found the settings that best accord with its training data, and overfitting, or cases in which the network becomes so attuned to the specifics of its training data that it fails to generalise to other instances of the same categories.