Building Machines that Learn and Think Like People (pt 1. Introduction and History)
23 Dec 2017 | review
tags: cognitive-science, machine-learning, brain, deep-learning
Note: Ideas and opinions that are my own, rather than the article's, will appear in italicized grey.
Series Table of Contents
Part 1: Introduction and History
Part 2: Challenges for Building Human-Like Machines
Part 3: Developmental Software
Part 4: Learning as Rapid Model-Building
Part 5: Thinking Fast
Article Table of Contents
- Motivation for series
- History of Brain-Inspiration in AI
Motivation for series
(Feel free to skip)
This is part 1 in a series of blog posts in which I plan to summarize the fascinating (but lengthy) Building Machines that Learn and Think Like People by Lake et al. This paper discusses how current deep learning models (glossary), despite their success and frequent comparison to the brain, do not learn the way brains do in many respects. The authors offer a set of “key ingredients” that might allow neural networks to learn and think more like brains do.
I’ve wanted to read this paper for some time. One of my central goals as an aspiring brain and machine learning researcher is to build human-inspired AI. As I’m very junior in the field, I thought this paper would give me a lot of insight into how to go about doing that. I was finally pushed into reading it when I discovered that, along with this paper, the journal Behavioral and Brain Sciences published 27 promising commentaries! Among the ones I’m most excited to read next are:
- Building machines that learn and think for themselves by DeepMind
- Building on prior knowledge without building it in by McClelland et al.
- Ingredients of intelligence: From classic debates to an engineering roadmap, a meta-response by Lake et al.
I encourage people to discuss ideas and ask questions in the comments section. A lot of research is coming out in cognitive science, neuroscience, artificial intelligence, and their intersection, and I would love for this to turn into a dialogue on these topics!
The purpose of this series is to highlight the challenges with building machines that learn and think like people. As such, I will skip aspects of the paper that generally review deep learning. Please feel free to read the paper for that material. The key idea: thanks to tremendous skill in pattern recognition, deep neural networks have achieved state-of-the-art performance in numerous domains including
- computer vision (e.g. learning to detect objects in images with complex scenes (Krizhevsky et al., 2012))
- speech modeling (e.g. learning to produce human-like speech (Oord et al., 2016)), and
- complex control problems (e.g. learning to play Atari video games without embedded knowledge of the video-game structure (Mnih et al., 2015)).
While neural networks perform very well on many tasks, they have limitations. For example, they often must be trained on tremendous quantities of data. Additionally, they are not known to generalize knowledge well to different tasks. This is in part because they (at least, in their current form) rely on statistical pattern recognition: they essentially learn to notice patterns through thousands to millions of examples. An alternative, which Lake et al. (2016) suggest is a key ingredient of human learning, is a model-building approach. They argue that intelligent cognition relies on building and using causal models (glossary) to understand, explain, simulate, and predict the world. Despite this contrast, the two approaches are certainly not mutually exclusive, and machines could benefit synergistically from combining them.
The authors maintain that while they are critical of neural networks, they see them as somewhat fundamental to human-like learning machines. This is partly because any computational model of human learning must ultimately be grounded in the brain’s biological neural networks. However, the authors believe that future generations of neural networks will look very different from the current state of the art.
I support this. The neural networks we use are crude abstractions of our currently incomplete and incorrect models of biological neural networks. For example, neuroscientists (and especially AI researchers) have long modeled neurons as single excitable units: whether a neuron fires is treated as a function of the electric signal it receives from its dendrites. For more on this perspective, see this introduction. However, physicists have recently reported that a neuron is not a single excitable unit but a collection of excitable units (Sardi et al., 2017). Further, each excitable unit is sensitive to the direction from which the input signal originates (i.e. the direction of the attached dendrite). This will potentially require a dramatic reformulation of artificial neural networks and will likely spur much research.
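To make the contrast concrete, here is a minimal toy sketch in Python. It is only an illustration under my own assumptions, not the model from Sardi et al. (2017): the function names, the threshold value, and the rule that the neuron fires when any one unit crosses its threshold are all hypothetical simplifications.

```python
from typing import Sequence


def point_neuron_fires(inputs: Sequence[float], threshold: float = 1.0) -> bool:
    """Classic abstraction: sum every dendritic input, then apply one threshold."""
    return sum(inputs) >= threshold


def multi_unit_neuron_fires(
    inputs_by_direction: Sequence[Sequence[float]], threshold: float = 1.0
) -> bool:
    """Toy alternative: one independent threshold unit per input direction.

    Assumption for illustration only: inputs are summed within a direction,
    never across directions, and the neuron fires if any unit crosses threshold.
    """
    return any(sum(group) >= threshold for group in inputs_by_direction)


# The same total input (1.2), split across two hypothetical dendritic directions.
left, right = [0.6], [0.6]

point = point_neuron_fires(left + right)        # True: 1.2 crosses the single threshold
multi = multi_unit_neuron_fires([left, right])  # False: neither 0.6 crosses on its own

print(point, multi)
```

Under these assumptions the two abstractions disagree on identical input, which is the kind of difference that would matter if artificial neurons were reformulated along these lines.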
The main contribution of this paper is its suggestion of “key ingredients” for building machines that learn and think like people. Defining and motivating these ingredients makes up a majority of the paper, so I will make each broad category its own article in this series:
- “Developmental Software”: intuitive theories of the world that we learn at an early age, such as intuitive theories of physics and psychology (e.g., with physics, we quickly learn that solid objects cannot go through each other),
- “Model Building”: the ability to build causal models of the world via methods such as compositionality (glossary) and learning-to-learn (glossary), and
- “Thinking quickly”: the ability to quickly do inference (glossary) and prediction by combining model-free and model-based algorithms (glossary).
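For readers unfamiliar with the model-free versus model-based distinction in the last ingredient, the sketch below contrasts the two on a tiny problem. It is my own toy example, not something from the paper: the 3-state chain, the `step` function, and all parameter values are hypothetical. It shows the two kinds of algorithm side by side and does not attempt the combination the authors argue for.

```python
import random

# Toy 3-state chain: states 0, 1, 2. State 2 is terminal and entering it pays 1.
# Actions: 0 = stay, 1 = move right. Everything here is a hypothetical example.
N_STATES, GOAL, GAMMA = 3, 2, 0.9
ACTIONS = (0, 1)


def step(state, action):
    """Environment dynamics: return (next_state, reward)."""
    next_state = min(state + action, GOAL)
    reward = 1.0 if next_state == GOAL and state != GOAL else 0.0
    return next_state, reward


def model_based_values(n_sweeps=50):
    """Model-based: plan by value iteration, querying the known model `step`."""
    values = [0.0] * N_STATES  # the terminal state keeps value 0
    for _ in range(n_sweeps):
        for s in range(GOAL):
            outcomes = (step(s, a) for a in ACTIONS)
            values[s] = max(r + GAMMA * values[s2] for s2, r in outcomes)
    return values


def model_free_q(n_episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    """Model-free: Q-learning that only ever sees sampled transitions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(n_episodes):
        s = 0
        while s != GOAL:
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)                       # explore
            else:
                a = max(ACTIONS, key=lambda act: q[s][act])   # exploit
            s2, r = step(s, a)
            target = r + GAMMA * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q


planned = model_based_values()
learned = model_free_q()
print("model-based state values:", planned)
print("model-free greedy values:", [max(row) for row in learned])
```

The planner reaches its answer in a handful of sweeps because it can query the model directly, while the Q-learner needs many episodes of trial and error to arrive at roughly the same values. That trade-off between cheap, habitual estimates and slower, structured planning is what makes combining the two attractive.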
History of Brain-Inspiration in AI
Scientists such as Alan Turing have long thought that AI could be informative to or descriptive of cognition (Turing, 1950). In fact, Turing held a behaviorist view of learning reminiscent of a popular modern view that almost everything can be learned from the statistical patterns of sensory inputs.
Cognitive scientists repudiated this view of cognition and instead assumed that human knowledge representation was symbolic (glossary) in nature. They argued that many functions of cognition such as language and planning could be understood in terms of symbolic operations. This is more in line with a “model-based” approach, since it uses an explicitly structured representation.
Somewhat complementary to both, another school of thought (which would become the basis for deep learning) believed in sub-symbolic (glossary) distributed representations (glossary) of knowledge produced by parallel distributed processing (PDP) systems (Rumelhart & McClelland, 1986). Proponents of this view argued that many classic symbolic forms of knowledge such as graphs and grammars (production rules for strings) were useful but misleading for characterizing thought. Even if they were manifest, they were more likely emergent epiphenomena than fundamental in their own right (McClelland et al., 2010).
Researchers of PDP and neural networks showed that this method of distributed representation learning could, with minimal constraints and inductive biases (glossary), learn structured knowledge representations given enough data. They showed that models can be trained to emulate the rule-like and structured behaviors that characterize cognition (Mnih et al., 2015). More recently, and perhaps more strikingly, researchers have found that the representations learned by artificial neural networks can predict the neural response patterns in the human and macaque cortex (Yamins et al., 2013). That is, representations learned by generic neural networks seem to align with primate representations.
Modern neural networks fed large amounts of data for pattern recognition tasks have been shown to learn representations reminiscent of those learned or used by humans. But how far towards truly human-like learning and thinking can we go by simply feeding large amounts of data to generic neural networks?
- Sardi, S., Vardi, R., Sheinin, A., Goldental, A., & Kanter, I. (2017). New Types of Experiments Reveal that a Neuron Functions as Multiple Independent Threshold Units. Sci. Rep., 7(1), 18036.
- Lake, B. M., Ullman, T. D., Tenenbaum, J. B., & Gershman, S. J. (2016). Building Machines That Learn and Think Like People. The Behavioral and Brain Sciences, 40, 1–101.
- Botvinick, M., Barrett, D. G. T., Battaglia, P., de Freitas, N., Kumaran, D., Leibo, J. Z., Lillicrap, T., Modayil, J., Mohamed, S., Rabinowitz, N. C., Rezende, D. J., Santoro, A., Schaul, T., Summerfield, C., Wayne, G., Weber, T., Wierstra, D., Legg, S., & Hassabis, D. Building machines that learn and think for themselves. Behavioral and Brain Sciences, 40.
- Hansen, S. S., Lampinen, A. K., Suri, G., & McClelland, J. L. Building on prior knowledge without building it in. Behavioral and Brain Sciences, 40. https://doi.org/10.1017/S0140525X17000176
- Lake, B. M., Ullman, T. D., Tenenbaum, J. B., & Gershman, S. J. Ingredients of intelligence: From classic debates to an engineering roadmap. Behavioral and Brain Sciences, 40. https://doi.org/10.1017/S0140525X17001224
- Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533.
- Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. NIPS.
- Chung, J., Kastner, K., Dinh, L., Goel, K., Courville, A., & Bengio, Y. (2016). A Recurrent Latent Variable Model for Sequential Data. ArXiv.org.
- Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.
- Rumelhart, D. E., & McClelland, J. L. (1986). Parallel Distributed Processing. MIT Press.
- McClelland, J. L., Botvinick, M. M., Noelle, D. C., Plaut, D. C., Rogers, T. T., Seidenberg, M. S., & Smith, L. B. (2010). Letting structure emerge: connectionist and dynamical systems approaches to cognition. Trends in Cognitive Sciences, 14(8), 348–356.
- Yamins, D. L., Hong, H., Cadieu, C., & DiCarlo, J. J. (2013). Hierarchical Modular Optimization of Convolutional Networks Achieves Representations Similar to Macaque IT and Human Ventral Stream. NIPS, 3093–3101.
- Oord, A. van den, Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., & Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.