The Algebraic Mind

© 1998 by Gary F. Marcus. All rights reserved

Draft: Do Not Quote Without Permission

 

Introduction

 

A. The Relation Between Connectionism and Cognitive Science

What is a mind such that it can entertain an infinity of thoughts and a thought such that a mind can entertain it? One theory holds that the mind is a manipulator of symbols and rules, an idea that forms the backbone of generative linguistics and classical artificial intelligence. Lately, this theory of symbol-manipulation has come under attack. The impetus for this attack comes from a branch of the field known as connectionism.

This book is a meditation on the relation between connectionism and cognitive science, an attempt to see what connectionism can tell cognitive science and what cognitive science can tell connectionism.

Connectionist models are simulations of the mind, composed of large numbers of interconnected simple units operating largely in parallel. These simulations allow researchers to test different ways of putting together simulated brain circuits.
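For readers unfamiliar with such models, the following is a minimal sketch in Python -- purely illustrative, with function names, weights, and the choice of a logistic activation being my own assumptions rather than a description of any particular model discussed below -- of the kind of simple unit such networks are built from, and of a "layer" of such units operating in parallel:

    import math

    def unit_activation(inputs, weights, bias=0.0):
        """One simple unit: a weighted sum of its inputs, squashed to (0, 1)."""
        net = sum(i * w for i, w in zip(inputs, weights)) + bias
        return 1.0 / (1.0 + math.exp(-net))   # logistic (sigmoid) activation

    def layer_activation(inputs, weight_rows, biases):
        """A layer is just many such units operating in parallel on the same inputs."""
        return [unit_activation(inputs, w, b) for w, b in zip(weight_rows, biases)]

    # Two units reading the same two inputs, each with its own weights and bias.
    print(layer_activation([1.0, 0.0], [[0.5, -0.3], [-0.8, 1.2]], [0.1, -0.1]))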

This sort of research can play two valuable roles: on the one hand, it can constrain neuroscience from above, suggesting the kinds of circuits that neuroscientists ought to be seeking. On the other hand, it can constrain cognitive science from below, suggesting what kinds of brain assemblies might plausibly underlie higher level cognition.

Connectionism, which dates back at least to a classic paper by McCulloch and Pitts (1943), reached an initial peak of interest in the 1950s and 1960s. Enthusiasm waned after the publication of Minsky and Papert's influential book Perceptrons, which pointed out serious limitations of an early class of popular models. An influential collection of articles in 1981 signaled new enthusiasm for connectionism, culminating in the publication in 1986 of the landmark two-volume set Parallel Distributed Processing (PDP), which set the stage for connectionism's current popularity. (For a brief history of connectionism, see Bechtel and Abrahamsen, 1991; reprints of several of the most important articles in the history of connectionism can be found in Anderson and Rosenfeld, 1988.)

One reason that connectionism is so popular now is that it seems "neurally-inspired". Some connectionist researchers, for example, aim to "understand the sorts of computations that can plausibly be carried out in neural systems" -- clearly an important goal as cognitive neuroscience begins to take shape.

In what follows, I will assume that some version of connectionism must be correct, that there will ultimately be some empirically adequate way of modeling the operation of the mind in a connectionist substrate. The question here will thus not be whether we can build, say, an adequate connectionist model of English past tense formation (I assume the answer is yes), but rather, what architectural properties must adequate models have?

Much of the discussion in this monograph will contrast two different approaches to connectionism, which Pinker and Prince called "eliminative connectionism" and "implementational connectionism". At stake is whether our cognitive life is best described in terms of symbol-manipulation. The view that the mind manipulates symbols holds first of all that there are symbols -- encodings of entities such as words, concepts, or ideas -- and second that these symbols can serve as elements in a variety of computations. Implementational connectionists use connectionism as a tool for trying to understand how symbol-manipulating processes could be implemented in the brain (Barnden, 1992; Hinton, 1990; Holyoak, 19xx; Holyoak, to appear; Touretzky, 1985).

In contrast, eliminative connectionists hold that symbols and symbol-manipulating mechanisms are superfluous or at best restricted to narrow domains of cognition, such as the "conscious rule use" involved when we explicitly describe the movement of a rook in the game of chess. This more radical form of connectionism is often seen as representing a "paradigm shift"; and it is this more radical branch of connectionism that most people have in mind when they talk about connectionism.

Which of these strategies -- "eliminative connectionism" that seeks to eliminate symbol-manipulation or "implementational connectionism" that seeks to find ways of implementing symbol-manipulation in a connectionist substrate -- is likely to be most profitable?

Unfortunately, there is as yet no systematic theory of how to choose which sort of connectionist architecture one should use for a given domain of cognition. (There are some formal proofs about what broad classes of models can do, including a substantial literature about how to build connectionist Turing machines, and there even exist proofs that certain classes of models are universal function approximators. But such proofs pertain to models that may have little to do with psychology; less is known about the specific abilities and limitations of the kinds of models used in connectionist simulations of psychology; for further discussion of these proofs, see Chapter 2.)

Going hand in hand with the lack of systematic theories about what models can and cannot do and how to build them is a lack of criteria for evaluating connectionist models. What makes a good model? What makes a bad model? If a model does one thing well but cannot account for another thing at all, is it a good model or a bad model? What, if anything, would count as evidence for a given model? Against a given model?

B. Preview

What I aim to do in this monograph is to develop a more systematic account of what is needed to build an adequate connectionist model of cognition, by providing a kind of lower bound on which computational elements are a necessary part of language and higher-level cognition.

My focus will be on connectionist models of language and cognition, rather than on connectionist models of perception and action, because language and cognition are the domains most often described in terms of symbol-manipulation. If symbol-manipulation played no role in these domains, it would be unlikely to play a role in other domains. Of course, it is quite possible that the elements that subserve cognition and language differ from the elements that subserve some aspects of perception and action: symbol-manipulation might play a role in cognition and language but not in perception and action.

To preview, I will propose that -- contra the goals of eliminative connectionism -- the following four elements are among those that are fundamental to higher-level cognition and language:

• Symbols, to be explained in Chapter 4, are entities that encode all members of an equivalence class in a common way. Symbols can represent either individuals such as Donald Duck or categories such as duck or cartoon character. Symbols can refer to atomic elements or to complex combinations. For instance, the syntactic role of noun phrase can be satisfied by a single word (novels) or by a more complex phrase (the tattered but well-read Mark Twain novels). From the perspective of many syntactic rules, these noun phrases are equivalent -- and thus can occur in the same contexts -- because each bears the symbol noun phrase.

• Rules that describe relationships between placeholders that I will call abstract variables (Chapter 5). These entities allow us to learn and represent relationships that hold for all members of some class, and so to express generalizations compactly. Rather than specifying individually that Daffy likes to swim, Donald likes to swim, and so forth, we can describe a generalization that does not make reference to any specific duck, thereby using the type duck as an implicit variable. In this way, variables act as placeholders for arbitrary members of a category. Another key benefit of rules and variables is that they allow us to extend generalizations to novel members of a category. For instance, a name can appear as a subject, an object, or an indirect object. Hearing a name in any of these positions thus licenses its occurrence in the other positions: the grammaticality of Bftsplk loves Mary entails the grammaticality of Mary loves Bftsplk and Mary gave flowers to Bftsplk. (The brief sketch following this list illustrates the contrast between such rules and item-by-item memorization, as well as the type/token distinction discussed below.)

• Ways of representing and building complex, structured units out of simple units, and rules and representational formats that permit those complex units to serve as elements in still more complex units (Chapter 6). We can pick out a cup, a cup that is on the table, a cup that is on the table that is in the dining room, and so forth.

• An internally-represented distinction between types (classes) and tokens (instances of those classes). For instance, Daffy is an instance of the type duck; at the same time, the type duck is not equivalent to the instance Donald (or to the set of all actual ducks). The distinction between tokens and types makes it natural to trace the identity of individuals through time. In Chapter 7 I will argue that we have a way of representing this distinction internally.
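To make the second and fourth of these elements concrete, the following is a minimal sketch in Python -- my own illustration, not drawn from any particular symbolic or connectionist model -- of a rule stated over an abstract variable, contrasted with item-by-item memorization, and of the type/token distinction. The names (Token, likes_to_swim_rule, and so on) are invented for the example.

    from dataclasses import dataclass

    @dataclass
    class Token:
        """A token: a particular individual, labeled with the type it instantiates."""
        name: str
        type: str          # e.g. "duck"

    # Item-by-item memorization: facts stored about specific individuals only.
    memorized_swimmers = {"Daffy", "Donald"}

    def likes_to_swim_memorized(individual: Token) -> bool:
        return individual.name in memorized_swimmers

    # A rule stated over an abstract variable: "for any x, if x is a duck,
    # then x likes to swim." Because x is a placeholder for an arbitrary
    # member of the category, the rule extends to novel ducks automatically.
    def likes_to_swim_rule(x: Token) -> bool:
        return x.type == "duck"

    howard = Token("Howard", "duck")        # a novel duck, never encountered before
    print(likes_to_swim_memorized(howard))  # False: no stored fact about Howard
    print(likes_to_swim_rule(howard))       # True: the rule covers any duck

    # Type/token distinction: two tokens of the same type are distinct
    # individuals, and neither is identical to the type itself.
    daffy_today = Token("Daffy", "duck")
    daffy_tomorrow = Token("Daffy", "duck")
    print(daffy_today is daffy_tomorrow)             # False: distinct tokens
    print(daffy_today.type == daffy_tomorrow.type)   # True: same type

Nothing in this sketch is meant as a claim about how such machinery is implemented in the brain; it is only meant to pin down what a rule over a variable, and a type/token distinction, amount to computationally.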

My suggestion is not that connectionism cannot implement these elements, but precisely the opposite: connectionism can implement these elements -- and my view is that figuring out the best way to do so should be connectionism’s top priority. My argument calls not for the elimination of connectionism, but for a reshaping of its agenda.
