March 18, 1999
Gary F. Marcus
Department of Psychology
New York University
In a recent article in Science (Marcus, Vijayan, Bandi Rao, and Vishton, 1999), my colleagues and I argued that infants could extract algebraic rules, and we noted that certain kinds of neural networks could not generalize those rules to novel words that did not overlap in their sounds. The kind of network that we had in mind was the simple recurrent network, in its standard application as a device to solve the "prediction task."
In our experiments, infants were trained on 16 sentences such as:
"le di di", "le je je", "le li li", "le we we",
"wi di di", "wi je je", "wi li li", "wi we we",
"ji di di", "ji je je", "ji li li", "ji we we",
"de di di", "de je je", "de li li", "de we we"
They were then tested on two novel sentences that were consistent with the training sentences:
"ba po po", and "ko ga ga"
They were also tested on two novel sentences that were inconsistent with the training sentences:
"ba po ba", "ko ga ko"
Infants looked longer at the sentences that were inconsistent with the training sentences than at the sentences that were consistent with them.
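For readers who want to rebuild the stimulus set, the sentences listed above can be generated programmatically. The following is only a minimal sketch in Python, not the materials used in the experiments. The names A_WORDS, B_WORDS, abb, and aba are illustrative assumptions introduced here.

```python
from itertools import product

# First-position ("A") and repeated ("B") words from the training list above.
A_WORDS = ["le", "wi", "ji", "de"]
B_WORDS = ["di", "je", "li", "we"]

def abb(a, b):
    """Build a three-word sentence in which the second word repeats."""
    return f"{a} {b} {b}"

def aba(a, b):
    """Build a three-word sentence in which the first word repeats."""
    return f"{a} {b} {a}"

# 4 A words x 4 B words = 16 training sentences.
training = [abb(a, b) for a, b in product(A_WORDS, B_WORDS)]

# Test words share no words with the training vocabulary.
test_pairs = [("ba", "po"), ("ko", "ga")]
consistent = [abb(a, b) for a, b in test_pairs]
inconsistent = [aba(a, b) for a, b in test_pairs]

print(training)
print(consistent, inconsistent)
```

The point of the design is visible in the last few lines: the test sentences follow (or violate) the training pattern while using words that never appear in training.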
In our simulations, we tested whether an SRN trained on the prediction task could predict what comes next in an ABB sequence. In one such simulation, we used 13 localist input units (one for each word plus a punctuation marker), 13 corresponding localist output units, 40 hidden units, and 40 context units. The model correctly predicted the continuations for words on which it was trained (e.g., given "le je ___", the network could predict "le") but not for words on which it was not trained (e.g., given "ba po ___", the network predicted "li" with an activation greater than 0.5, while the activation of "ba" was less than 0.01).
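To make the architecture concrete, here is a minimal sketch of the forward pass of an Elman-style simple recurrent network with the layer sizes described above. It is written in Python with NumPy and is not the tlearn configuration we used. The vocabulary ordering, the sigmoid activation, the weight range, the zero initial context, and all names (SRN, one_hot, step) are assumptions made for illustration. The weights are random and untrained, so the outputs carry no meaning beyond showing how activation flows.

```python
import numpy as np

# 8 training words, 4 test words, and a punctuation marker: 13 symbols.
VOCAB = ["le", "wi", "ji", "de", "di", "je", "li", "we",
         "ba", "po", "ko", "ga", "."]
INDEX = {w: i for i, w in enumerate(VOCAB)}
N_IO, N_HIDDEN = len(VOCAB), 40

def one_hot(word):
    """Localist coding: one unit per word, exactly one unit active."""
    v = np.zeros(N_IO)
    v[INDEX[word]] = 1.0
    return v

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SRN:
    def __init__(self, seed=0):
        rng = np.random.default_rng(seed)
        # Small random weights (assumed range); biases start at zero.
        self.W_in = rng.uniform(-0.5, 0.5, (N_HIDDEN, N_IO))
        self.W_ctx = rng.uniform(-0.5, 0.5, (N_HIDDEN, N_HIDDEN))
        self.W_out = rng.uniform(-0.5, 0.5, (N_IO, N_HIDDEN))
        self.b_h = np.zeros(N_HIDDEN)
        self.b_o = np.zeros(N_IO)
        self.context = np.zeros(N_HIDDEN)

    def step(self, word):
        """Read one word and return activations over the 13 output units."""
        hidden = sigmoid(self.W_in @ one_hot(word)
                         + self.W_ctx @ self.context + self.b_h)
        output = sigmoid(self.W_out @ hidden + self.b_o)
        # The context units hold a copy of the hidden layer for the next step.
        self.context = hidden.copy()
        return output

net = SRN()
for w in ["ba", "po"]:
    prediction = net.step(w)

# Activation of each output unit after the network has read "ba po".
for word, act in zip(VOCAB, prediction):
    print(f"{word}: {act:.3f}")
```

In the prediction task, the target at each step is the next word in the sequence, so after training one would inspect this output vector to see which word the network expects to follow "ba po".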
We found essentially the same results using the distributed input representations used by Seidenberg and Elman in their simulation of our experiments. (A discussion of alternative versions of the simple recurrent network has been submitted to Science.)
Interested readers can replicate these two tests by downloading the tlearn network simulator from UCSD, together with the network configuration files and network training files that we used.
Further discussion, including a chapter from Marcus's forthcoming book, discussion by Elman, and a response by Marcus, can be found at the web site for The Algebraic Mind.