Christiansen, M.H. & Curtin, S.L. (1999a). Transfer of learning: Rule acquisition or statistical learning? Trends in Cognitive Sciences, 3, 289-290.

Transfer of learning: rule acquisition or statistical learning?



Morten H. Christiansen & Suzanne Curtin

Thirty years ago, Arthur Reber demonstrated that adults show evidence of transfer of learning in artificial-language experiments in which the surface vocabulary is changed between training and test items (1). In a series of experiments, Marcus and colleagues (2) demonstrated that infants as young as seven months old also show evidence of transfer of learning, but incorrectly concluded that the infants were extracting abstract algebraic rules rather than encoding statistical regularities. In contrast, a recent comprehensive review of the artificial-language-learning literature has demonstrated that transfer does not entail the involvement of abstract rules (3).

In Marcus et al.’s most persuasive demonstration of transfer of learning (Experiment 3 in Ref. 2), the infants were first trained on syllable sequences that followed either an AAB or ABB pattern (e.g. ‘le-le-je’ versus ‘le-je-je’). The infants were then presented with sequences of novel syllables, either consistent or inconsistent with the training pattern. The infants showed a preference for the inconsistent items, thus demonstrating transfer between the different syllable vocabularies used in habituation and testing. Because there was no phonological overlap between training items and test items, Marcus et al. concluded that a statistical learning device could not account for these transfer results without implementing algebraic-like rules (see also the responses by Marcus (4,5) to commentaries by Seidenberg and Elman (6) and by McClelland and Plaut (7)). However, we suggest that statistical knowledge acquired in the service of learning to segment fluent speech into words might provide the basis for these transfer effects, in much the same way as knowledge acquired in the process of learning to read can be used to perform experimental tasks such as lexical decision.
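The structure of the habituation and test items described above can be illustrated with a minimal sketch. Only the syllables ‘le’ and ‘je’ come from the example in the text; the remaining syllables, the vocabulary sizes and the function names are illustrative assumptions and do not reproduce the actual stimulus set of Ref. 2.

```python
from itertools import product

def make_items(pattern, a_syllables, b_syllables):
    """Build every three-syllable item for a pattern such as 'AAB' or 'ABB'."""
    items = []
    for a, b in product(a_syllables, b_syllables):
        slot = {"A": a, "B": b}
        items.append("-".join(slot[symbol] for symbol in pattern))
    return items

def pattern_of(item):
    """Recover the abstract pattern of an item by comparing syllable identities."""
    first, second, third = item.split("-")
    if first == second != third:
        return "AAB"
    if first != second == third:
        return "ABB"
    return "other"

# Hypothetical vocabularies: training and test sets share no syllables.
TRAIN_A, TRAIN_B = ["le", "wi"], ["je", "di"]
TEST_A, TEST_B = ["ba", "ko"], ["po", "ga"]

habituation = make_items("AAB", TRAIN_A, TRAIN_B)   # training pattern
consistent = make_items("AAB", TEST_A, TEST_B)      # same pattern, novel syllables
inconsistent = make_items("ABB", TEST_A, TEST_B)    # other pattern, novel syllables
```

Because the two vocabularies are disjoint, the consistent and inconsistent test items cannot be told apart by familiarity with individual syllables; whatever distinguishes them must be carried over from habituation in some other form.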

Using an existing simple recurrent network (8) model of early infant speech segmentation (9) (Fig. 1), we tested this suggestion and successfully modeled the Marcus et al. results (10). Importantly, no modifications were made to the original model, which learned to segment speech by integrating different kinds of probabilistic information derived from the speech stream (phonology, lexical stress and utterance-boundary information). Moreover, the simulation closely replicated the experimental conditions during both habituation and testing. The internal representations of the model were recorded at the end of each test item, and submitted to a two-group discriminant analysis. The results showed that these internal representations incorporated sufficient information to distinguish reliably between items that were either consistent or inconsistent with the habituation stimuli. Further analyses of the model’s word-segmentation performance revealed that the model was better at segmenting out the words in the inconsistent items. This would make the inconsistent items more salient and therefore explain why the infants preferred these to the consistent items. Thus the transfer effects that Marcus et al. report can be readily accounted for by assuming that the infants’ behavioral responses are based on statistical learning, similar to the above connectionist model.
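To make concrete what ‘recording the internal representations at the end of each test item’ involves, the following is a minimal sketch of the forward pass of an Elman-style simple recurrent network. It is not the model of Refs 9 and 10: the weights are random and untrained, the output layer and learning procedure are omitted, and the one-hot syllable coding with a single boundary unit is an assumed stand-in for the phonological, stress and utterance-boundary inputs described above. All names and sizes are hypothetical.

```python
import math
import random

class SimpleRecurrentNetwork:
    """Elman-style network: the hidden layer receives the current input plus
    a copy of its own previous activations (the context units)."""

    def __init__(self, n_input, n_hidden, seed=0):
        rng = random.Random(seed)

        def weights(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.n_hidden = n_hidden
        self.w_in = weights(n_hidden, n_input)        # input -> hidden
        self.w_context = weights(n_hidden, n_hidden)  # context -> hidden

    def final_hidden_state(self, sequence):
        """Feed a sequence of input vectors; return the hidden state after the last one."""
        context = [0.5] * self.n_hidden  # neutral starting context
        for vector in sequence:
            hidden = []
            for j in range(self.n_hidden):
                net = sum(w * x for w, x in zip(self.w_in[j], vector))
                net += sum(w * c for w, c in zip(self.w_context[j], context))
                hidden.append(1.0 / (1.0 + math.exp(-net)))  # logistic activation
            context = hidden  # copy hidden activations back for the next step
        return context

def encode(item, vocabulary):
    """One-hot code each syllable; the extra last unit flags the item-final boundary."""
    syllables = item.split("-")
    vectors = []
    for position, syllable in enumerate(syllables):
        vector = [0.0] * (len(vocabulary) + 1)
        vector[vocabulary.index(syllable)] = 1.0
        if position == len(syllables) - 1:
            vector[-1] = 1.0
        vectors.append(vector)
    return vectors

VOCAB = ["ba", "po"]  # hypothetical test syllables
net = SimpleRecurrentNetwork(n_input=len(VOCAB) + 1, n_hidden=4)
state_aab = net.final_hidden_state(encode("ba-ba-po", VOCAB))
state_abb = net.final_hidden_state(encode("ba-po-po", VOCAB))
```

Vectors such as state_aab and state_abb, collected over all test items and labeled as consistent or inconsistent, are the kind of data that a two-group discriminant analysis would take as input. The sketch says nothing about whether the groups are separable; that depends on what the trained model has learned.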

All too often statistical-learning approaches – including connectionist models – are forced into a behavioristic mold in which only input–output relations are said to matter. However, the proponents of connectionist-style statistical learning have also taken part in the cognitive revolution and therefore posit internal representations mediating between input and output. Indeed, the internal representations of the above model provided a crucial source of information for the modeling of the infants’ behavior in the Marcus et al. study. Another oversight of the critics of connectionism relates to the importance of integrating multiple sources of information within a single statistical-learning device (9,12). It was this kind of information integration that enabled the above model to explain the infants’ preference for the inconsistent items, because its performance did not rely only on phonological information. Thus, a more sophisticated approach to statistical learning and connectionist modeling is needed to reveal their true power. Once such an approach is adopted, it becomes clear that the impressive learning abilities of the infants in the Marcus et al. study do not require the postulation of abstract algebraic rules.

References

1 Reber, A.S. (1969) Transfer of syntactic structure in synthetic languages J. Exp. Psychol. 81, 115–119

2 Marcus, G.F. et al. (1999) Rule learning in seven-month-old infants Science 283, 77–80

3 Redington, M. and Chater, N. (1996) Transfer in artificial grammar learning: a reevaluation J. Exp. Psychol. Gen. 125, 123–138

4 Marcus, G.F. (1999) Do infants learn grammar with algebra or statistics? Response to Seidenberg & Elman, Negishi, and Eimas Science 284, 436–437

5 Marcus, G.F. (1999) Connectionism: with or without rules? Response to J.L. McClelland and D.C. Plaut (1999) Trends Cognit. Sci. 3, 168–170

6 Seidenberg, M.S. and Elman, J.L. (1999) Do infants learn grammar with algebra or statistics? Letter Science 284, 434–435

7 McClelland, J.L. and Plaut, D.C. (1999) Does generalization in infant learning implicate abstract algebra-like rules? Trends Cognit. Sci. 3, 166–168

8 Elman, J.L. (1990) Finding structure in time Cognit. Sci. 14, 179–211

9 Christiansen, M.H., Allen, J. and Seidenberg, M.S. (1998) Learning to segment using multiple cues: a connectionist model Lang. Cognit. Processes 13, 221–268

10 Christiansen, M.H. and Curtin, S.L. The power of statistical learning: no need for algebraic rules Proc. 21st Annu. Conf. Cognit. Sci. Soc. (Mahwah, NJ), Erlbaum (in press) (a Web-based version can be found at http://cnl.psych.cornell.edu/abstracts/no_need_algebraic.html)

11 Pinker, S. (1999) Out of the minds of babes Science 283, 40–41

12 Redington, M. and Chater, N. (1997) Probabilistic and distributional approaches to language acquisition Trends Cognit. Sci. 1, 273–281
