Coping with Variation in Speech Segmentation

Morten H. Christiansen & Joseph Allen

Program in Neural, Informational and Behavioral Sciences
University of Southern California
Los Angeles, CA 90089-2520

morten@gizmo.usc.edu - joeallen@gizmo.usc.edu

Abstract

One of the first tasks that a child faces in language acquisition is the segmentation of the speech stream into words. The lack of an acoustic equivalent of the white spaces in written text makes this a nontrivial task. Recent computational models of early speech segmentation have addressed this problem by integrating multiple probabilistic cues. Models by Aslin, Woodward, LaMendola & Bever (1996) and Brent & Cartwright (1996) achieved a good level of performance using a combination of phonology and utterance boundary information. Christiansen, Allen & Seidenberg (in press) showed that combining these two cues with information about lexical stress resulted in improved performance. However, the input to these models abstracted away from many important aspects of real language. The question remains as to how such computational models will fare when exposed to input more closely approximating the variation characteristic of actual speech.

Working from the insight that much of this variation is systematic, we present an investigation of the effects of variation on a connectionist model of infant speech segmentation. One type of variation is coarticulation, where segments vary on the basis of surrounding material. Previous computational models have used corpora in which every instance of a particular word always had the same phonological form. In contrast, we employ a phonetically transcribed corpus in which the phonological form of a word varies with its context.

Another way in which the input signal varies is that individual instances of what is transcribed as the same phoneme actually vary considerably in their acoustic realization. Earlier models, such as Cairns, Shillcock, Chater & Levy (1994), modeled this variation by flipping random features with a certain probability. However, acoustic realization does not vary randomly; rather, for any segment certain features are more susceptible to change than others. Taking these ideas into account, we introduce a novel approach to modeling such segmental variation.
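The contrast between uniform and feature-specific variation can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the feature names, the flip probabilities, and the function names are hypothetical and are not taken from the model described in this abstract or from Cairns et al. (1994).

```python
import random

# Hypothetical binary features for one segment (names are illustrative only).
FEATURES = ["voiced", "nasal", "continuant", "coronal"]

def flip_uniform(segment, p, rng):
    """Flip every feature with the same probability p (uniform random noise)."""
    return [1 - v if rng.random() < p else v for v in segment]

def flip_weighted(segment, probs, rng):
    """Flip each feature with its own probability, so that some features
    are more susceptible to change than others."""
    if len(segment) != len(probs):
        raise ValueError("need one probability per feature")
    return [1 - v if rng.random() < p else v for v, p in zip(segment, probs)]

rng = random.Random(0)
segment = [1, 0, 0, 1]  # one value per entry in FEATURES

# Assumed susceptibilities: here voicing changes often and nasality never does.
susceptibility = [0.3, 0.0, 0.1, 0.05]

noisy_uniform = flip_uniform(segment, 0.1, rng)
noisy_weighted = flip_weighted(segment, susceptibility, rng)
```

With uniform flipping, every feature is equally likely to be corrupted. With per-feature probabilities, the noise follows a fixed profile, which is one simple way to express the idea that variation is systematic rather than random.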

We present results from simulations involving simple recurrent networks trained on input consisting of segmental features, utterance boundary information, and lexical stress. The results show that our model performs well on the segmentation task despite being faced with input characterized by considerable variation. This outcome is important because it shows that networks provide a robust mechanism for the integration of multiple cues even under less idealized conditions, and it illustrates how such integration may form the basis of early speech segmentation. Similarities between the segmentation task and other aspects of language acquisition suggest that this notion of integration may also be usefully applied to the investigation of learning in other linguistic domains.
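For readers unfamiliar with simple recurrent networks, the following Python sketch shows a single forward pass through an Elman-style network whose input combines the three cue types. It is only a sketch under stated assumptions: the layer sizes, the input layout (four segmental features, one utterance-boundary unit, one stress unit), the weight range, and the class name are all hypothetical, and training is omitted entirely.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class SimpleRecurrentNetwork:
    """Elman-style network: the hidden layer receives the current input plus
    a copy of its own previous state (the context units)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            # Small random weights; the range is an arbitrary assumption.
            return [[rng.uniform(-0.25, 0.25) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = init(n_hidden, n_in)       # input -> hidden
        self.w_ctx = init(n_hidden, n_hidden)  # context -> hidden
        self.w_out = init(n_out, n_hidden)     # hidden -> output
        self.context = [0.0] * n_hidden

    def step(self, x):
        """Process one input vector and return the output activations."""
        hidden = [
            sigmoid(sum(w * xi for w, xi in zip(w_row, x))
                    + sum(w * c for w, c in zip(c_row, self.context)))
            for w_row, c_row in zip(self.w_in, self.w_ctx)
        ]
        self.context = hidden  # copied back for the next time step
        return [sigmoid(sum(w * h for w, h in zip(row, hidden)))
                for row in self.w_out]

# Assumed input layout: 4 segmental features, 1 boundary unit, 1 stress unit.
net = SimpleRecurrentNetwork(n_in=6, n_hidden=8, n_out=6)
utterance = [
    [1, 0, 0, 1, 0, 1],  # stressed segment
    [0, 1, 0, 0, 0, 0],  # unstressed segment
    [0, 0, 0, 0, 1, 0],  # utterance boundary marker
]
outputs = [net.step(x) for x in utterance]
```

Because the context units carry the previous hidden state forward, the output for a given segment depends on the segments that preceded it. This is the property that lets such a network pick up on sequential regularities in the input.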


References

Aslin, R. N., Woodward, J. Z., LaMendola, N. P., & Bever, T. G. (1996). Models of word segmentation in fluent maternal speech to infants. In J. L. Morgan & K. Demuth (Eds.), Signal to Syntax, pp. 117-134. Mahwah, NJ: Lawrence Erlbaum Associates.

Brent, M. R., & Cartwright, T. A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61, 93-125.

Cairns, P., Shillcock, R., Chater, N., & Levy, J. (1994). Lexical segmentation: The role of sequential statistics in supervised and un-supervised models. In Proceedings of the 16th Annual Conference of the Cognitive Science Society, pp. 136-141. Hillsdale, NJ: Lawrence Erlbaum Associates.

Christiansen, M. H., Allen, J., & Seidenberg, M. S. (in press). Learning to Segment Speech Using Multiple Cues: A Connectionist Model. Language and Cognitive Processes.
