Coping with Variation in Speech Segmentation
Morten H. Christiansen & Joseph Allen
Program in Neural, Informational and Behavioral Sciences
University of Southern California
Los Angeles, CA 90089-2520
morten@gizmo.usc.edu - joeallen@gizmo.usc.edu
Abstract
One of the first tasks that a child is faced with in language
acquisition is the segmentation of the speech stream into words. The
lack of the acoustic equivalents of white spaces in written text makes
this a nontrivial task. Recent computational models of early speech
segmentation have utilized the integration of multiple probabilistic
cues to address this problem. Models by Aslin, Woodward, LaMendola &
Bever (1996) and Brent & Cartwright (1996) achieved a good level
of performance using a combination of phonology and utterance boundary
information. Christiansen, Allen & Seidenberg (in press) showed that
combining these two cues with information about lexical stress
resulted in improved performance. However, the input to these models
abstracted away from many important aspects of real language. The
question remains as to how such computational models will fare when
exposed to input more closely approximating the variation
characteristic of actual speech.
Working from the insight that much of this variation is systematic, we
present an investigation of the effects of variation on a
connectionist model of infant speech segmentation. One type of
variation is coarticulation, where segments vary on the basis of
surrounding material. Previous computational models have used corpora
in which every instance of a particular word always had the same
phonological form. In contrast, we employ a phonetically transcribed
corpus in which the phonological form of a word varies with its
context.
Another way in which the input signal varies is that
individual segments of what is transcribed as the same phoneme
actually vary considerably in their acoustic realization. Earlier
models, such as Cairns, Shillcock, Chater & Levy (1994), modeled this
variation by flipping random features with a certain
probability. However, acoustic realization does not
vary randomly; rather, for any given segment, certain features are more
susceptible to change than others. Taking these ideas into account, we
introduce a novel approach to modeling such segmental variation.
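The contrast drawn above, between flipping features at a single uniform rate and letting some features change more readily than others, can be illustrated with a minimal Python sketch. This is not the approach used in the paper, whose details are not given here; the function names, the six-element feature vector, and the per-feature susceptibility values are all hypothetical and chosen only for illustration.

```python
import random


def flip_uniform(features, p, rng):
    """Flip each binary feature independently with the same probability p."""
    return [1 - f if rng.random() < p else f for f in features]


def flip_weighted(features, susceptibility, rng):
    """Flip each binary feature with its own probability.

    susceptibility[i] is a hypothetical per-feature flip probability:
    0.0 means the feature never changes, larger values mean it is more
    susceptible to change.
    """
    if len(features) != len(susceptibility):
        raise ValueError("one susceptibility value is needed per feature")
    return [1 - f if rng.random() < s else f
            for f, s in zip(features, susceptibility)]


rng = random.Random(0)
segment = [1, 0, 0, 1, 1, 0]                     # made-up feature vector
susceptibility = [0.0, 0.0, 0.3, 0.0, 0.3, 0.0]  # made-up values
noisy = flip_weighted(segment, susceptibility, rng)
```

With uniform flipping every feature of a segment is equally likely to change, whereas in the weighted variant only the features at positions 2 and 4 can ever differ from the original.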
We present results from simulations involving simple recurrent
networks trained on input consisting of segmental features, utterance
boundary information, and lexical stress. The results show that our
model performs well on the segmentation task despite being faced
with input characterized by considerable variation. This outcome is
important because it shows that networks provide a robust mechanism
for the integration of multiple cues even under less idealized
conditions, and it suggests how such integration may form the basis of
early speech segmentation. Similarities between the segmentation task and
other aspects of language acquisition suggest that this notion of
integration may also be usefully applied to the investigation of
learning in other linguistic domains.
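For readers unfamiliar with simple recurrent networks, the following Python sketch shows one forward step of such a network on an input that concatenates segmental features with an utterance boundary unit and a lexical stress unit. It is only an illustration under stated assumptions: the layer sizes, the weight range, the zero initial context, the input layout, and all names are hypothetical, and training (for example by backpropagation) is omitted entirely.

```python
import math
import random


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    """Small random weights; the range is an arbitrary choice."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode(features, boundary, stress):
    """Hypothetical input layout: features, then boundary and stress units."""
    return list(features) + [float(boundary), float(stress)]


class SimpleRecurrentNetwork:
    """Untrained network whose hidden layer is copied back as context."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(n_hidden, n_in, rng)
        self.w_ctx = make_matrix(n_hidden, n_hidden, rng)
        self.w_out = make_matrix(n_out, n_hidden, rng)
        self.context = [0.0] * n_hidden  # assumed initial context

    def step(self, x):
        """Process one input vector and update the context units."""
        net = [a + b for a, b in zip(matvec(self.w_in, x),
                                     matvec(self.w_ctx, self.context))]
        hidden = [sigmoid(n) for n in net]
        self.context = hidden  # the next step sees this hidden state
        return [sigmoid(o) for o in matvec(self.w_out, hidden)]


net = SimpleRecurrentNetwork(n_in=8, n_hidden=5, n_out=8)
x = encode([1, 0, 0, 1, 1, 0], boundary=0, stress=1)
y = net.step(x)
```

Because the context units carry the previous hidden state forward, the output at each step depends on the preceding segments as well as the current one, which is what allows sequential cues to be combined over time.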
References
Aslin, R. N., Woodward, J. Z., LaMendola, N. P., & Bever,
T. G. (1996). Models of word segmentation in fluent maternal speech to
infants. In J. L. Morgan & K. Demuth (Eds.), Signal to Syntax,
pp. 117-134. Mahwah, NJ: Lawrence Erlbaum Associates.
Brent, M. R., & Cartwright, T. A. (1996). Distributional
regularity and phonotactic constraints are useful for
segmentation. Cognition, 61, 93-125.
Cairns, P., Shillcock, R., Chater, N., & Levy, J. (1994).
Lexical segmentation: The role of sequential statistics in supervised
and un-supervised models. In Proceedings of the 16th Annual
Conference of the Cognitive Science Society, pp. 136-141. Hillsdale,
NJ: Lawrence Erlbaum Associates.
Christiansen, M. H., Allen, J., & Seidenberg, M. S. (in
press). Learning to segment speech using multiple cues: A
connectionist model. Language and Cognitive Processes.
