Learning to Segment Speech Using Multiple Cues: A
Connectionist Model
Morten H. Christiansen, Joseph Allen and Mark S. Seidenberg
Abstract
Considerable research in language acquisition has addressed the extent
to which basic aspects of linguistic structure might be identified on
the basis of probabilistic cues in caregiver speech to children. This
type of learning mechanism presents classic learnability issues: there
are aspects of language for which the input is thought to provide no
evidence, and the evidence that does exist tends to be unreliable. We
address these issues in the context of the specific problem of
learning to identify lexical units in speech. A simple recurrent
network was trained on a phoneme prediction task. The model was
explicitly provided with information about phonemes, relative lexical
stress, and boundaries between utterances. Individually, these sources
of information provide relatively unreliable cues to word boundaries
and no direct evidence about actual word boundaries. After training
on a large corpus of child-directed speech, the model was able to use
these cues to reliably identify word boundaries. The model shows that
aspects of linguistic structure that are not overtly marked in the
input can be derived by efficiently combining multiple probabilistic
cues. Connectionist networks provide a plausible mechanism for
acquiring, representing, and combining such probabilistic information.
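
The architecture summarized above can be illustrated with a minimal sketch of a simple recurrent (Elman-style) network performing a forward pass on a next-phoneme prediction task. Everything specific here is an assumption made for illustration and is not taken from the abstract: the toy phoneme inventory, the two stress units, the hidden-layer size, the weight range, and the idea of reading the utterance-boundary output unit as a graded word-boundary signal. The weights are random and untrained, so the sketch shows only how the three cues could share one input vector and how context is carried across time steps.

```python
import math
import random

# Hypothetical toy inventory; a real model would use a full phoneme set.
PHONEMES = ["d", "o", "g", "i", "z"]
N_STRESS = 2                          # assumed: two units for relative stress
N_IN = len(PHONEMES) + N_STRESS + 1   # +1 for the utterance-boundary unit
N_HID = 8                             # assumed hidden-layer size
N_OUT = N_IN                          # the network predicts the next input vector
BOUNDARY = N_IN - 1                   # index of the utterance-boundary unit


def encode(phoneme=None, stress=None, boundary=False):
    """Build one input vector combining phoneme, stress, and boundary cues."""
    v = [0.0] * N_IN
    if phoneme is not None:
        v[PHONEMES.index(phoneme)] = 1.0
    if stress is not None:
        v[len(PHONEMES) + stress] = 1.0
    if boundary:
        v[BOUNDARY] = 1.0
    return v


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_matrix(rows, cols, rng):
    """Small random weights; the range is an arbitrary illustrative choice."""
    return [[rng.uniform(-0.25, 0.25) for _ in range(cols)] for _ in range(rows)]


class SimpleRecurrentNetwork:
    """Untrained Elman-style network: the hidden state is fed back as context."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w_in = init_matrix(N_HID, N_IN, rng)
        self.w_ctx = init_matrix(N_HID, N_HID, rng)
        self.w_out = init_matrix(N_OUT, N_HID, rng)
        self.context = [0.0] * N_HID

    def step(self, x):
        """Consume one input vector and return the prediction for the next one."""
        hidden = [
            sigmoid(
                sum(w * xi for w, xi in zip(self.w_in[j], x))
                + sum(w * c for w, c in zip(self.w_ctx[j], self.context))
            )
            for j in range(N_HID)
        ]
        self.context = hidden  # copy-back: the hidden state becomes the next context
        return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in self.w_out]


def boundary_activations(net, sequence):
    """Activation of the boundary output unit after each input in the sequence."""
    return [net.step(x)[BOUNDARY] for x in sequence]


# A made-up utterance: "dog" then "iz", followed by an utterance boundary.
utterance = [
    encode("d", stress=0), encode("o", stress=0), encode("g", stress=0),
    encode("i", stress=1), encode("z", stress=1),
    encode(boundary=True),
]
net = SimpleRecurrentNetwork()
acts = boundary_activations(net, utterance)
```

After training with a method such as backpropagation (not shown), one would expect the boundary unit's activation to rise at positions where an utterance could plausibly end. Under the read-out assumed in this sketch, those peaks would serve as candidate word boundaries.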