The Secret Is in the Sound:
From Unsegmented Speech to Lexical Categories
Morten H. Christiansen, Luca Onnis, and Stephen Hockema
Abstract
When learning language, young children are faced with many seemingly formidable
challenges, including discovering words embedded in a continuous stream of
sounds and determining what role these words play in syntactic constructions.
We suggest that knowledge of phoneme distributions may play a crucial
part in helping children segment words and determine their lexical
category. We performed a two-step analysis of a large corpus of English
child-directed speech. First, we used transition probabilities between
phonemes to find words in unsegmented speech. Second, we used
distributional information about word edges (the beginning and ending
phonemes of words) to predict whether the segmented words were nouns,
verbs, or something else. The results indicate that discovering
lexical units and their associated syntactic category in child-directed
speech is possible by attending to the statistics of single phoneme
transitions and word-initial and word-final phonemes. Thus, we propose that
a core computational principle in language acquisition is that the same
source of information is used to learn about different aspects of
linguistic structure.
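
The two-step analysis described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes forward transition probabilities estimated from adjacent phoneme pairs, a fixed boundary threshold (a hypothetical choice), and single letters standing in for phonemes in a toy corpus. The function names transition_probs, segment, and edge_cues are illustrative only, and the classifier that would map edge cues to nouns, verbs, or other is left out because the abstract does not specify it.

```python
from collections import Counter


def transition_probs(utterances):
    """Estimate forward transition probabilities P(b | a) from adjacent symbols."""
    pair_counts = Counter()
    first_counts = Counter()
    for utt in utterances:
        for a, b in zip(utt, utt[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(utt, probs, threshold):
    """Step 1: place a word boundary wherever the transition probability is low.

    The fixed threshold is an assumption made for this sketch.
    """
    words, start = [], 0
    for i in range(1, len(utt)):
        if probs.get((utt[i - 1], utt[i]), 0.0) < threshold:
            words.append(utt[start:i])
            start = i
    words.append(utt[start:])
    return words


def edge_cues(word):
    """Step 2 input: the word-initial and word-final symbols of a segmented word."""
    return word[0], word[-1]


# Toy corpus: letters stand in for phonemes; "ab", "cd", "ef" play the role of words.
utterances = ["abcd", "abef", "cdef", "efab", "cdab", "efcd"]
probs = transition_probs(utterances)

words = segment("abcdef", probs, threshold=0.75)
print(words)                          # ['ab', 'cd', 'ef']
print([edge_cues(w) for w in words])  # [('a', 'b'), ('c', 'd'), ('e', 'f')]
```

In this toy corpus, transitions inside a word always have probability 1.0 and transitions across a word boundary have probability 0.5, so any threshold between the two recovers the words. The edge cues returned for each word are the kind of feature that a category predictor would take as input.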
