Paper presented at the 4th International Conference on the Evolution of Language. Cambridge, MA.

Multiple-cue integration and the evolution of languages



Rick Dale & Morten H. Christiansen


Abstract

Children learn language rather easily. So easily, in fact, that it remains almost a mystery how the initial hurdles to the acquisition of linguistic structure, seemingly insurmountable in their complexity, can be overcome in such a short time and with such reliability. Recent work in developmental psycholinguistics offers a new approach to understanding this problem: Languages contain statistical cues to different levels of linguistic structure. Individually, these cues are only partially reliable, but when integrated during language acquisition, they offer a wealth of clues to linguistic structure. Christiansen and Dale (2001) offered a computational model of this process. Word length (Kelly & Cassidy, 1991), lexical stress (Shi et al., 1999), and pauses (Fisher & Tokura, 1996) served as cues for simple recurrent networks (SRNs) that were required to learn an English-like language generated by an artificial grammar of child-directed speech. The networks that received all three cues simultaneously performed significantly better than networks with one or no cues.
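As a rough illustration of how cues of this kind could be presented to an SRN alongside the words themselves, the sketch below appends three cue units to a one-hot word vector and runs one forward step of an Elman-style network. It is not the model of Christiansen and Dale (2001): the encoding (syllable count for word length, binary stress and pause units), the logistic hidden units, the softmax output, and the names make_input and SimpleRecurrentNetwork are all assumptions made for the example, and no training is shown.

```python
import math
import random


def make_input(word_index, vocab_size, n_syllables, stressed, pause_follows):
    """One-hot word units followed by three cue units (an assumed encoding)."""
    vec = [0.0] * vocab_size
    vec[word_index] = 1.0
    # Cue units: word length (here in syllables), lexical stress, following pause.
    return vec + [float(n_syllables), float(stressed), float(pause_follows)]


class SimpleRecurrentNetwork:
    """Forward pass only; the weights are random and untrained."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.w_in = init(n_hidden, n_in)       # input -> hidden
        self.w_ctx = init(n_hidden, n_hidden)  # context -> hidden
        self.w_out = init(n_out, n_hidden)     # hidden -> output
        self.context = [0.0] * n_hidden        # copy of the previous hidden state

    def step(self, x):
        hidden = []
        for row_in, row_ctx in zip(self.w_in, self.w_ctx):
            z = sum(w * v for w, v in zip(row_in, x))
            z += sum(w * c for w, c in zip(row_ctx, self.context))
            hidden.append(1.0 / (1.0 + math.exp(-z)))  # logistic unit
        self.context = hidden  # copy-back: hidden state becomes the next context

        logits = [sum(w * h for w, h in zip(row, hidden)) for row in self.w_out]
        peak = max(logits)
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]  # softmax over candidate next words


# Hypothetical usage: a 5-word vocabulary plus 3 cue units gives 8 input units.
net = SimpleRecurrentNetwork(n_in=8, n_hidden=4, n_out=5)
prediction = net.step(make_input(2, 5, n_syllables=2, stressed=True, pause_follows=False))
```

Networks with one or no cues would, under this encoding, simply receive fewer cue units in the input vector.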

Much evidence suggests that such cues are present cross-linguistically (Kelly, 1992), and that they are manifested in different combinations or "cue constellations." Our hypothesis is that, in order for languages to increase their linguistic complexity without compromising learnability, they have evolved constellations of cues that reflect their respective structures and cater to the cognitive constraints imposed by the child's learning mechanisms. In this talk, we offer computational simulations that take a first step toward showing the emergence of multiple-cue integration in language evolution.

A phrase-structure grammar template was used to generate sentences, along with associated cues, for the training of SRNs. The template incorporated a number of parameters that could be "mutated" across generations of networks. These parameters included (1) head ordering of phrase-structure rules (right-headed vs. left-headed rules), (2) places at which pauses could delimit structure (sentence vs. noun-phrase pauses), and (3) lexical cues (units devoted to words in the grammar). In each generation of networks, the template instantiation from which the SRNs acquired the most linguistic structure was selected as the winning grammar, and some of its parameters were then mutated. Three such grammar templates, differing in lexical complexity, formed the basis of three separate runs of this simulation.
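The select-and-mutate procedure described above can be sketched as a simple generational loop. This is a minimal sketch under stated assumptions, not the simulation code: the GrammarParams fields, the mutation rate, the number of variants per generation, and the choice to keep the previous winner among the candidates are all hypothetical. In particular, toy_fitness is an arbitrary stand-in for the step in which SRNs are trained on sentences generated from each instantiation and scored on how much linguistic structure they acquire.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GrammarParams:
    head_final: tuple     # one bool per phrase-structure rule; True = right-headed
    sentence_pause: bool  # pause delimits sentences
    np_pause: bool        # pause delimits noun phrases
    lexical_cue: bool     # lexical-level cue is present


def mutate(params, rng, rate=0.25):
    """Flip each parameter independently with probability `rate` (assumed)."""
    def flip(value):
        return (not value) if rng.random() < rate else value

    return GrammarParams(
        head_final=tuple(flip(h) for h in params.head_final),
        sentence_pause=flip(params.sentence_pause),
        np_pause=flip(params.np_pause),
        lexical_cue=flip(params.lexical_cue),
    )


def evolve(start, fitness, generations=10, offspring=5, seed=0):
    """Each generation: mutate the current winner, score all candidates, keep the best."""
    rng = random.Random(seed)
    winner = start
    history = []
    for _ in range(generations):
        # Assumption: the previous winner competes against its mutated variants.
        candidates = [winner] + [mutate(winner, rng) for _ in range(offspring)]
        winner = max(candidates, key=fitness)
        history.append(fitness(winner))
    return winner, history


def toy_fitness(params):
    """Placeholder score only; NOT a measure of SRN learning performance."""
    n_right = sum(params.head_final)
    consistency = max(n_right, len(params.head_final) - n_right)
    return consistency + params.sentence_pause + params.np_pause + params.lexical_cue


# Hypothetical usage with four phrase-structure rules and no cues to begin with.
start = GrammarParams((True, False, True, False), False, False, False)
winner, history = evolve(start, toy_fitness)
```

Replacing toy_fitness with a function that trains networks and returns their score would turn the loop into a simulation of the kind described; because the previous winner is retained here, the recorded score cannot decrease from one generation to the next, which need not hold in the actual simulations.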

Our results offer clues about the relevance of different cues in the evolution of individual languages. First, in all three simulations, network performance improved across generations. Second, pauses were consistently located regardless of lexical complexity. This suggests that cues relevant to syntactic phrases are important for emerging languages even when vocabulary is impoverished. Third, the lexical-level cue fluctuated significantly more in the lexically simplest language, and became consistent in the most complex. It seems, therefore, that certain cues are more consistently exploited as languages become more complex. To conclude, we argue that cues emerge to service growing linguistic structure. Fueled by constraints on learning, cue integration becomes a vehicle for facilitating the acquisition of complex linguistic structure. Languages employing cues thus become more likely to survive the processes of cultural transmission across generations.


