In M.H. Christiansen & N. Chater (Eds.), Connectionist psycholinguistics (pp. 1-15). Westport, CT: Ablex.

Connectionist psycholinguistics: The very idea



Morten H. Christiansen


Introduction

What is the significance of connectionist models of language processing? Will connectionism ultimately replace, complement or simply implement the symbolic approach to language? Early connectionist models attempted to address this issue by showing that connectionist models could, in principle, capture aspects of language processing and linguistic structure. Little attention was generally paid to the modeling of data from psycholinguistic experiments. However, we suggest that connectionist language processing has matured and that the field is now moving forward into a new phase in which closer attention is paid to detailed psycholinguistic data. This book provides the first comprehensive overview of work within the emergent field of "connectionist psycholinguistics"--connectionist models that make close contact with psycholinguistic results.

But how are we to assess the models within this emerging new area of research? We suggest that computational models of psycholinguistic processing, whether connectionist or symbolic, should attempt to fulfill three criteria: a) data contact, b) task veridicality, and c) input representativeness (Christiansen & Chater, 2000). Data contact refers to the degree to which a model provides a fit with psycholinguistic data. We distinguish here between primary and secondary data contact. Primary data contact involves fitting results from specific psycholinguistic experiments (e.g., reaction time data), whereas secondary data contact involves fitting general patterns of behavior (e.g., experimentally attested developmental changes in language processing) rather than specific results. Task veridicality refers to the degree of match between the task facing people and the task given to the model. Although a precise match is typically difficult to obtain, it is important to minimize the discrepancy. For example, much early work on modeling the English past tense suffers from low task veridicality (e.g., Rumelhart & McClelland, 1986; but see, e.g., Hoeffner, 1997, for an exception) because models are trained to map verb stems to past tense forms, a task unlikely to be relevant to children's language acquisition. Input representativeness refers to the degree to which the information given to the model reflects what is available to a person or child. For example, the computational modeling of morphology suffers from a lack of training corpora with high input representativeness. This problem is most serious for non-English morphology, making it problematic to draw a priori conclusions about the feasibility of connectionist accounts in the area (e.g., Berent, Pinker & Shimron, 1999).
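One simple way to picture primary data contact is as a correlation between a model's per-condition output and the corresponding human measurements. The sketch below is only an illustration of that idea: the function name `pearson_r`, the variable names, and all of the numbers are made up for the example and are not taken from any experiment or from the models discussed in this volume.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical values, one per experimental condition (assumption, not real data).
human_rt_ms = [420.0, 455.0, 510.0, 610.0]  # mean human reaction times
model_error = [0.10, 0.14, 0.22, 0.35]      # model output error in the same conditions

# A high positive value would indicate that conditions that are hard for the
# model are also slow for people, i.e., a close fit to the specific results.
primary_fit = pearson_r(model_error, human_rt_ms)
print(round(primary_fit, 3))
```

Secondary data contact would be assessed more loosely, for instance by checking that the model shows the same qualitative ordering or developmental trend as people, rather than by fitting the specific values.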

It is also important to take stock of where symbolic models stand on our three criteria for computational psycholinguistics. Interestingly, few symbolic models make direct contact with psycholinguistic data. Most of the exceptions are within the study of sentence processing, where some comprehensive models of word-by-word reading times exist (e.g., Gibson, 1998; Just & Carpenter, 1992) and have a reasonable degree of task veridicality. More generally, however, symbolic models appear to pay little attention to task veridicality. Indeed, the rule-based models of the English past tense (e.g., Pinker, 1991) involve the same stem-to-past-tense mappings as the early connectionist models, and thus suffer from the same low task veridicality. Input representativeness is often ignored in symbolic models, in part because learning plays a minimal role in the performance of these models, and in part because symbolic models tend to be focused on more abstract fragments of language, rather than the more realistic language input that some connectionist models can handle. Low input representativeness may, for these reasons, actually inflate performance for many types of symbolic models, whereas the opposite tends to be true of connectionist models.

Currently, then, connectionism appears to provide a better framework for detailed psycholinguistic modeling than does the symbolic approach. For many connectionists, the advantages of this framework for doing computational psycholinguistics derive from a number of properties of the connectionist models:

Learning. Connectionist networks typically learn from experience, rather than being fully prespecified by a designer. By contrast, symbolic computational systems, including those concerned with language processing, are typically, but not always, fully specified by the designer.

Generalization. Few aspects of language are simple enough to be learnable by rote. The ability of networks to generalize to cases on which they have not been trained is thus a critical test for many connectionist models.

Representation. Because they are able to learn, the internal codes used by connectionist networks need not be fully specified by a designer, but are devised by the network so as to be appropriate for the task. Developing methods for understanding the codes that the network develops is an important strand of connectionist research. While internal codes may be learned, the inputs and outputs to a network generally use a code specified by the designer. These codes can be crucial in determining network performance. How these codes relate to standard symbolic representations of language in linguistics is a major point of contention.

Rules vs. Exceptions. Many aspects of language can be described in terms of what have been termed "quasi-regularities"--regularities which are usually true, but which admit some exceptions. According to the symbolic descriptions used by modern linguistics, these quasi-regularities may be captured in terms of a set of symbolic rules, and sets of exceptions to those rules. Symbolic models often incorporate this distinction by having separate mechanisms which deal with rule-governed and exceptional cases. It has been argued that connectionist models provide a single mechanism which can pick up general rules, while learning the exceptions to those rules. While this issue has been a major point of controversy surrounding connectionist models, it is important to note that attempting to provide single mechanisms for rules and exceptions is not essential to the connectionist approach; one or both separate mechanisms for rules and exceptions could themselves be modeled in connectionist terms (Coltheart, Curtis, Atkins & Haller, 1993; Pinker, 1991; Pinker & Prince, 1988). A further question is whether networks really learn rules at all, or whether they simply approximate rule-like behavior. Opinions differ concerning whether the latter is an important positive proposal, which may lead to a revision of the role of rules in linguistics (Rumelhart & McClelland, 1986; see also Smolensky, 1988), or whether it is a fatal problem with connectionist models of language processing (Marcus, 1998; Pinker & Prince, 1988).
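The symbolic idea of separate mechanisms for rule-governed and exceptional cases can be made concrete with the English past tense. The following is a deliberately toy sketch, not a rendering of any published model: the names `IRREGULAR_PAST`, `regular_past`, and `past_tense` are hypothetical, the exception list is tiny, and the "rule" ignores most of English spelling and all of its phonology.

```python
# Toy list of stored exceptions (assumption: a real lexicon would be far larger).
IRREGULAR_PAST = {"go": "went", "sing": "sang", "take": "took"}

def regular_past(stem):
    """Default rule: add the -ed suffix, with one simple spelling adjustment."""
    if stem.endswith("e"):
        return stem + "d"
    return stem + "ed"

def past_tense(stem):
    """Two separate mechanisms: look up stored exceptions, else apply the rule."""
    if stem in IRREGULAR_PAST:
        return IRREGULAR_PAST[stem]
    return regular_past(stem)

for verb in ["walk", "bake", "go", "blick"]:
    print(verb, "->", past_tense(verb))
```

Note that the rule applies to the nonce stem "blick" without any stored entry, which is the sense in which the rule route generalizes. A single-mechanism connectionist account would instead have one trained network produce both the regular and the irregular forms. The sketch also takes the stem-to-past-tense mapping as its task, so it inherits the low task veridicality discussed earlier.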

These four properties all play an important role in the models described in Part I of this volume as well as in the appraisals of connectionist psycholinguistics presented in Part II.

