Continuous categories strengthen diachronic theory

Whitney Tabor

Fri. 10:00-11:40 B

Early variable rule models (e.g., Weinreich, Labov, and Herzog 1968, Wolfram 1969, Labov 1969) proposed that quantitative information should be included in grammatical description, although they put little constraint on what rules could be variable and what probabilities could be assigned to the rules. Recent studies in historical syntax (e.g., Kroch 1989, Pintzuk 1991, Taylor 1994) push the notion much further by hypothesizing that probabilities are associated with highly abstract syntactic parameters. Thus all the linguistic expressions a given parameter controls are expected to exhibit parallel quantitative change. Even these "Variable Parameter" models, however, stop short of proposing a link between quantitative properties of the data and the choice about which parameters to activate. In this sense, quantitative and categorical information remain neatly separated.

By contrast, Tabor 1995 describes a model which predicts strong interaction between quantitative and categorical features of grammatical representation. This model, implemented in a Connectionist network, may aptly be called a "Continuous Category Model" (CCM), for it replaces the discrete categories of standard grammars with clusters of points in a continuous space. Two predictions about syntactic change distinguish CCMs from the Variable Parameter models: (1) persistent quantitative change can lead to categorical change; (2) the ordering of categorical changes will reflect the distributional similarity structure of the data. That is, if type B is intermediate between type A and type C, then a change from A to C will proceed via B. These two claims are not at odds with the Variable Parameter models, but they reflect a further strengthening of the variationist hypothesis, since those models do not make any predictions about the relative timing of categorical changes. Evidence for claim (1) has been provided by Tabor 1993, 1994, and 1995. Here, I show how data from the history of the English gerund support claim (2).
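
Claim (2) can be made concrete with a small geometric sketch. The code below is not the Connectionist model of Tabor 1995; it is only a toy illustration, assuming that each type is represented by a vector of relative frequencies across syntactic contexts and that "intermediate" means lying close to the straight line between two other types. The frequency values, the `slack` tolerance, and the function names are all hypothetical, chosen for illustration.

```python
import math

# Hypothetical relative frequencies of three construction types across
# four syntactic contexts (invented numbers, for illustration only).
profiles = {
    "A": [0.70, 0.20, 0.10, 0.00],
    "B": [0.40, 0.30, 0.20, 0.10],
    "C": [0.10, 0.30, 0.30, 0.30],
}


def distance(p, q):
    """Euclidean distance between two distributional profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def is_intermediate(a, b, c, slack=1.05):
    """True if b lies roughly between a and c.

    The detour a -> b -> c may be at most `slack` times as long as the
    direct path a -> c (slack is an assumed tolerance, not a fitted value).
    """
    return distance(a, b) + distance(b, c) <= slack * distance(a, c)


def predicted_path(start, goal, profiles):
    """List the types a change from start to goal is expected to pass through."""
    between = [
        name
        for name in profiles
        if name not in (start, goal)
        and is_intermediate(profiles[start], profiles[name], profiles[goal])
    ]
    # Order the intermediate types by how far they are from the starting type.
    between.sort(key=lambda name: distance(profiles[start], profiles[name]))
    return [start, *between, goal]


print(predicted_path("A", "C", profiles))
```

With these invented profiles, B counts as intermediate between A and C, so the sketch orders the change as A, then B, then C. C is not intermediate between A and B, so a change from A to B is predicted to be direct.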

Many researchers have found evidence that over the course of the Middle English period, the gerund acquired an increasingly "verbal" character (e.g., Poutsma 1923, Mosse 1938, Tajima 1985). The -ung ancestor of the modern -ing first spread across a wider and wider set of verbs (late OE), then began to occur in conjunction with prepositional complements (c. 1200), then with direct objects (c. 1300), and ultimately spread to passive and perfect constructions (late 1500s). Abney 1987 notes that this chronology is consistent with a proposal by Jackendoff 1977 that the history of the gerund involved the accommodation of a series of successively more abstract VPs under the scope of the gerundial affix. I show how a CCM implemented in a Connectionist network and trained on the distributional data generated by Jackendoff/Abney's grammar correctly predicts the ordering of the changes. This result provides evidence that measurement of relative similarity (the bread and butter of CCMs) is a crucial ingredient in grammatical representation. It also indicates that Variable Parameter models would do well to replace the parameter independence assumption with a claim about parameter interaction. Finally, it illustrates a method for combining the structural insights of standard, discrete category grammars with the historical modeling advantages of CCMs.