Speech Talk (formerly Splunch) is the phonetics lab's weekly lunch meeting. This is an informal meeting where we discuss current research topics on all things related to phonetics, phonology, and the study of speech and communication in general. The usual format involves short presentations of works in progress. However, we also devote some time to discussion of hot topics in the field.
Our diverse working group involves faculty and students from departments of linguistics, psychology, biology, computer science and cognitive science. Everyone with an interest in speech is welcome to attend.
This study investigated the prosodic encoding and perception of focus in English and Korean. To compare the two languages in the same environment while varying only one factor, prosody, a production experiment used two sets of stimuli: a set of 100 10-digit strings with no corrections (the broad-focus condition) and exactly the same set of 100 10-digit strings embedded in a Q-A dialogue designed to elicit a corrective-focus condition. Each phone number in the broad-focus condition was then directly compared with the same number in the corrective-focus condition using the aggregate measures of maximum F0 (st), mean F0 (st), duration (ms), and mean intensity (dB). In addition, fifteen English and twenty-three Korean listeners took part in a pilot perception experiment to determine whether listeners of each language can correctly identify the corrected item.
In English, corrective focus significantly increased F0, duration, and intensity. Because the prosodic effect of focus was prominent, English listeners identified the corrective-focus digit easily: the overall accuracy rate was 97.1%. In Korean, by contrast, although corrective focus yielded a marginally higher pitch range than its broad-focus counterpart, neither duration nor intensity showed any significant association with corrective focus, and the overall accuracy rate was as low as 41.7%.
The results of this study uncover two important phenomena in Korean. First, the prosodic effect of focus depended on the segment type: a corrective-focus digit was realized with a higher F0 and a longer duration when the word-initial segment was [+spread glottis] or when the word began with a high-toned vowel. These segments can be said to boost the prosodic effect of focus. Second, the prosodic effect of focus varied little with focus position (phrase-initial, phrase-medial, or phrase-final). A corrective-focus digit in phrase-medial position, such as the second digit 6 of 267, was indeed realized with a somewhat higher, albeit marginally so, pitch range than its broad-focus counterpart. However, because of the phrase-final boundary tone in Korean, a digit in phrase-final position usually carried a higher pitch value than one in phrase-medial position. This is why the identification rate turned out to be even lower in phrase-medial position than elsewhere: the focus effect in this position was masked by the subsequent boundary tone.
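The F0 measures above are reported in semitones (st) rather than Hz; the standard conversion is st = 12·log2(f/ref). A minimal sketch of how such a focus-related pitch boost would be computed, with made-up Hz values and an arbitrary reference frequency (neither taken from the study):

```python
import math

def hz_to_semitones(f_hz, ref_hz=100.0):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f_hz / ref_hz)

# Hypothetical maximum-F0 values for the same digit in the two conditions
broad_max = hz_to_semitones(207.0)   # broad focus
focus_max = hz_to_semitones(233.0)   # corrective focus
boost_st = focus_max - broad_max     # focus-related pitch boost in semitones
```

Because semitones are a log scale, the boost is independent of the reference frequency chosen, which is why semitone differences are comparable across speakers with different pitch ranges.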
This study reports two interesting patterns of one recent sound change in Seoul Korean (SK): the high tone on [il] (Jun and Cha 2011). SK is known to show LHLH phrasal tonal patterns in Accentual Phrases (APs), unless an AP starts with an aspirated or tense consonant (Jun 2000). However, Jun and Cha (2011) report that an AP-initial [il] is sometimes realized with a H tone. Their findings are that i) speakers younger than their mid-40s are more likely to produce a H tone on [il]; ii) among the three meanings of [il] ('no. 1', 'day', and 'work'), [il] meaning 'no. 1' is the most frequently H-toned; and iii) the phenomenon is not likely to be due to a glottal stop preceding [il]. The present study finds, however, that there are actually two patterns in the realization of the phenomenon, and that a preceding partial glottal stop is a trigger for the H tone.
I'll try to incorporate a bit of the Oneida stuff, mainly for its underlying idea of the need for a feed-forward model. Then I'll explain how a more robust feed-forward model of the body can work, and what that means for how we think about the mind-body interface for speech - with implications for sound change, phonetic universals, L1 bootstrapping, etc. (if we have time)
Vowel chain shift, a series of related sound changes leading to a rearrangement of phonetic realizations, is a common type of phonetic change across the world's languages, yet its explanation remains controversial: some scholars highlight the self-organizing properties of the vowel space, whereas others emphasize the necessity of maintaining phonetic contrast. In this talk, based primarily on Bart de Boer's self-organizing model and empirical data on vowel chain shifts in the Xumi language, I present an agent-based computer simulation that addresses this controversy and explains how the vowel chain shift in Upper Xumi occurs. The simulation results show that an extended vowel chain shift cannot be explained by self-organization alone; a phonemic contrast-maintenance mechanism is also necessary. Given these two factors, the vowel chain shift in Upper Xumi can be explained as the combined effect of the evolution of the vowel system under noise and the addition of the loan phoneme /ɔ/. This is the first simulation attempt to address the process of an extended vowel shift in relation to real-world data.
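The abstract does not give implementation details, so as a rough illustration of the two mechanisms it contrasts, here is a toy one-dimensional sketch (hypothetical parameters throughout, not de Boer's actual model): noisy imitation provides the self-organizing drift, an explicit repulsion step stands in for contrast maintenance, and inserting a new "loan" prototype perturbs its neighbours.

```python
import random

def simulate(prototypes, rounds=2000, noise=20.0, rate=0.1, min_dist=60.0, seed=1):
    """Toy 1-D (F1, Hz) imitation game.

    Each round one vowel prototype is produced with Gaussian noise and the
    nearest prototype assimilates toward the token (self-organization);
    a repulsion step then pushes apart any prototypes closer than
    min_dist (contrast maintenance).
    """
    rng = random.Random(seed)
    protos = sorted(prototypes)
    for _ in range(rounds):
        token = rng.choice(protos) + rng.gauss(0.0, noise)
        i = min(range(len(protos)), key=lambda k: abs(protos[k] - token))
        protos[i] += rate * (token - protos[i])   # assimilate toward the token
        protos.sort()
        for k in range(len(protos) - 1):          # contrast maintenance
            gap = protos[k + 1] - protos[k]
            if gap < min_dist:
                push = (min_dist - gap) / 2.0
                protos[k] -= push
                protos[k + 1] += push
        protos.sort()
    return protos

# A three-vowel series drifts under noise alone; adding a "loan"
# prototype nearby then forces its neighbours to move (chain shift).
before = simulate([300.0, 500.0, 700.0])
after = simulate(sorted(before + [520.0]))
```

Without the repulsion step the prototypes merely random-walk; with it, the inserted prototype's neighbours are displaced in sequence, which is the intuition behind needing both factors.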
This study presents acoustic evidence of how an individual speaker’s vowels change across the lifespan. Noam Chomsky was chosen as the speaker because he presents an excellent opportunity to study the effect of relocation to a different dialect area on adult phonology: he was born and raised in Philadelphia and moved to Boston at age 26. The study focused on two linguistic variables whose phonemic systems differ between Philadelphia and Boston: 1) the /o/–/oh/ distinction in Philadelphia versus the /o/–/oh/ merger in Boston, and 2) the split short-a system of Philadelphia (a phonemic distinction between tense /æh/ and lax /æ/ with various phonological and lexical conditioning) versus the nasal short-a system of Boston (an allophonic alternation between tense /æh/ before nasals and lax /æ/ before non-nasals) [1]. Two sets of recordings of Chomsky speaking publicly, from 1970 and 2009, were transcribed and force-aligned; F1 and F2 of /o/ and /oh/ (964 tokens) and /æ/ (844 tokens) were then extracted using the FAVE suite [2] and analyzed in R [3].
Mixed effect regression analyses show that Chomsky’s /o/ has significantly raised and backed over 40 years while /oh/ remained stable. Although his /o/ shifted significantly towards /oh/, the two phonemes remained distinct in 2009 (see Figure 1). The raising and backing of /o/ towards /oh/ in the direction of the Boston merger suggests that Chomsky was able to adopt features of a new dialect over time as a result of contact with this second dialect well past the critical period.
Chomsky’s short-a patterns also provide an interesting picture of an adult speaker’s accent shift. In 1970, almost all /æ/ tokens were lax, displaying neither a split nor a nasal system. A closer look at the distribution reveals that i) /æ/ raised and fronted significantly only before front nasals in closed syllables, and ii) /æ/ raised and fronted marginally before voiceless fricatives and in the three lexical items bad, mad, and glad.
Two possible interpretations are offered. First, Chomsky might have abandoned the split short-a system he acquired natively, possibly because of the low social value attached to the extremely tense variant of /æ/; later in life, he might have reverted to the split system, relaxing his adherence to the linguistic norm of the dominant society. Second, it is proposed that the significant raising and fronting of /æ/ resulted from increased coarticulatory nasalization [4]. The nasality of /æ/ before front nasals in closed syllables was measured as A1-P0, an acoustic index of how oral or nasal a sound is; this showed that his nasality in this environment increased significantly between 1970 and 2009 (t=6.7402, p<0.001). We propose that this may be a case of hypo-correction: when restructuring /æ/ as produced by other speakers, Chomsky may have failed to undo the coarticulatory effect, attributing the coarticulatory nasality to raising and fronting of /æ/.
Liaison is a sandhi phenomenon in French that triggers the surfacing of a limited set of latent consonants before certain vowel-initial words. Because French rarely allows bare nouns, children face two challenging tasks: retrieving vowel-initial nouns from conflicting input contexts (les amis => /le.zami/ ‘the friends’, un ami => /œ̃.nami/ ‘a friend’, /lami/ ‘the friend’, /pəti.tami/ ‘boyfriend’, but only rarely /joli.ami/ ‘pretty friend’) and subsequently manipulating these underlying forms to produce well-formed liaison utterances. The recent acquisition literature explores the influence of phonetic cues, statistical strategies, and universal principles of language in helping children resolve the problem of liaison processing and production. This talk will provide a brief overview of current theories of segmentation and production, and present the results of an elicited production task conducted with French-learning children aged 3;0 to investigate the effects of distributional properties of the input and of abstract knowledge on early liaison productions in the nominal domain.
I'll describe these two commonly used speech processing methods, what they are and some ways in which they are used. I'll concentrate on intuitions rather than equations, but can't avoid equations altogether. You are assumed to have met (though not necessarily liked) sines in high school math.
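The blurb doesn't name the two methods, but given the sine prerequisite, a Fourier-style decomposition is presumably part of the intuition. As a taste of the idea (an illustration of my own, not material from the talk), a naive discrete Fourier transform in pure Python that picks the sine components out of a two-component signal:

```python
import math

def dft_magnitudes(signal):
    """Magnitude of each DFT bin, computed naively (O(N^2))."""
    n = len(signal)
    mags = []
    for k in range(n):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
    return mags

# A 64-sample signal built from two sines (bins 3 and 10)
n = 64
sig = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 10 * t / n)
       for t in range(n)]
mags = dft_magnitudes(sig)
```

The magnitude spectrum peaks at exactly the two bins the signal was built from, which is the core intuition: any signal can be described as a recipe of sines.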
Priming is the facilitation in use or recognition of a recently-processed linguistic item. In sociolinguistic variation, facilitation can be understood as an increased probability of reusing a recently-used variant. Clustering effects detected in sociolinguistic variation in conversational speech have generally been attributed to priming (Weiner & Labov 1983, Scherre & Naro 1991, Gries 2005, Szmrecsanyi 2006). However, the possibility that such clustering results simply from style-shifting rather than true facilitation has not been convincingly ruled out. Against the backdrop of the experimental priming literature, I compare the results from a simulation of style-shifting to the observed clustering patterns in ING and TD in the Philadelphia Neighborhood Corpus. I argue that clustering derived from style-shifting and clustering resulting from priming can both be detected and disentangled quantitatively. I then suggest a range of possibilities for studying morphophonological variation experimentally using matched guise and lexical decision techniques.
In this talk I will present some (not quite settled) work on sentential prosody. More specifically, I'll present results from production and perception experiments investigating the intonation of different types of declarative responses: direct and indirect affirmations/contradictions and declarative questions. On the way we'll look at some methods for describing the differences in the data. In particular, I'll briefly present some dabblings with functional principal components analysis. The point of this was basically to see whether these different types of discourse moves are associated with particular prosodic forms, and if so, whether these could be mapped to specific types of information structural units. Various frameworks have attached quite specific meaning mappings to intonational shapes in this way. For example, Büring (2003) has argued that fall-rise type accents specifically mark contrastive topics.
The data collected in this experiment indicate that fall-rise type accents were often produced in indirect response contexts. However, while this type of accent often falls on a contrastive element in the IS ground, it doesn't have to. In fact, the main difference between direct and indirect responses seems to be that indirect responses contain a non-polarity contrast. I argue that this in turn indicates a type of discourse structure (strategy) that would make the response congruent with the current question under discussion. This seems to require prominence, but not a particular accent type (cf. Calhoun 2006). Moreover, a perception experiment indicates that the presence of an actual rise in these fall-rise contexts is not really important. Larger pitch gestures do, however, appear to change the perceived level of speaker engagement. This (happily!) echoes some of my previous work on the prosody/semantics interaction of cue words like `really'.
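For readers unfamiliar with functional PCA: one common, simplified recipe treats each time-normalized pitch contour as a vector and runs ordinary PCA on the set of contours. A toy pure-Python sketch using power iteration to find the first component (not the pipeline used in this work; the contour values below are invented):

```python
import math
import random

def first_fpc(contours, iters=100, seed=0):
    """First principal component of equal-length F0 contours.

    Centers the contours at each time point, then power-iterates
    v <- Cv / |Cv| with C = (1/n) X^T X to find the dominant
    direction of contour variation.
    """
    n, t = len(contours), len(contours[0])
    mean = [sum(c[j] for c in contours) / n for j in range(t)]
    centred = [[c[j] - mean[j] for j in range(t)] for c in contours]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(t)]
    for _ in range(iters):
        scores = [sum(row[j] * v[j] for j in range(t)) for row in centred]
        w = [sum(scores[i] * centred[i][j] for i in range(n)) / n for j in range(t)]
        norm = math.sqrt(sum(x * x for x in w)) or 1.0
        v = [x / norm for x in w]
    return mean, v

# Toy data: rising vs falling contours (F0 in Hz at 5 time points)
rises = [[100, 110, 120, 130, 140], [102, 112, 122, 132, 142]]
falls = [[140, 130, 120, 110, 100], [142, 132, 122, 112, 102]]
mean, pc1 = first_fpc(rises + falls)
```

On this toy data the first component captures the rise/fall contrast: its endpoints have opposite signs and its midpoint is near zero, so each contour's score on it says how "rising" that contour is.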
Montreal French is known for having a process of diphthongization that affects long vowels: both those vowels that are inherently long and those that are lengthened due to a following lengthening consonant ([R] or one of the voiced fricatives). We examine the acoustic correlates of these diphthongs and situate them with regard to the Montreal French vowel system as a whole. We find that, as previously acknowledged, vowels in closed syllables are lower than vowels in open syllables; diphthongized vowels in lengthening contexts display a lowered nucleus and an offglide at roughly the same location as the non-lengthened closed-syllable vowel.
Focusing on two vowels in particular, lengthened [E] and lengthened [ø], we show that lengthening consonants do not affect each vowel identically: any of the aforementioned lengthening consonants will cause a preceding [E] to diphthongize, while only [R] has that effect on [ø] (and thus words such as œuvre and (heu)reuse display no diphthongization). We also examine the effect of length on diphthongization and find that some speakers show a robust correlation between the two: longer vowels tend to display a greater change between nucleus and offglide. Finally, we present results from a longitudinal analysis of several speakers' diphthongization rates over time, examining speakers' potential to alter their vowel systems over their lifetimes.