Speech Talk (formerly Splunch) is the phonetics lab's weekly lunch meeting. This is an informal meeting where we discuss current research topics on all things related to phonetics, phonology, and the study of speech and communication in general. The usual format involves short presentations of works in progress. However, we also devote some time to discussion of hot topics in the field.
Our diverse working group involves faculty and students from departments of linguistics, psychology, biology, computer science and cognitive science. Everyone with an interest in speech is welcome to attend.
This study investigated the prosodic encoding and perception of focus in English and Korean. To compare the two languages in the same environment while varying only one factor, prosody, a production experiment used two sets of stimuli: a set of 100 10-digit strings with no corrections (the broad-focus condition) and exactly the same set of 100 10-digit strings embedded in a Q-A dialogue designed to elicit a corrective-focus condition. Each phone number in the broad-focus condition was then directly compared with the same number in the corrective-focus condition using the aggregate measures of maximum F0 (st), mean F0 (st), duration (ms), and mean intensity (dB). In addition, fifteen English and twenty-three Korean listeners took part in a pilot perception experiment to determine whether listeners of each language can correctly identify the corrected item.
In English, corrective focus significantly increased F0, duration, and intensity. Because the prosodic effect of focus was prominent, English listeners identified the corrective-focus digit easily: the overall accuracy rate was 97.1%. In Korean, by contrast, although corrective focus yielded a marginally higher pitch range than its broad-focus counterpart, neither duration nor intensity showed any significant association with corrective focus, and the overall accuracy rate was as low as 41.7%.
The results of this study uncover two important phenomena in Korean. First, the prosodic effect of focus depended on the segment type: a corrective-focus digit was realized with a higher F0 and a longer duration when the word-initial segment was [+spread glottis] or when the word began with a high-toned vowel. These segments can be said to boost the prosodic effect of focus. Second, the prosodic effect of focus varied little with focus position (phrase-initial, phrase-medial, or phrase-final). A corrective-focus digit in phrase-medial position, such as the second digit 6 of 267, was indeed realized with a somewhat higher, albeit marginally so, pitch range than its broad-focus counterpart. However, because of the phrase-final boundary tone in Korean, a digit in phrase-final position usually carried a higher pitch value than one in phrase-medial position. This is why the identification rate turned out to be even lower in phrase-medial position than elsewhere: the focus effect in this position was masked by the subsequent boundary tone.
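The F0 measures above are reported in semitones (st) rather than Hz; the standard conversion is st = 12·log2(f/ref). A minimal sketch of how such a focus-related pitch boost would be computed, with made-up Hz values and an arbitrary reference frequency (neither taken from the study):

```python
import math

def hz_to_semitones(f_hz, ref_hz=100.0):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f_hz / ref_hz)

# Hypothetical maximum-F0 values for the same digit in the two conditions
broad_max = hz_to_semitones(207.0)   # broad focus
focus_max = hz_to_semitones(233.0)   # corrective focus
boost_st = focus_max - broad_max     # focus-related pitch boost in semitones
```

Because semitones are a log scale, the boost is independent of the reference frequency chosen, which is why semitone differences are comparable across speakers with different pitch ranges.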
This study reports two interesting patterns of one recent sound change in Seoul Korean (SK): the high tone on [il] (Jun and Cha 2011). SK is known to show LHLH phrasal tonal patterns in Accentual Phrases (APs), unless an AP starts with an aspirated or tense consonant (Jun 2000). However, Jun and Cha (2011) report that an AP-initial [il] is sometimes realized with a H tone. Their findings are that i) speakers younger than their mid-40s are more likely to produce a H tone on [il]; ii) among the three meanings of [il] ('no. 1', 'day', and 'work'), [il] meaning 'no. 1' is the most frequently H-toned; and iii) the phenomenon is not likely to be due to a glottal stop preceding [il]. The present study finds, however, that there are actually two patterns in the realization of the phenomenon, and that a preceding partial glottal stop is a trigger for the H tone.
I'll try to incorporate a bit of the Oneida stuff, mainly for its underlying idea of the need for a feed-forward model. Then I'll explain how a more robust feed-forward model of the body can work, and what that means for how we think about the mind-body interface for speech - with implications for sound change, phonetic universals, L1 bootstrapping, etc. (if we have time)
Vowel chain shift, a series of related sound changes leading to a rearrangement of phonetic realizations, is a common type of phonetic change across the world's languages, yet its explanation remains controversial: some scholars highlight the self-organizing properties of the vowel space, whereas others emphasize the necessity of maintaining phonetic contrast. In this talk, based primarily on Bart de Boer's self-organizing model and empirical data on vowel chain shifts in the Xumi language, I present an agent-based computer simulation that addresses this controversy and explains how the vowel chain shift in Upper Xumi occurs. The simulation results show that an extended vowel chain shift cannot be explained by self-organization alone; a phonemic contrast-maintenance mechanism is also necessary. Given these two factors, the vowel chain shift in Upper Xumi can be explained as the combined effect of the evolution of the vowel system under noise and the addition of the loan phoneme /ɔ/. This is the first simulation attempt to address the process of an extended vowel shift in relation to real-world data.
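The abstract does not give implementation details, so as a rough illustration of the two mechanisms it contrasts, here is a toy one-dimensional sketch (hypothetical parameters throughout, not de Boer's actual model): noisy imitation provides the self-organizing drift, an explicit repulsion step stands in for contrast maintenance, and inserting a new "loan" prototype perturbs its neighbours.

```python
import random

def simulate(prototypes, rounds=2000, noise=20.0, rate=0.1, min_dist=60.0, seed=1):
    """Toy 1-D (F1, Hz) imitation game.

    Each round one vowel prototype is produced with Gaussian noise and the
    nearest prototype assimilates toward the token (self-organization);
    a repulsion step then pushes apart any prototypes closer than
    min_dist (contrast maintenance).
    """
    rng = random.Random(seed)
    protos = sorted(prototypes)
    for _ in range(rounds):
        token = rng.choice(protos) + rng.gauss(0.0, noise)
        i = min(range(len(protos)), key=lambda k: abs(protos[k] - token))
        protos[i] += rate * (token - protos[i])   # assimilate toward the token
        protos.sort()
        for k in range(len(protos) - 1):          # contrast maintenance
            gap = protos[k + 1] - protos[k]
            if gap < min_dist:
                push = (min_dist - gap) / 2.0
                protos[k] -= push
                protos[k + 1] += push
        protos.sort()
    return protos

# A three-vowel series drifts under noise alone; adding a "loan"
# prototype nearby then forces its neighbours to move (chain shift).
before = simulate([300.0, 500.0, 700.0])
after = simulate(sorted(before + [520.0]))
```

Without the repulsion step the prototypes merely random-walk; with it, the inserted prototype's neighbours are displaced in sequence, which is the intuition behind needing both factors.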
This study presents acoustic evidence of how an individual speaker’s vowels change across the lifespan. Noam Chomsky was chosen as the speaker because he presents an excellent opportunity to study the effect of relocation to a different dialect area on adult phonology: he was born and raised in Philadelphia and moved to Boston at age 26. The study focused on two linguistic variables whose phonemic systems differ between Philadelphia and Boston: 1) the /o/–/oh/ distinction in Philadelphia versus the /o/–/oh/ merger in Boston, and 2) the split short-a system of Philadelphia (a phonemic distinction between tense /æh/ and lax /æ/ with various phonological and lexical conditioning) versus the nasal short-a system of Boston (an allophonic alternation between tense /æh/ before nasals and lax /æ/ before non-nasals) [1]. Two sets of recordings of Chomsky speaking publicly, from 1970 and 2009, were transcribed and force-aligned; F1 and F2 of /o/ and /oh/ (964 tokens) and /æ/ (844 tokens) were then extracted using the FAVE suite [2] and analyzed in R [3].
Mixed effect regression analyses show that Chomsky’s /o/ has significantly raised and backed over 40 years while /oh/ remained stable. Although his /o/ shifted significantly towards /oh/, the two phonemes remained distinct in 2009 (see Figure 1). The raising and backing of /o/ towards /oh/ in the direction of the Boston merger suggests that Chomsky was able to adopt features of a new dialect over time as a result of contact with this second dialect well past the critical period.
Chomsky’s short-a patterns also provide an interesting picture of an adult speaker’s accent shift. In 1970, almost all /æ/ tokens were lax, displaying neither a split nor a nasal system. A closer look at the distribution reveals that i) /æ/ raised and fronted significantly only before front nasals in closed syllables, and ii) /æ/ raised and fronted marginally before voiceless fricatives and in the three lexical items bad, mad, and glad.
Two possible interpretations are offered. First, Chomsky might have abandoned the split short-a system he acquired natively, possibly because of the low social value attached to the extremely tense variant of /æ/; later in life, he might have reverted to the split system, relaxing his adherence to the linguistic norm of the dominant society. Second, it is proposed that the significant raising and fronting of /æ/ resulted from increased coarticulatory nasalization [4]. The nasality of /æ/ before front nasals in closed syllables was measured as A1-P0, an acoustic index of how oral or nasal a sound is; this showed that his nasality in this environment increased significantly between 1970 and 2009 (t=6.7402, p<0.001). We propose that this may be a case of hypo-correction: when restructuring /æ/ as produced by other speakers, Chomsky may have failed to undo the coarticulatory effect, attributing the coarticulatory nasality to raising and fronting of /æ/.
Liaison is a sandhi phenomenon in French that triggers the surfacing of a limited set of latent consonants before certain vowel-initial words. Because French rarely allows bare nouns, children face two challenging tasks: retrieving vowel-initial nouns from conflicting input contexts (les amis => /le.zami/ ‘the friends’, un ami => /œ̃.nami/ ‘a friend’, /lami/ ‘the friend’, /pəti.tami/ ‘boyfriend’, but only rarely /joli.ami/ ‘pretty friend’) and subsequently manipulating these underlying forms to produce well-formed liaison utterances. The recent acquisition literature explores the influence of phonetic cues, statistical strategies, and universal principles of language in helping children resolve the problem of liaison processing and production. This talk will provide a brief overview of current theories of segmentation and production, and present the results of an elicited production task conducted with French-learning children aged 3;0 to investigate the effects of distributional properties of the input and of abstract knowledge on early liaison productions in the nominal domain.
I'll describe these two commonly used speech processing methods, what they are and some ways in which they are used. I'll concentrate on intuitions rather than equations, but can't avoid equations altogether. You are assumed to have met (though not necessarily liked) sines in high school math.
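The blurb doesn't name the two methods, but given the sine prerequisite, a Fourier-style decomposition is presumably part of the intuition. As a taste of the idea (an illustration of my own, not material from the talk), a naive discrete Fourier transform in pure Python that picks the sine components out of a two-component signal:

```python
import math

def dft_magnitudes(signal):
    """Magnitude of each DFT bin, computed naively (O(N^2))."""
    n = len(signal)
    mags = []
    for k in range(n):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
    return mags

# A 64-sample signal built from two sines (bins 3 and 10)
n = 64
sig = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 10 * t / n)
       for t in range(n)]
mags = dft_magnitudes(sig)
```

The magnitude spectrum peaks at exactly the two bins the signal was built from, which is the core intuition: any signal can be described as a recipe of sines.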
Priming is the facilitation in use or recognition of a recently-processed linguistic item. In sociolinguistic variation, facilitation can be understood as an increased probability of reusing a recently-used variant. Clustering effects detected in sociolinguistic variation in conversational speech have generally been attributed to priming (Weiner & Labov 1983, Scherre & Naro 1991, Gries 2005, Szmrecsanyi 2006). However, the possibility that such clustering results simply from style-shifting rather than true facilitation has not been convincingly ruled out. Against the backdrop of the experimental priming literature, I compare the results from a simulation of style-shifting to the observed clustering patterns in ING and TD in the Philadelphia Neighborhood Corpus. I argue that clustering derived from style-shifting and clustering resulting from priming can both be detected and disentangled quantitatively. I then suggest a range of possibilities for studying morphophonological variation experimentally using matched guise and lexical decision techniques.
In this talk I will present some (not quite settled) work on sentential prosody. More specifically, I'll present results from production and perception experiments investigating the intonation of different types of declarative responses: direct and indirect affirmations/contradictions and declarative questions. On the way we'll look at some methods for describing the differences in the data. In particular, I'll briefly present some dabblings with functional principal components analysis. The point of this was basically to see whether these different types of discourse moves are associated with particular prosodic forms, and if so, whether these could be mapped to specific types of information structural units. Various frameworks have attached quite specific meaning mappings to intonational shapes in this way. For example, Büring (2003) has argued that fall-rise type accents specifically mark contrastive topics.
The data collected in this experiment indicate that fall-rise type accents were often produced in indirect response contexts. However, while this type of accent often falls on a contrastive element in the IS ground, it doesn't have to. In fact, the main difference between direct and indirect responses seems to be that indirect responses contain a non-polarity contrast. I argue that this in turn indicates a type of discourse structure (strategy) that would make the response congruent with the current question under discussion. This seems to require prominence, but not a particular accent type (cf. Calhoun 2006). Moreover, a perception experiment indicates that the presence of an actual rise in these fall-rise contexts is not really important. Larger pitch gestures do, however, appear to change the perceived level of speaker engagement. This (happily!) echoes some of my previous work on the prosody/semantics interaction of cue words like `really'.
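For readers unfamiliar with functional PCA: one common, simplified recipe treats each time-normalized pitch contour as a vector and runs ordinary PCA on the set of contours. A toy pure-Python sketch using power iteration to find the first component (not the pipeline used in this work; the contour values below are invented):

```python
import math
import random

def first_fpc(contours, iters=100, seed=0):
    """First principal component of equal-length F0 contours.

    Centers the contours at each time point, then power-iterates
    v <- Cv / |Cv| with C = (1/n) X^T X to find the dominant
    direction of contour variation.
    """
    n, t = len(contours), len(contours[0])
    mean = [sum(c[j] for c in contours) / n for j in range(t)]
    centred = [[c[j] - mean[j] for j in range(t)] for c in contours]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(t)]
    for _ in range(iters):
        scores = [sum(row[j] * v[j] for j in range(t)) for row in centred]
        w = [sum(scores[i] * centred[i][j] for i in range(n)) / n for j in range(t)]
        norm = math.sqrt(sum(x * x for x in w)) or 1.0
        v = [x / norm for x in w]
    return mean, v

# Toy data: rising vs falling contours (F0 in Hz at 5 time points)
rises = [[100, 110, 120, 130, 140], [102, 112, 122, 132, 142]]
falls = [[140, 130, 120, 110, 100], [142, 132, 122, 112, 102]]
mean, pc1 = first_fpc(rises + falls)
```

On this toy data the first component captures the rise/fall contrast: its endpoints have opposite signs and its midpoint is near zero, so each contour's score on it says how "rising" that contour is.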
Montreal French is known for having a process of diphthongization that affects long vowels: both those vowels that are inherently long and those that are lengthened due to a following lengthening consonant ([R] or one of the voiced fricatives). We examine the acoustic correlates of these diphthongs and situate them with regard to the Montreal French vowel system as a whole. We find that, as previously acknowledged, vowels in closed syllables are lower than vowels in open syllables; diphthongized vowels in lengthening contexts display a lowered nucleus and an offglide at roughly the same location as the non-lengthened closed-syllable vowel.
Focusing on two vowels in particular, lengthened [E] and lengthened [ø], we show that lengthening consonants do not affect each vowel identically: any of the aforementioned lengthening consonants will cause a preceding [E] to diphthongize, while only [R] has that effect on [ø] (and thus words such as œuvre and (heu)reuse display no diphthongization). We also examine the effect of length on diphthongization and find that some speakers show a robust correlation between the two: longer vowels tend to display a greater change between nucleus and offglide. Finally, we present results from a longitudinal analysis of several speakers' diphthongization rates over time, examining speakers' potential to alter their vowel systems over their lifetimes.