Journal of the Acoustical Society of America

Oct 2007

Volume 122, Issue 4, pp. 1845-EL141

A role for the second subglottal resonance in lexical access

Steven M. Lulich, Asaf Bachrach, and Nicolas Malyska

J. Acoust. Soc. Am. Volume 122, Issue 4, pp. 2320-2327 (2007); (8 pages) | Cited 2 times

Abstract:
Acoustic coupling between the vocal tract and the lower (subglottal) airway results in the introduction of pole-zero pairs corresponding to resonances of the uncoupled lower airway. If the second formant (F2) passes through the second subglottal resonance a discontinuity in amplitude occurs. This work explores the hypothesis that this F2 discontinuity affects how listeners perceive the distinctive feature [back] in transitions from a front vowel (high F2) to a labial stop (low F2). Two versions of the utterances “apter” and “up there” were synthesized with an F2 discontinuity at different locations in the initial VC transition. Subjects heard portions of the utterances with and without the discontinuity, and were asked to identify whether the utterances were real words or not. Results show that the frequency of the F2 discontinuity in an utterance influences the perception of backness in the vowel. Discontinuities of this sort are proposed to play a role in shaping vowel inventories in the world’s languages [ K. N. Stevens, J. Phonetics 17, 3–46 (1989) ]. The results support a model of lexical access in which articulatory-acoustic discontinuities subserve phonological feature identification.
PACS:
43.71.An Models and theories of speech perception
43.71.Es Vowel and consonant perception; perception of words, sentences, and fluent speech
43.70.Mn Relations between speech production and perception
43.70.Bk Models and theories of speech production

Dynamic spectral structure specifies vowels for children and adults

Susan Nittrouer

J. Acoust. Soc. Am. Volume 122, Issue 4, pp. 2328-2339 (2007); (12 pages) | Cited 2 times

Abstract:
When it comes to making decisions regarding vowel quality, adults seem to weight dynamic syllable structure more strongly than static structure, although disagreement exists over the nature of the most relevant kind of dynamic structure: spectral change intrinsic to the vowel or structure arising from movements between consonant and vowel constrictions. Results have been even less clear regarding the signal components children use in making vowel judgments. In this experiment, listeners of four different ages (adults, and 3-, 5-, and 7-year-old children) were asked to label stimuli that sounded either like steady-state vowels or like CVC syllables which sometimes had middle sections masked by coughs. Four vowel contrasts were used, crossed for type (front/back or closed/open) and consonant context (strongly or only slightly constraining of vowel tongue position). All listeners recognized vowel quality with high levels of accuracy in all conditions, but children were disproportionately hampered by strong coarticulatory effects when only steady-state formants were available. Results clarified past studies, showing that dynamic structure is critical to vowel perception for all aged listeners, but particularly for young children, and that it is the dynamic structure arising from vocal-tract movement between consonant and vowel constrictions that is most important.
PACS:
43.71.An Models and theories of speech perception
43.71.Es Vowel and consonant perception; perception of words, sentences, and fluent speech
43.71.Ft Development of speech perception

A study of regressive place assimilation in spontaneous speech and its implications for spoken word recognition

Laura C. Dilley and Mark A. Pitt

J. Acoust. Soc. Am. Volume 122, Issue 4, pp. 2340-2353 (2007); (14 pages) | Cited 1 time

Abstract:
Regressive place assimilation is a form of pronunciation variation in which a word-final alveolar sound takes the place of articulation of a following labial or velar sound, as when green boat is pronounced greem boat. How listeners recover the intended word (e.g., green, given greem) has been a major focus of spoken word recognition theories. However, the extent to which this variation occurs in casual, unscripted speech has previously not been reported. Two studies of pronunciation variation were conducted using a spontaneous speech corpus. First, phonetic labeling data were used to identify contexts in which assimilation could occur, namely, when a word-final alveolar stop (/t/, /d/, or /n/) was followed by a velar or labial consonant. Assimilation was indicated relatively infrequently, while deletion, glottalization, or canonical pronunciations were more often indicated. Moreover, lexical frequency was shown to affect pronunciation; high frequency lexical items showed more types of variation. Second, acoustic analyses showed that neither place of articulation cues (indicated by second formant variation) nor relative amplitude was sufficient to distinguish assimilated from deleted and canonical variants; only when closure duration was additionally taken into account were these three variant types distinguishable. Implications for theories of word recognition are discussed.
PACS:
43.71.An Models and theories of speech perception
43.70.Fq Acoustical correlates of phonetic segments and suprasegmental properties: stress, timing, and intonation
43.71.Es Vowel and consonant perception; perception of words, sentences, and fluent speech

When and why listeners disagree in voice quality assessment tasks

Jody Kreiman, Bruce R. Gerratt, and Mika Ito

J. Acoust. Soc. Am. Volume 122, Issue 4, pp. 2354-2364 (2007); (11 pages) | Cited 7 times

Abstract:
Modeling sources of listener variability in voice quality assessment is the first step in developing reliable, valid protocols for measuring quality, and provides insight into the reasons that listeners disagree in their quality assessments. This study examined the adequacy of one such model by quantifying the contributions of four factors to interrater variability: instability of listeners’ internal standards for different qualities, difficulties isolating individual attributes in voice patterns, scale resolution, and the magnitude of the attribute being measured. One hundred twenty listeners in six experiments assessed vocal quality in tasks that differed in scale resolution, in the presence/absence of comparison stimuli, and in the extent to which the comparison stimuli (if present) matched the target voices. These factors accounted for 84.2% of the variance in the likelihood that listeners would agree exactly in their assessments. Providing listeners with comparison stimuli that matched the target voices doubled the likelihood that they would agree exactly. Listeners also agreed significantly better when assessing quality on continuous versus six-point scales. These results indicate that interrater variability is an issue of task design, not of listener unreliability.
PACS:
43.71.Bp Perception of voice and talker characteristics

Contribution of consonant versus vowel information to sentence intelligibility for young normal-hearing and elderly hearing-impaired listeners

Diane Kewley-Port, T. Zachary Burkle, and Jae Hee Lee

J. Acoust. Soc. Am. Volume 122, Issue 4, pp. 2365-2375 (2007); (11 pages) | Cited 18 times

Abstract:
The purpose of this study was to examine the contribution of information provided by vowels versus consonants to sentence intelligibility in young normal-hearing (YNH) and typical elderly hearing-impaired (EHI) listeners. Sentences were presented in three conditions, unaltered or with either the vowels or the consonants replaced with speech shaped noise. Sentences from male and female talkers in the TIMIT database were selected. Baseline performance was established at a 70 dB SPL level using YNH listeners. Subsequently EHI and YNH participants listened at 95 dB SPL. Participants listened to each sentence twice and were asked to repeat the entire sentence after each presentation. Words were scored correct if identified exactly. Average performance for unaltered sentences was greater than 94%. Overall, EHI listeners performed more poorly than YNH listeners. However, vowel-only sentences were always significantly more intelligible than consonant-only sentences, usually by a ratio of 2:1 across groups. In contrast to written English or words spoken in isolation, these results demonstrated that for spoken sentences, vowels carry more information about sentence intelligibility than consonants for both young normal-hearing and elderly hearing-impaired listeners.
PACS:
43.71.Es Vowel and consonant perception; perception of words, sentences, and fluent speech
43.71.Ky Speech perception by the hearing impaired
43.66.Sr Deafness, audiometry, aging effects

Speech intelligibility in cochlear implant simulations: Effects of carrier type, interfering noise, and subject experience

Nathaniel A. Whitmal, III, Sarah F. Poissant, Richard L. Freyman, and Karen S. Helfer

J. Acoust. Soc. Am. Volume 122, Issue 4, pp. 2376-2388 (2007); (13 pages) | Cited 14 times

Abstract:
Channel vocoders using either tone or band-limited noise carriers have been used in experiments to simulate cochlear implant processing in normal-hearing listeners. Previous results from these experiments have suggested that the two vocoder types produce speech of nearly equal intelligibility in quiet conditions. The purpose of this study was to further compare the performance of tone and noise-band vocoders in both quiet and noisy listening conditions. In each of four experiments, normal-hearing subjects were better able to identify tone-vocoded sentences and vowel-consonant-vowel syllables than noise-vocoded sentences and syllables, both in quiet and in the presence of either speech-spectrum noise or two-talker babble. An analysis of consonant confusions for listening in both quiet and speech-spectrum noise revealed significantly different error patterns that were related to each vocoder’s ability to produce tone or noise output that accurately reflected the consonant’s manner of articulation. Subject experience was also shown to influence intelligibility. Simulations using a computational model of modulation detection suggest that the noise vocoder’s disadvantage is in part due to the intrinsic temporal fluctuations of its carriers, which can interfere with temporal fluctuations that convey speech recognition cues.
PACS:
43.71.Es Vowel and consonant perception; perception of words, sentences, and fluent speech
43.71.Ky Speech perception by the hearing impaired
43.66.Ts Auditory prostheses, hearing aids