Journal of the Acoustical Society of America

May 1990

Volume 87, Issue S1, pp. S1-S164

Session S. Speech Communication III: Tones/Sinusoids for Speech or Pitch Perception
Contributed Papers

Information for Mandarin tones in the amplitude contour and in brief segments (A)

Yi Xu and D. H. Whalen

J. Acoust. Soc. Am. Volume 87, Issue S1, pp. S46-S46 (1990); (1 page)

Online Publication Date: 13 Aug 2005

The four tones of Mandarin Chinese are primarily realized by changes in fundamental frequency (F0) but duration and amplitude vary systematically as well. Whether the amplitude contour was sufficient for tone perception was examined by using signal‐correlated noise stimuli (which retain only duration and amplitude) of the tones in /i/, each combination resulting in a word. Five versions of each tone were used, with both typical and atypical durations. Duration did not greatly affect the judgments, while the tones with similar amplitude contours were mostly heard either correctly or as the other tone (1 and 4 flat or with early peaks; 2 and 3 with late peaks). Additionally, the time course of the F0 information was examined by presenting brief segments (40, 60, 80, and 100 ms) from several locations in /i/. The level F0 of tone 1, and segments that happened to have a level F0, primarily elicited tone 1 judgments. Most segments with movement in F0 were heard correctly, though tone 3 often elicited tone 2 judgments. This confirms the importance of F0 change and allows us to estimate where in the syllable that change becomes perceptible. [Work supported by NIH Grant HD‐01994.]
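The signal‐correlated noise mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common construction in which the sign of each sample is flipped at random, so that the amplitude envelope and duration are retained while the F0 and spectral detail are lost. The abstract does not say how the stimuli were actually generated; the function name, the seed, and the toy 200‐Hz input are hypothetical.

```python
import math
import random


def signal_correlated_noise(samples, seed=0):
    """Randomly flip the sign of each sample (assumed construction).

    Every output sample keeps the magnitude of the input sample, so the
    amplitude envelope and the duration are unchanged, while periodicity
    (and hence F0) is destroyed.
    """
    rng = random.Random(seed)
    return [s if rng.random() < 0.5 else -s for s in samples]


# Hypothetical toy input: a 200-Hz sinusoid with a rising amplitude ramp,
# 100 ms long at an 8-kHz sampling rate.
SAMPLE_RATE = 8000
tone = [
    math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) * (n / 800)
    for n in range(800)
]
noise = signal_correlated_noise(tone)
```

Because only the signs change, the sample‐by‐sample magnitudes of `noise` match those of `tone`, which is the property that leaves amplitude contour and duration as the only remaining cues.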

Fundamental frequency and pitch: A perceptual study of Mandarin tonal coarticulation (A)

Xiaonan Susan Shen and Maocan Lin

J. Acoust. Soc. Am. Volume 87, Issue S1, pp. S46-S46 (1990); (1 page)

Online Publication Date: 13 Aug 2005

Rose [Pacific Ling. 55–82 (1989)] claimed that there is a nonequivalence between F0 and pitch, because native listeners were unable to perceive the tonal transient caused by tonal coarticulation. To test his claim, the stimulus consisted of 48 middle tones in Mandarin tritonemes: six syllables [ma, na, la, ai, ao, aŋ] of the high‐level tone preceded by the high‐level tone (L tones) and the falling tone (R tones) and followed solely by the high‐level tone over the syllable [pa] were spoken by two native Mandarin speakers, cross‐spliced by computer, and copied twice into a test tape in random order. The 24 R tones had F0 increments beyond the perceptual threshold while the 24 L tones did not. Sixteen Chinese and ten American listeners judged, through binary forced choice, whether the test tones were rising or level. The responses of the Chinese (z = 2.82, p < 0.01) and American listeners (z = 1.91, p < 0.05) were at 83.85% and 71.25% correct levels, respectively. However, Americans were less sensitive than Chinese listeners [t(24) = 10.25, p < 0.001]. The results refute Rose's claim. [Work supported by TAMU.]

Integration of segmental and tonal information in speech perception: A cross‐linguistic study (A)

Bruno H. Repp and Hwei‐Bing Lin

J. Acoust. Soc. Am. Volume 87, Issue S1, pp. S46-S46 (1990); (1 page)

Online Publication Date: 13 Aug 2005

For speakers of a tone language, a close functional association exists between segmental structure and F0 contour (i.e., tone) in speech because both dimensions are needed to identify words. Using the speeded classification paradigm, which does not require lexical access, the hypothesis that segmental and tonal dimensions are perceptually more strongly integrated for speakers of a tone language (Mandarin Chinese) than for speakers of a nontone language (English) was examined. In four classification tasks, requiring attention to one dimension (either segmental or tonal) of CV syllables while ignoring the other, both subject groups showed strong interference from orthogonal variation in the unattended dimension. The Chinese subjects showed significantly more interference than the English subjects in only one of the four tasks (vowel classification with irrelevant tonal variation). These findings thus provide only weak evidence of differences between Chinese and English speakers in the perceptual integrality of segments and tones. [Work supported by NICHD.]

High tone raising in Yorùbá (A)

Yétúndé Laniran

J. Acoust. Soc. Am. Volume 87, Issue S1, pp. S46-S46 (1990); (1 page)

Online Publication Date: 13 Aug 2005

This paper will discuss factors that influence the assignment of pitch values to the H(igh) tone in Yorùbá declarative sentences. Two sources for tone raising, high raising and focus/prominence, are discussed. High raising is the phenomenon by which a H tone is raised by either a following or a preceding L tone, or both. Furthermore, the amount by which the H tone is raised depends on its position in the sentence. One of the factors that determines the pitch of the H tone is a phenomenon termed the “startup effect,” which causes tones in utterance‐initial positions to be realized at a higher pitch than similar tones later in the utterance. The other factor is tone type, that is, a preceding L tone increases the pitch of the following H tone, but a M tone causes the pitch of a following H tone not to be as high as when the H tone is preceded by a L. The effect of a boundary L% tone on a following H tone will also be discussed. Implications of tone raising on the perception of the tones by speakers of the language will also be discussed. For example, H‐tone raising (in the various contexts discussed above) seems to be one of the strategies employed in Yorùbá to distinguish between LH and LM sequences.

Context effects as a function of perceptual set (A)

James V. Ralston and Keith Johnson

J. Acoust. Soc. Am. Volume 87, Issue S1, pp. S46-S47 (1990); (2 pages)

Online Publication Date: 13 Aug 2005

There has been continuing debate regarding claims that human listeners process tonal stimuli differently as a function of perceptual set. To address this issue, the identification of sinusoidal stimuli in subjects reporting either speech or nonspeech percepts was examined. One set of stimuli was constructed from sinusoids modeled after the formants present in a [wi‐ju] continuum. Subjects reliably classified stimuli with two response categories, regardless of their perceptual set. Probit analysis revealed steeper slopes for speech labeling functions. Item‐by‐item analysis revealed larger effects of stimulus context for nonspeech listeners. Finally, increases in response latencies were larger for speech listeners for stimuli near the labeling category boundary. The major trends observed with the first set of stimuli were also obtained with smaller magnitudes with a second, simpler set of stimuli ([æ‐uh]). Taken together, these results suggest that speech listeners process sinusoidal stimuli in a qualitatively different manner than nonspeech listeners. Speech listeners appear to weight stable internal criteria more heavily than nonspeech listeners, who weight stimulus context more heavily.

Preliminary experiments manipulating elementary waveform parameters (A)

Lori Lamel and Maxine Eskenazi

J. Acoust. Soc. Am. Volume 87, Issue S1, pp. S47-S47 (1990); (1 page)

Online Publication Date: 13 Aug 2005

Recently, short‐time waveform analysis has been used to analyze, manipulate, and synthesize speech [e.g., Lienard, ICASSP‐87]. Each waveform is described by six parameters: envelope attack and decay, reference instant, energy, internal frequency, and phase. In order to better understand the parameters, and to determine which carry pitch information in waveform‐analyzed speech, perceptual experiments were performed using synthetic stimuli of 300 ms (formed by repeating 10‐ms waveforms). The first experiments, with ten normal‐hearing subjects, verified that the parameters affecting the perceived pitch were internal frequency, offset (repetition interval), and change in phase in successive waveforms; the remaining parameters primarily affected timbre. Next, the parameters found to affect pitch were varied individually and jointly to explore their interaction. These experiments showed that both frequency and offset could dominate perceived pitch. In the region explored, 250–500 Hz (ABX test), when either frequency or offset was 500 Hz, variation of the other parameter did not change the response from 500 Hz. However, when either parameter was fixed at 250 Hz, varying the other changed the pitch percept. Varying the two parameters together, the response changed from 250 to 500 Hz at the offset frequency closest to the perceptual midpoint between the two references, even though the internal frequency was higher. Finally, the parameters were varied in opposition in order to determine whether a particular parameter dominates.

On the relationship between vocal sound‐pressure level and the magnitude of jitter and shimmer in the normal voice (A)

Robert F. Orlikoff and Joel C. Kahane

J. Acoust. Soc. Am. Volume 87, Issue S1, pp. S47-S47 (1990); (1 page)

Online Publication Date: 13 Aug 2005

The relationship between vocal sound‐pressure level (SPL) and the magnitude of cycle‐to‐cycle fundamental frequency (jitter) and amplitude (shimmer) perturbation measured in the voice was examined. Ten normal adult men prolonged the vowel /ɑ/ in a modal register phonation maintained within three SPL ranges: 60–68 dB (“soft”), 70–78 dB (“moderate”), and 80–88 dB (“loud”). Statistically significant differences were found between measured mean and percent jitter and shimmer and the sound‐pressure level of the subjects' voices, such that the degree of perturbation was inversely related to the acoustic amplitude of the vowel. These results indicate that mean phonatory frequency and intensity must be controlled or accounted for to maintain the validity and reliability of voice perturbation measurement, especially when such measures are intended as aids to the detection and discrimination of vocal pathology.
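As a rough illustration of the mean and percent perturbation measures named in this abstract, the sketch below uses one widely used definition: the mean absolute difference between consecutive cycles, optionally expressed as a percentage of the mean value. The abstract does not give the authors' exact formulas, so the definitions, the function names, and the four‐cycle toy data are assumptions.

```python
def mean_perturbation(values):
    """Mean absolute difference between consecutive cycles.

    Applied to cycle periods this is a mean jitter; applied to cycle
    peak amplitudes it is a mean shimmer (assumed definitions).
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)


def percent_perturbation(values):
    """Mean perturbation as a percentage of the mean cycle value."""
    mean_value = sum(values) / len(values)
    return 100.0 * mean_perturbation(values) / mean_value


# Hypothetical measurements from four consecutive glottal cycles.
periods_ms = [10.0, 10.1, 9.9, 10.0]   # cycle durations, for jitter
peak_amps = [1.0, 0.98, 1.02, 1.0]     # cycle peak amplitudes, for shimmer

mean_jitter_ms = mean_perturbation(periods_ms)
percent_jitter = percent_perturbation(periods_ms)
percent_shimmer = percent_perturbation(peak_amps)
```

Computing these values separately for recordings in each SPL range would allow the kind of comparison across "soft", "moderate", and "loud" phonation that the abstract describes.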