Journal of the Acoustical Society of America

Nov 1980

Volume 68, Issue S1, pp. S1-S116

Session BB. Speech Communication V: Speech Perception III
Contributed Papers

Faulty timing degrades speech intelligibility (A)

A. W. F. Huggins

J. Acoust. Soc. Am. Volume 68, Issue S1, pp. S49-S49 (1980); (1 page)

Online Publication Date: 11 Aug 2005

Studies of deaf speech frequently cite faulty timing as a major cause of poor intelligibility. Yet modifying the timing of normal speech seems to have so little effect on its intelligibility that formal quantification is not worthwhile. Usually, however, the person who modifies the timing also knows what the utterance said, and therefore is not forced to rely on what he or she hears to establish its prosodic pattern. When formal tests are run, sentences that appeared perfectly acceptable to the experimenter suddenly become unintelligible to the subjects. An experiment will be reported comparing the intelligibility of monotone speech synthesized by rule under four conditions: (1) segment durations assigned by rule, (2) equal duration for all segments, (3) equal duration for all syllables, and (4) equal duration for all metric feet.

Perceived phonetic distance among a set of synthetic whispered vowels and fricative consonants (A)

Timothy J. McManus and Dennis H. Klatt

J. Acoust. Soc. Am. Volume 68, Issue S1, pp. S49-S49 (1980); (1 page)

Online Publication Date: 11 Aug 2005

Phonetically trained listeners were presented with pairs of 300‐ms synthetic speech‐like sounds and asked to estimate the phonetic distance between the members of each pair. The first sound, the reference, was always the same: a voiceless [ae]. A set of 33 other sounds was synthesized by making small changes to the formant frequencies, bandwidths, or spectral tilt of the reference. Results indicate that phonetic distance judgments are consistent with previous work on voiced vowels [D. H. Klatt, J. Acoust. Soc. Am. 66, S86 (1979)] in showing that small formant frequency changes caused the largest perceived phonetic distances. However, the new results differ in that (1) changes in first formant frequency had little effect (probably because first formant bandwidth increases associated with an open glottis cause a relatively indistinct first formant spectral peak), and (2) spectral tilt changes also influenced distance judgments (probably by turning some stimuli into fricative‐like spectra). In order to check this latter hypothesis, the tape was played to a new group of subjects who were asked to indicate whether each stimulus was a vowel or a fricative. It was found that fricatives are heard when spectral tilt increases, or when third or fourth formant frequency changes result in a high‐frequency energy concentration. [Work supported in part by an NIH grant.]

Perception of the duration of rapid spectrum changes: Evidence for context effects with speech and nonspeech signals (A)

Thomas D. Carrell, David B. Pisoni, and Susan J. Gans

J. Acoust. Soc. Am. Volume 68, Issue S1, pp. S49-S49 (1980); (1 page)

Online Publication Date: 11 Aug 2005

For a number of years investigators have focused attention on the effects one acoustic segment has on the perception of other acoustic segments. In one study, Miller and Liberman [Percept. Psychophys. 25, 457–465 (1979)] reported that overall syllable duration influences the location of the labeling boundary between the stop /b/ and the semivowel /w/. They claim that this “context effect” reflects a form of perceptual normalization whereby the listener somehow readjusts his perceptual apparatus to take account of the difference in rate of articulation of the talker. More recently, Eimas and Miller [Science (in press)] have reported that prelinguistic infants also show similar context effects in discrimination of these stimuli. The inference drawn from the latter study is that infants perceive these speech sounds like adults and the context effects reflect the operation of perceptual mechanisms that underlie a phonetic‐like mode of processing specific to speech. In the present study, we carried out several critical comparisons between speech and nonspeech signals and observed comparable context effects for perception of the duration of rapid spectrum changes as a function of overall duration of the stimulus. Our results with nonspeech signals falsify both of the earlier claims by demonstrating clearly that these sorts of context effects are not peculiar to the perception of speech signals or to normalization of speaking rate. [Work supported by NIMH Research Grant MH‐24027‐06.]

Onset spectra versus formant transitions as cues to place of articulation (A)

Amanda C. Walley and Thomas D. Carrell

J. Acoust. Soc. Am. Volume 68, Issue S1, pp. S49-S50 (1980); (2 pages)

Online Publication Date: 11 Aug 2005

Stevens and Blumstein [J. Acoust. Soc. Am. 67, 648–662 (1980)] have proposed that the shape of the onset spectrum of a CV syllable provides listeners with a primary and contextually invariant cue to place of stop‐consonant articulation. Contextually variable formant transitions, on the other hand, are claimed to constitute only “secondary” cues to place of articulation that are learned through their co‐occurrence with the primary spectral ones. The present experiment assessed this claim about the relative importance of these two cues by obtaining listeners' identifications of synthetic stimuli whose onset spectra were matched to the Stevens and Blumstein spectral templates, but which contained conflicting transition cues to place of articulation. The results indicated that listeners use formant transition information more often than the information contained in the stimulus onset, a result which argues against Stevens and Blumstein's claim that the static onset spectrum is the primary cue to place of articulation. However, because some listeners did consistently use the onset spectra cue for identification of place of articulation, it would seem that several perceptual strategies can be used in phoneme identification. [Work supported by NIMH and Research Council of Canada.]

Some observations on the contribution of formant transitions to the identification of fricative‐vowel and nasal‐vowel syllables (A)

M. F. Dorman and L. J. Raphael

J. Acoust. Soc. Am. Volume 68, Issue S1, pp. S50-S50 (1980); (1 page)

Online Publication Date: 11 Aug 2005

In a series of experiments we have assessed the interaction of friction noises and nasal resonances with formant transitions in determining phonetic identification. When the intensity of /s/‐friction in “sink” is reduced to about −15 dB re: vowel intensity, listeners report “think”. Thus, a fricative noise heard as /s/‐like in isolation is heard as /θ/‐like when followed by a vowel of appropriate amplitude. To determine the interval over which transitions exert a domineering influence, we turned to nasal‐vowel syllables and created stimuli with /m/‐resonances of varying duration followed by formant transitions appropriate for alveolar place. At durations up to 100 ms listeners report /na/, while at longer intervals they report /mna/.

The effects of pitch‐distorted feedback on vocal productions (A)

Jeffrey L. Elman

J. Acoust. Soc. Am. Volume 68, Issue S1, pp. S50-S50 (1980); (1 page)

Online Publication Date: 11 Aug 2005

Previous investigations into the effects of distorted auditory feedback on vocalizations have been limited to manipulations of intensity, background noise, or delay in arrival time. The current study reports the results of real‐time distortions of the frequency components of auditory feedback. Subjectively, such distortion is perceived primarily as an alteration in the fundamental frequency of the speech. Subjects performed a variety of tasks under conditions where they received either normal auditory feedback (NAF) or pitch‐altered feedback (PAF). NAF performance was consistently good; PAF performance was quite poor. Subjects attempted to compensate for the frequency alterations by adjusting their fundamental frequency up or down such that the resulting feedback appeared to be “normal.” [Work supported by NSF.]

Audibility of phase changes in vowel sounds and complex tones (A)

Richard M. Stern and Alexander H. Waibel

J. Acoust. Soc. Am. Volume 68, Issue S1, pp. S50-S50 (1980); (1 page)

Online Publication Date: 11 Aug 2005

This research was motivated by recent observations of a relatively strong perceptual salience of phase relations among harmonics in the vowel sound /ae/ [Carlson et al., J. Acoust. Soc. Am. 65, S6 (1979)]. In classical psychoacoustical studies using complex tones, by contrast, phase contributed relatively weakly to timbre. We investigated the audibility of phase changes in vowels and complex tones using a forced‐choice paradigm in which subjects discriminated between a stimulus with all harmonics in phase and a similar stimulus in which the even harmonics were phase shifted by an amount varied from 0 to 90 degrees. “Phase discrimination thresholds” were obtained by determining the phase shift needed to produce criterion discrimination performance. Our preliminary findings indicate no major differences between the audibility of phase changes for vowel sounds and complex tones. The joint dependence of this audibility on fundamental frequency and number of harmonics is consistent with results in the literature. These findings are also compared to the audibility in vowel sounds of phase shifts as produced by the vocal tract.
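
As a rough illustration of the kind of stimulus pair described in this abstract, the sketch below builds a reference complex tone and a comparison tone whose even harmonics are phase shifted. It is not the authors' synthesis procedure. The function names, the equal harmonic amplitudes, the reading of "in phase" as sine phase, and the values for fundamental frequency, harmonic count, duration, and sample rate are all assumptions chosen for the example.

```python
import math


def complex_tone(f0_hz, n_harmonics, even_phase_deg=0.0,
                 duration_s=0.04, sample_rate_hz=16000):
    """Sum equal-amplitude sine harmonics of f0_hz (assumed stimulus model).

    Odd harmonics start in sine phase; even harmonics are offset by
    even_phase_deg degrees. Output is scaled by 1 / n_harmonics.
    """
    shift_rad = math.radians(even_phase_deg)
    n_samples = int(round(duration_s * sample_rate_hz))
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        value = 0.0
        for k in range(1, n_harmonics + 1):
            phase = shift_rad if k % 2 == 0 else 0.0
            value += math.sin(2.0 * math.pi * k * f0_hz * t + phase)
        samples.append(value / n_harmonics)
    return samples


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Hypothetical stimulus pair: 125-Hz fundamental, 8 harmonics.
reference = complex_tone(125.0, 8)                       # all harmonics in phase
shifted = complex_tone(125.0, 8, even_phase_deg=90.0)    # even harmonics shifted
```

Because only the phases differ, the two signals have the same harmonic amplitudes and the same RMS level over a whole number of periods, while their waveforms differ. Any audible difference between such a pair therefore cannot be attributed to a change in the magnitude spectrum.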

Susceptibility to spread of masking in normal and sensorineural hearing‐impaired listeners (A)

Maureen Hannley

J. Acoust. Soc. Am. Volume 68, Issue S1, pp. S50-S50 (1980); (1 page)

Online Publication Date: 11 Aug 2005

To determine whether upward spread of masking is responsible for failure to identify stop consonant place of articulation, sensorineural hearing‐impaired subjects grouped according to etiological basis of hearing disorder were tested on a speech and a nonspeech task. Two‐formant consonant‐vowel syllables varying along a /b d g/ continuum were presented for identification at moderate and high intensity when formant amplitudes were equal and when F1 amplitude was attenuated by 6, 12, and 18 dB. In the second experiment, noise‐on‐tone masking functions were generated using seven narrow bands of noise at moderate and high intensity. Upward spread of masking could be demonstrated in speech and nonspeech tasks, irrespective of the subjects' age, audiometric configuration, or etiology of hearing impairment. Attenuation of F1 produced varying results on phoneme labelling performance among groups. The outcome of these experiments showed that while listeners with noise‐induced hearing loss showed substantial improvement in identifying place of articulation when F1 was attenuated, upward spread of masking does not consistently account for poor place identification in other types of sensorineural hearing impairment.