Journal of the Acoustical Society of America

Dec 1977

Volume 62, Issue S1, pp. S1-S102

Session HH. Speech Communication VII: Feature Perception and Discrimination
Contributed Papers

Response bias account of selective adaptation (A)

Jeffrey L. Elman

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S76-S77 (1977); (2 pages)

Online Publication Date: 11 Aug 2005

A number of investigators have hypothesized the existence of specialized neurolinguistic mechanisms—feature detectors—as playing a role in the perception of speech. One of the strongest sources of support for feature detectors has come from the selective adaptation paradigm, in which it is supposed that adaptation causes neural fatigue in a feature detector. In this study, a signal detection theory model was developed for phoneme identification tests. Results from identifications of three series of stimuli (/abə‐abə/, /bæ‐dæ/, /bæ‐pæ/) were analyzed using this model to answer two questions: (1) Is increased discriminability of stimuli near the phoneme boundary due to response bias? (2) Does the phoneme boundary shift after adaptation result from criterion movement, or from sensory changes consistent with a feature detector account? Findings indicate that while the phoneme boundary effect does not seem to be due to response bias, selective adaptation is accomplished by a criterion shift, rather than changes at a more basic level of perception. These results suggest that the adaptation effect thus provides no evidence for feature detectors. [Work supported by NSF.]
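The abstract does not spell out the signal detection model, so the following is only a minimal sketch under the standard equal-variance Gaussian assumption, not the author's actual analysis. It illustrates how a criterion shift can be separated from a sensitivity change in identification data. The function name `sdt_measures` and the example rates are hypothetical.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) for one yes/no identification condition.

    Assumes equal-variance Gaussian signal and noise distributions.
    Both rates must lie strictly between 0 and 1, because the inverse
    normal CDF is undefined at the endpoints.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa              # sensitivity
    criterion = -0.5 * (z_hit + z_fa)   # response bias (0 = unbiased)
    return d_prime, criterion


# Hypothetical rates before and after adaptation (illustrative only).
before = sdt_measures(0.84, 0.16)
after = sdt_measures(0.69, 0.07)
print("before: d'=%.2f, c=%.2f" % before)
print("after:  d'=%.2f, c=%.2f" % after)
```

Under these assumptions, a pure criterion shift appears as a change in `criterion` with `d_prime` roughly constant, whereas a sensory change would alter `d_prime` itself.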

Effect of selective adaptation on the discrimination of small differences in voice onset time (VOT) (A)

William F. Ganong, III

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S77-S77 (1977); (1 page)

Online Publication Date: 11 Aug 2005

The effect of selective adaptation (i.e., listening to a rapidly repeated syllable) on the discrimination of voiceless stimuli which differ only in VOT was investigated. In an AX procedure under conditions of minimal stimulus uncertainty, subjects in the unadapted state were able to discriminate (d′=1) small (5 msec) differences in the VOT of stimuli from the voiceless end of a /ba/‐/pa/ continuum. Control experiments showed this discrimination was not based on memorization of the aspiration noise waveform, and that subjects could represent small differences in VOT in long‐term memory. Adaptation depressed discrimination slightly, showing that adaptation can affect a perceptual task which does not depend on phonetic categorization. The data were not precise enough to indicate whether adaptation affected a level sensitive to the acoustic similarity of the adapting and test stimuli or to the relationship of the adapting stimulus to the phonetic category.

Effects of varying total adaptor energy in selective adaptation (A)

Helen J. Simon

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S77-S77 (1977); (1 page)

Online Publication Date: 11 Aug 2005

Two experiments were designed to examine the effects of varying total adaptor energy in selective adaptation along a synthetic stop consonant continuum. Experiment I considered the effect of the number of adaptors presented in the adaptation sequence preceding each test sequence, with repetition rate held constant: Over the range from 8 to 32 adaptors, the magnitude of the phonetic boundary shift was found to be a linear function of the logarithm of the number of adaptors. Experiment II explored the effect of variations in the repetition rate (or density) of adaptors with the number held constant. A nonmonotonic relation was found between the phoneme boundary shift and the interadaptor interval: A greater shift was observed for a 750‐msec interval than for either a 250‐msec interval or a 1750‐msec interval. These low‐level stimulus energy variables (adaptor number and repetition rate) affect the magnitude of the phonetic boundary shift in what may be a trading relationship. [Work supported by NICHD to the Haskins Laboratories.]

Recovery from selective adaptation to a voiceless VOT stimulus (A)

Donald J. Sharf and R. Ohde

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S77-S77 (1977); (1 page)

Online Publication Date: 11 Aug 2005

Recovery from selective adaptation to a voiceless VOT stimulus was determined by having a group of subjects identify stimuli varying in VOT duration as /p/ or /b/ before adaptation, immediately following each of a series of 1 min adaptation trials, and at 1‐, 4‐, and 7‐min intervals following adaptation. Shifts in phonetic boundary loci (in milliseconds) were obtained between preadaptation and adaptation conditions and between adaptation and postadaptation conditions; recovery was measured in percentage of postadaptation shift. Mean recovery was above 50% in the first minute, above 75% at 4 min, and tended to level off from 4 to 7 min. By 7 min, more than half the subjects achieved at least 75% recovery and more than three‐quarters achieved at least 50% recovery. The prevailing recovery patterns for subjects reflected either considerable recovery in the first minute with more gradual recovery thereafter or gradual recovery over the entire time interval. No correlation was found between the amount of boundary shift and the degree of recovery for any of the recovery intervals. For the three time conditions, mean pre‐adaptation boundary loci differed by about 1 msec and mean adaptation shifts differed by about 2 msec.
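The abstract does not give the exact recovery formula, so the following is a minimal sketch under one plausible reading: recovery is the boundary movement back from the adapted locus, expressed as a percentage of the shift that adaptation produced. The function name `percent_recovery` and the boundary values are hypothetical, not data from the study.

```python
def percent_recovery(pre_ms, adapted_ms, post_ms):
    """Percentage of the adaptation-induced boundary shift that has been undone.

    pre_ms:     phonetic boundary before adaptation (msec VOT)
    adapted_ms: boundary immediately after adaptation
    post_ms:    boundary at some interval after adaptation
    Assumed definition: 100 * (adapted - post) / (adapted - pre).
    """
    adaptation_shift = adapted_ms - pre_ms
    if adaptation_shift == 0:
        raise ValueError("no adaptation shift, so recovery is undefined")
    recovery_shift = adapted_ms - post_ms
    return 100.0 * recovery_shift / adaptation_shift


# Hypothetical boundary loci: 25 msec before, 30 msec adapted, 27 msec later.
print(percent_recovery(25.0, 30.0, 27.0))
```

With this definition, 0% means the boundary has stayed at its adapted locus and 100% means it has returned to the preadaptation locus.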

Levels of decision in the perception of voicing contrasts (A)

D. R. Dechovitz and R. Mandler

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S77-S77 (1977); (1 page)

Online Publication Date: 11 Aug 2005

Various types of acoustic cues have been shown to influence the voiced‐voiceless decision for initial prestressed consonants. In the present study, a selective adaptation method was used to investigate possible feature extraction devices underlying the perception of voicing. Two [da‐tha] continua were constructed, each ranging in voice onset time (VOT) from 0 to 55 msec in 5‐msec steps. The total transition duration of each stimulus was 15 msec in one series and 70 msec in the other. Differential effects were obtained for within‐ and cross‐series adaptation along a VOT dimension by presenting an adapting stimulus from each series with the same absolute VOT (25 msec). The present results support the view that models of voicing perception based strictly on absolute VOT detectors are inadequate. Further, they suggest that a sequence of decisions grounded in cue extraction is the basis for the voicing percept.

Rapid versus rabid: A catalogue of acoustic features that may cue the distinction (A)

L. Lisker

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S77-S78 (1977); (2 pages)

Online Publication Date: 11 Aug 2005

In American English, initial /bdg/ often lack the acoustic feature taken as the defining feature of voiced stops; intervocalically before unstressed vowel /ptk/ lack aspiration, without which initial stops are not labeled /ptk/. Initially, the two categories differ in the timing of vocal fold adduction and onset of fold vibration, and several acoustic cues, all tied to the VOT difference, have been studied. Medially there is also a difference in the management of the larynx, though it results in a phonetically simpler contrast, one of voicing with no accompanying aspiration difference. Acoustically, however, the list of features that play, or might plausibly play a role is quite large. The word pair rapid‐rabid, for example, might be affected by the following: (1) presence/absence of low‐frequency buzz during the closure interval, (2) duration of closure, (3) F1 offset frequency before closure, (4) F1 offset transition duration, (5) F1 onset frequency following closure, (6) F1 onset transition duration, (7) [æ] duration, (8) F1 “cut‐back” before closure, (9) F1 cutback following closure, (10) VOT cutback before closure, (11) VOT delay after closure, (12) F0 contour before closure, (13) F0 contour after closure, (14) amplitude of [i] relative to [æ], (15) decay time of glottal signal preceding closure, (16) intensity of burst following closure. Even if some of these should turn out to be perceptually negligible, enough of them surely have cue value to make it a formidable task to justify preferring an acoustic to an articulatory account of the distinction between the two English words. [The support of the National Institute of Child Health and Human Development is gratefully acknowledged.]

Some relations between duration of silence and duration of friction noise as joint cues for fricatives, affricates, and stops (A)

A. M. Liberman, Bruno H. Repp, T. Eccardt, and D. Pesetsky

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S78-S78 (1977); (1 page)

Online Publication Date: 11 Aug 2005

As is well known, introducing a short interval of silence between the words SAY and SHOP causes the listener to hear SAY CHOP. In isolation, SHOP may be turned into CHOP by reducing the duration of the friction noise. Now, varying both cues orthogonally in a sentence context, we find that, within limits, they are perceived in relation to each other: The shorter the duration of the noise, the shorter the silence necessary to convert the fricative into an affricate. On the other hand, when the rate of articulation of the sentence frame is increased while holding noise duration constant, a longer silent interval is needed to hear an affricate, as if the noise duration, but not the silence duration, were subjectively longer in the faster sentence. In a separate experiment, varying noise and silence durations in GRAY SHIP, we find that, given sufficient silence, listeners report GRAY CHIP when the noise is short, but GREAT SHIP when it is long. Thus, the long noise in the second syllable disposes the listeners to displace the stop to the first syllable, so that they hear not a syllable‐initial affricate (i.e., stop‐initiated fricative) but a syllable‐final stop (followed by a syllable‐initial fricative). [Work supported by NICHD.]

Role of stop consonants in word recognition (A)

J. Jakimik and R. A. Cole

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S78-S78 (1977); (1 page)

Online Publication Date: 11 Aug 2005

The perceptibility of various phonetic features during word recognition was assessed by having listeners detect systematic mispronunciations of words in a short story. Mispronunciations were produced by changing a single‐consonant segment to produce a nonsense word (e.g., “boy” to “poy”). The rationale behind the experiments was that, in order to perceive a word as mispronounced, the listener must make a phonetic discrimination. Therefore, the listening for mispronunciation task examines the use of phonetic information during the process of recognizing words from fluent speech. The results of six experiments revealed that changes involving prestressed word‐initial‐stop consonants were detected more often than changes involving any other consonant. Thus, voicing changes were better detected in stops than in fricatives (70% versus 38%). Changes in manner of articulation were also better detected in stops than in fricatives (84% versus 53%). Changes in place of articulation in stops were better detected than place changes in nasals (77% versus 64%) and voicing changes in stops were better detected than place changes in nasals (75% versus 69%). Voicing and place changes in stops were consistently well detected (at least 70%) over three speakers differing in speech style and rate. The results showed that phonetic features are more perceptible in stops than in other consonants in fluent speech. The results suggest that stop consonants in word‐initial position provide the most important and reliable phonetic information about a word's identity in fluent speech. They therefore play a special role in the word recognition process.

Onset spectra and stop consonant recognition (A)

M. F. Dorman and Lawrence J. Raphael

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S78-S78 (1977); (1 page)

Online Publication Date: 11 Aug 2005

In a series of experiments, we presented listeners signals consisting of an onset spectrum appropriate for one place of articulation followed at silent intervals from 0 to 150 msec by transition cues appropriate for a different place of articulation. As Fisher‐Jørgensen [Ann. Rep. Inst. Phonetics (1972)] has reported for a similar transposition experiment, the onset spectrum determines place identification in some vocalic environments but not others. When the onset spectrum does determine place judgments it “overrides” the place of articulation signaled by the transition cues over silent intervals of up to 60 msec. [Research supported by NICH HD‐01994.]

Some context effects on voice onset time (A)

Gary Weismer

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S78-S78 (1977); (1 page)

Online Publication Date: 11 Aug 2005

Systematic variance in measures of voice onset time (VOT) has been attributed primarily to place of articulation and position of stress relative to the target segment. Although data are available regarding other influences (e.g., rate) on VOT, we are unaware of any information regarding effects of phonetic contexts on this measure of glottal‐supraglottal timing. As a result of post hoc observations from a previous study [Klee, Weismer, and Ingrisano, J. Acoust. Soc. Am. 60, S63(A) (1976)] concerning context effects on VOT, an experiment was designed in which fluctuations in VOT of the first consonant in CVC target words could be observed as we systematically varied the tense/lax and voicing features of the interconsonantal vowel and final consonant, respectively. Our results indicate that, for the initial consonant in a CVC target word, (1) longer VOT's are associated with tense, as compared with lax, interconsonantal vowels, (2) longer VOT's are associated with voiced, as compared with voiceless, final consonants, and (3) the interactive effect of interconsonantal vowel and final consonant features on VOT appears to be extremely complex.

Phonetic context effects on the identification of stop consonants (A)

Randy L. Diehl, Jeffrey L. Elman, and Susan Buchwald

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S78-S79 (1977); (2 pages)

Online Publication Date: 11 Aug 2005

Changes in the identification of speech sounds following selective adaptation are usually attributed to a reduction in sensitivity of auditory feature detectors. An alternative explanation of these effects is based on the notion of response contrast. In several experiments, subjects identified the initial segment of synthetic CV syllables as either the voiced stop /b/ or the voiceless stop /p/. Each test syllable had a value of VOT which placed it near the English voiced‐voiceless boundary. When the test syllables were preceded by a single clear /b/ (VOT=−100 msec), subjects tended to identify them as /p/, whereas when they were preceded by an unambiguous /p/ (VOT=+100 msec), the syllables were predominantly labeled /b/. This contrast effect occurred even when the contextual stimuli were velar and the test stimuli were bilabial, suggesting a featural rather than a phonemic basis for the effect. To discount the possibility that these might be instances of single‐trial sensory adaptation, we conducted a similar experiment in which the contextual stimuli followed the test items. Reliable contrast effects were still obtained. In view of these results, it appears likely that response contrast accounts for at least some component of the adaptation effects reported in the literature.

“Cross talk” between voicing and place cues in initial stops (A)

Bruno H. Repp

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S79-S79 (1977); (1 page)

Online Publication Date: 11 Aug 2005

The dependence of the voicing boundary on place cues was investigated by orthogonally varying the F2 and F3 transition onset frequencies of syllable‐initial stop consonants, as well as their voice onset time (VOT). There is evidence for changes in the voicing boundary tied to the perceived place category, but there are also reliable changes within place categories. The latter are often irregular and subject to large individual differences. A much more consistent finding is the dependency of the place boundaries on VOT: The labial‐alveolar and alveolar‐velar boundaries converge as VOT increases, resulting in a reduction of the size of the alveolar category. This effect appears to be a continuous function of VOT, i.e., it does not directly depend on the perceived voicing category. [Work supported by NICHD.]

Patterns of response to within‐category acoustic variation in dichotically presented stimuli (A)

Edward Carney and Charles Speaks

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S79-S79 (1977); (1 page)

Online Publication Date: 11 Aug 2005

As in a previous experiment [Carney and Speaks, J. Acoust. Soc. Am. 59, S7(A) (1976)], two sets of tapes for dichotic listening were prepared from synthetic stop‐vowel stimuli which varied on a voice onset time (VOT) continuum. Each set contained the same three voiced syllables (ba, da, ga) in combination with one of two sets of voiceless syllables which differed in VOT by 30 msec from set I to set II. The two sets of voiceless syllables differed only in acoustic structure, not in subjects' identification. Six stimulus‐onset asynchronies were used for each of eight subjects, and, in the present experiment, the syllables were aligned by onset of their consonantal portions. Analysis of errors revealed that subjects used V features to construct “blend” error responses significantly more often in set II (longer VOT) than in set I. Results are discussed with reference to “graded feature detectors” [Miller, Percept. Psychophys. 18, 389–397 (1975)] and an alternative “decision” model. [Work supported by USPHS Grant No. NS‐12125.]

Phoneme recognition system based on auditory perception (A)

C. L. Searle, J. Z. Jacobson, S. G. Rayment, and D. Dockendorf

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S79-S79 (1977); (1 page)

Online Publication Date: 11 Aug 2005

A phoneme recognizer for stop consonants has been constructed using a system design derived from studies of auditory physiology and psychophysics. The system consists of a case ⅓‐octave filter bank as a reasonable approximation to auditory tuning curves and critical bands, a bank of high speed, wide dynamic range envelope detectors, a logarithmic amplifier, and a digital computer for analysis and display. The detector outputs can be displayed as a type of “spectrum” which we believe is similar in information content to the signals passing along the eighth nerve from the cochlea to the brain. Certain features of these measured “spectra” are then abstracted, such as voice onset time and spectral peaks. The features are chosen to correspond to probable features in the human auditory system, as revealed by psychophysical experiments. These features are then analyzed by methods of statistical decision theory (specifically discriminant analysis) to decide on the most probable phoneme. Examples of running spectra (the equivalent of spectrograms in this analysis method) and results of discrimination trials will be presented.

Categorical perception of /b/ and /w/ during changes in rate of utterance (A)

F. D. Minifie, P. K. Kuhl, and E. M. Stecher

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S79-S79 (1977); (1 page)

Online Publication Date: 11 Aug 2005

Normal sentences containing the sound /w/ were tape recorded and digitized. A normal and a “fast” token of each sentence were stored in a PDP 11‐10 computer. The “fast” tokens were created via a pitch‐synchronous segmentation program wherein segmentation lines were interactively applied at the zero axis crossing at the beginning of each periodic waveform, and at similar periodic intervals during voiceless speech sounds. Deletion of every third interval resulted in a token of the sentence increased in rate by one‐third. The test consonants plus following vocalic transitions were similarly segmented (11 segments approximately 9 msec each). A recursively applied deletion program allowed alteration of the slope of the CV transitions in 11 steps in both normal and fast utterances. Listeners randomly heard ten presentations of each slope condition for each test sentence at each rate of utterance. Rapid slope changes were heard by listeners as /b/ and more gradual slope changes as /w/. During changes in rate of utterance, the categorical boundary between /b/ and /w/ was shifted toward a steeper slope. The results of this study are discussed in regard to the temporal relativity of categorical boundaries.

Acoustic memory effects for consonants with lengthened transitions (A)

R. Mandler and D. Dechovitz

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S79-S79 (1977); (1 page)

Online Publication Date: 11 Aug 2005

Differences in acoustic memory effects between consonants and vowels have led to the assumption that transience of the acoustic signal may affect persistence in immediate memory [Crowder, J. Verb. Learn. Verb. Beh. 10, 587–596 (1971); Crowder, J. Exp. Psychol. 98, 14–24 (1973)]. The present study was undertaken to determine whether less transient consonants are more persistent than natural stop consonants. Stop stimuli with lengthened F2 and F3 transitions employed in a perceptual study reported on at the last meeting of this society were utilized. Data resulting from stimulus suffix and recency effect tests indicate that these stimuli give results characteristic of short‐transitioned stops. Thus, transience is not a factor in acoustic storage.

Room reverberation effects on recognition of some consonant features (A)

S. A. Gelfand

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S79-S80 (1977); (2 pages)

Online Publication Date: 11 Aug 2005

The effects of real room reverberation (T=0.75 sec) on speech sound recognition were investigated for young, normal‐hearing subjects. Initial versus final and quiet versus reverberant results were significantly different. Place information was most affected, followed by frication. Stop and nasal information were next affected. Voicing was quite robust; and duration, sibilance, and liquid information were virtually unaffected. Several interesting confusion patterns emerged. Except for the relatively high susceptibility of frication, the effects of reverberation on speech recognition seem somewhat similar to those reported for masking and low‐pass filtering. Reverberation appears to result in temporal smearing of the speech signal, so as to act as a speech‐shaped masking noise. Also, short‐duration events appear lengthened, and seem to be associated with stops being reported as fricatives; and relatively long temporal spreading of the fundamental and lower formants appears to be associated with stops being reported as nasals. Comparison of the results with previous data obtained with artificially induced reverberation revealed some qualitatively and quantitatively different effects.

Symmetry in the direction of substitution for segmental speech errors (A)

Stefanie Shattuck‐Hufnagel and Dennis H. Klatt

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S80-S80 (1977); (1 page)

Online Publication Date: 11 Aug 2005

A model of the utterance production process is proposed in which segmental substitution errors arise from confusion between two simultaneously available and similarly represented planning segments. This model predicts that each segment will serve equally often as the intended target and as the intrusion in errors. Further, the relative frequency that intended segment type x is replaced by intrusion segment type y should be the same as the relative frequency that y is replaced by x, for all xy of the confusion matrix. These predictions are confirmed by the pattern of substitutions for 23 consonantal segments in two independently collected corpora of 1351 and 1471 errors. A small number of consistent exceptions are discussed in light of a palatalization process that occurs in spoken English. One interesting implication of these observations is that “stronger” segments do not tend to replace “weaker” segments in speech errors, no matter what the definition of strength. [Work supported in part by an NIH grant.]