Journal of the Acoustical Society of America

May 1978

Volume 63, Issue S1, pp. S1-S87

Session G. Speech Communication II: Feature Perception and Discrimination
Contributed Papers

The perception of voice onset time in Polish (A)

Michael Mikoś, Patricia Keating, and Barbara Moslin

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S19-S19 (1978); (1 page)

Online Publication Date: 11 Aug 2005

Polish is traditionally described as using the prevoiced and short‐lag categories to contrast its voiced and voiceless stops. However, Moslin and Keating [J. Acoust. Soc. Am. 62, S27 (A) (1977)] have shown that some speakers of Polish make use of the long‐lag, aspirated voicing category. Preliminary results for six speakers of Polish on a /da/‐/ta/ continuum with VOT from −20 to +80 ms indicate that all speakers, regardless of how they produce their apical stops, show a labeling boundary and discrimination peak at about 35 ms. That is, speakers who rarely if ever produce long‐lag stops themselves place their category boundary between the short‐ and long‐lag regions, thus showing a dissociation between production and perception. Further work will vary the test conditions and/or subjects' response categories to determine if a boundary exists between the prevoiced and short‐lag regions of the continuum.

Discrimination of subphonemic phonetic distinctions (A)

S. L. Donald

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S19-S19 (1978); (1 page)

Online Publication Date: 11 Aug 2005

Seven native speakers of Thai and seven native speakers of English took part in a set of discrimination experiments. The Thai‐speaking subjects took part in discrimination tests of both labial and velar stimuli which varied along voice onset time continua. The English‐speaking subjects took part only in the velar discrimination test. The Thai language makes phonemic distinctions between voiced and voiceless unaspirated stops at the labial and dental places of articulation. However, Thai does not make a distinction between voiced and voiceless unaspirated velars. English does not make this distinction at any place of articulation. Therefore, the Thai velar discrimination functions can be compared both with discrimination functions in which a phonemic distinction is made between voiced and voiceless unaspirated stimuli and also with the English‐speaking subjects' discrimination functions where no such phonemic distinction exists in the language. The English‐speaking subjects' discrimination functions are characterized by a single peak spanning the phoneme boundary. The Thai‐speaking subjects' labial discrimination functions are characterized by a large peak spanning each phoneme boundary. Their velar discrimination functions are characterized by a large peak spanning the phoneme boundary and a smaller peak spanning the subphonemic but systematically relevant phonetic boundary.

Locus of adaptation effects for the voicing feature (A)

Nancy Niccum and Charles Speaks

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S19-S19 (1978); (1 page)

Online Publication Date: 11 Aug 2005

Eight listeners participated in two experiments. The first tested for simultaneous dichotic adaptation by obtaining monaural identification functions before and after adaptation with /ba/ and /pa/ presented dichotically. The difference between the number of /ba/ responses for left‐ and right‐ear postadaptation identification was significant. The second experiment assessed the magnitude of interaural transfer. Adaptation and identification stimuli were presented monaurally and identification shifts for the unadapted ear were compared with those for the adapted ear. When /ba/ was the adapting stimulus, only 62%–65% transfer was obtained, but the difference between ears was nonsignificant. When /pa/ was the adapting stimulus, 51%–52% transfer was obtained and the difference between ears was significant. The failure to reach significance for the /ba/ conditions may relate to the smaller absolute size of shifts in identification. In general, the results of both experiments provided evidence for a bilateral component, either peripheral or central, to the adaptation effects for voicing. These findings are contradictory to those reported by Eimas et al. [Percept. Psychophys. 13, 247–252 (1973)]. [Supported by PHS NS‐12125.]

Learning and generalization of intraphonemic VOT discrimination (A)

Thomas R. Edman, Sigfrid D. Soli, and Gregory P. Widin

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S19-S19 (1978); (1 page)

Online Publication Date: 11 Aug 2005

A recent study [A. E. Carney, G. P. Widin, and N. F. Viemeister, J. Acoust. Soc. Am. 62, 961–970 (1977)] has shown that with appropriate training listeners can consistently discriminate intraphonemic differences in a synthetic bilabial VOT stop consonant series. We examined whether listeners trained to discriminate intraphonemic differences on one VOT series can also make intraphonemic discriminations on a different VOT series. Identification and discrimination data were initially obtained from a group of six listeners for both a bilabial and a velar VOT stop consonant series. Half the listeners then received discrimination training on the bilabial series and half received training on the velar series. Discrimination and identification data were subsequently obtained from all listeners for both series. Improved final discrimination on the nontraining series indicates that listeners had learned to discriminate VOT per se, while a failure to improve discrimination on the nontraining series indicates that specific properties of the training series had been learned. [Supported by NIMH, NICHD.]

Right‐ear advantage for voicing in subjects left‐ear dominant for pure tones (A)

Pierre L. Divenyi and Robert Efron

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S19-S20 (1978); (2 pages)

Online Publication Date: 11 Aug 2005

It has been reported [Efron and Yund, J. Acoust. Soc. Am. 59, 889–898 (1976)] that the relative salience of the two pitch components of a dichotically presented pair of tones is seldom equal: The pitch mixture is dominated by the right‐ear tone for some listeners and by the left‐ear tone for an equal number of other listeners. This spectrally related perceptual asymmetry was called ear dominance (ED) and was found to be independent of the well‐known right‐ear advantage (REA) observed for dichotically presented speech sounds. The present experiments attempted to isolate those components of REA which may derive from ED. Five subjects, left‐ear dominant for tones, were tested for their ear advantage for pairs of dichotic speech sounds in a 2AFC paradigm. The synthesized stimuli were vowel pairs (differing in tongue height and/or front‐back position) or CV pairs (differing either in voicing or in place of articulation). One subject displayed a left‐ear advantage for all stimuli. The four other subjects retained their left ED for all features that represent only a spectral difference (vowel features and place of articulation), whereas they acquired a REA for voicing, i.e., for the feature which is primarily temporal. [Supported by the VA.]

Identification of apical stops as a function of instructions to perceive a preceding stimulus as either a speech or nonspeech sound (A)

Debra A. Moroff and J. F. Curtis

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S20-S20 (1978); (1 page)

Online Publication Date: 11 Aug 2005

This study investigated the influence of an ambiguous stimulus on the identification of a series of synthetically produced syllables having voice onset time and aspiration characteristics which varied between those appropriate for /da/ and /ta/. One group of 12 subjects was instructed that a 70‐ms high‐frequency noise burst which preceded the syllable by 40 ms represented the consonant /s/. A second group of 12 subjects was told that the noise burst represented a nonspeech noise. A third control group received no biasing instructions. Subjects instructed to perceive the ambiguous stimulus as /s/ identified the following stop as /t/ more often than those in the nonspeech noise group. Mean reaction time for identification of /t/ was longer than for /d/ in the context of the noise burst for both the speech and control groups, but not for the noise group. The results suggest that phoneme categorization is not based on acoustic properties alone. Consequences for a feature‐detector model of speech perception are discussed.

A word advantage in phoneme boundary experiments (A)

William F. Ganong, III

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S20-S20 (1978); (1 page)

Online Publication Date: 11 Aug 2005

This study investigated one aspect of the way people integrate acoustic information derived from the speech signal with linguistic knowledge. Acoustic continua between words and nonwords were constructed, and subjects' phonetic categorizations of the stimuli were examined for a bias toward phonetic renderings which make words. Seven pairs of voice onset time continua were synthesized. For one continuum of each pair, the voiced, but not the voiceless end of the continuum was a word (e.g., “gift‐kift”), and vice versa for the other continuum (e.g., “giss‐kiss”). First, these stimuli were presented in random order for phonetic categorization of the first segment of each stimulus. Those subjects who correctly identified the endpoints of the continua in a second whole‐syllable identification condition showed, in the first condition, a clear and consistent bias toward phonetic categorizations which made words. All 15 of the subjects and all seven of the continua showed this bias. More detailed examination of the phonetic identification curves shows that this bias must arise before phonetic categorization.

On buzzing the English /b/ (A)

L. Lisker

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S20-S20 (1978); (1 page)

Online Publication Date: 11 Aug 2005

The status of closure voicing as a necessary and/or sufficient property of the /bdg/ phonemes of American English has not yet been conclusively determined. Initially it is well established that glottal pulsing before release is not an essential property of the class; in fact it is quite normal for /bdg/ in this context to be voiceless stops, technically speaking. Medially and finally, too, the role of glottal pulsing during closure is not entirely clear, perhaps because discussion has more often centered on the role of closure duration and the duration of a preceding vowel as determinants of stop labeling behavior. Evidence from experiments in the perception of edited natural speech indicates that the presence of closure buzz is a strong cue to /bdg/ in medial position, but that its absence does not invariably trigger “ptk” responses. For a word token in which presence/absence of closure buzz produced a shift in phoneme labeling, the effect of varying the intensity and within‐closure duration of the buzz was determined. Results suggest that closure buzz must be attenuated at least 12 dB for it to be no longer a decisive cue to /bdg/, but that at a naturally produced intensity it may fill as much as two‐thirds the duration of a long closure (140 ms) without eliciting predominantly “bdg” responses. [Work supported by NICHD.]

Effects of word‐internal versus word‐external tempo on the voicing boundary for medial stop closure (A)

Robert F. Port

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S20-S20 (1978); (1 page)

Online Publication Date: 11 Aug 2005

It is known that the perceived voicing of the medial stop in words like rabid and rapid can be controlled by changing the duration of the stop closure. We found [R. Port, J. Acoust. Soc. Am. 59, S41–S42(A) (1976)] that increasing the tempo of a preceding carrier sentence shortened the boundary between rabid and rapid along a continuum of silent closure durations, but by far less than the percent decrease in sentence duration. We hypothesized that timing in unaltered portions of the test word reduced the effect of the carrier tempo. This experiment directly compares the effect on this boundary of the tempo of a surrounding carrier when the tempo of the test word itself is changed. A speaker recorded “I'm trying to say rabid to you” at both fast and slow tempos. Rabid was excised from each, and two continua of silent /b/ closures were prepared (from 50 to 200 ms) and embedded in both original sentences. Listeners identified the test words as either rabid or rapid. Results indicate (1) tempo within the test word had a stronger effect on the boundary than tempo in the carrier, and (2) for a given tempo of carrier, the ratio of the boundary value of closure duration to the duration of the rab syllable was nearly constant for both rab durations.

Some experiments on the recognition of voiced stop consonants in CV and VCV syllables (A)

Lawrence J. Raphael and Michael F. Dorman

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S20-S20 (1978); (1 page)

Online Publication Date: 11 Aug 2005

When listeners are asked to identify CVs in which the burst and transitions signal different places of articulation, the majority of responses are determined by the transition cues. In the present set of experiments we focus on those situations in which the burst dominates in perception. Even in these situations, we have found that the burst cue can be “overridden” by transition cues if VCVs are created in which the implosive and explosive transitions cue the same place. We infer from this outcome that the decision about place of articulation is not made solely on the basis of the shape of the burst (onset) spectrum, but, rather, is made on the basis of spectral information which both precedes and follows the onset.

Influence of spectral and temporal properties of vocalic environment on silence as a cue distinguishing single intervocalic stop consonants from stop clusters and geminates (A)

Bruno H. Repp

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S20-S20 (1978); (1 page)

Online Publication Date: 11 Aug 2005

The amount of intervocalic silence needed to perceive two different stop consonants in stimuli of the type /V1b−gV2/ (type 1 boundary), and two identical stops in stimuli of the type /V1b−bV2/ (type 2 boundary), was determined as a function of different vowel contexts (V1, V2 = /a/, /i/, /u/) and of different durations of the initial and final vowels (120, 180, and 240 ms). It was predicted that changes in the vowels, and thus in the extent of the formant transitions into and/or out of the closure period, would affect type 1 boundaries more than type 2 boundaries. On the other hand, changes in vowel duration, which effectively change the perceived speaking rate, were expected to affect type 2 boundaries more than type 1 boundaries. Preliminary data tend to confirm these hypotheses, and thus support the view that type 1 and type 2 boundaries reflect temporal integration at different levels of processing. [Work supported by NICHD.]

Formant transition place cues of intervocalic fricatives (A)

George P. McCasland

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S21-S21 (1978); (1 page)

Online Publication Date: 11 Aug 2005

The noise of intervocalic /f,θ,s,ʃ/ in recorded spoken words having stressed initial vowels of similar intensity was replaced by splicing‐in /s/ or /ʃ/ noise at various intensities equal to and below word level. Low‐level noise, a segment of blank tape, yields stop impressions. In some words, the place cue of the transitions as evidenced by the stop differs from the place impression of the fricative: husi, pithy, sissie yield /p/; mushy, fishy yield /t/. At −24‐dB medium strength or stronger, /s/ noise overrides the labial transition cues, and /ʃ/ noise overrides the alveolar transition cues, producing /s/ and /ʃ/ impressions, respectively. In another group of words, the transition place cue agrees with the place impression of the natural fricative: husu, lucite, hasa yield /t/; huffy yields /p/. These transition cues are 24 dB stronger in tradeoffs against conflicting noise cues than those of the first group of words: The alveolar transition cues override 0‐dB word level strength /ʃ/ noise, producing an /s/ impression; the labial cue overrides the same level /s/ noise, producing an /f/ impression.

Some observations on how the perception of syllable‐initial [b] versus [w] is affected by the remainder of the syllable (A)

Joanne L. Miller and Alvin M. Liberman

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S21-S21 (1978); (1 page)

Online Publication Date: 11 Aug 2005

Given a distinction between [ba] and [wa] in absolute syllable‐initial position cued by duration of formant transitions, we ask how the distinction is affected by that which occurs later in the syllable. In one experiment, we varied the duration of the steady‐state formants that followed the initial transitions, producing syllables with overall durations in the range of 100–370 ms. The observed consequence was a shift of about 20 ms in the duration of formant transitions required to convert [ba] to [wa]: The longer the syllable, the longer the formant transition necessary to perceive [wa]. In a second experiment, we attached formant transitions to the end of the syllable in such a way as to produce the syllable‐final stop [d], hence the syllables [bad] or [wad]. This maneuver had an effect precisely opposite to that we had obtained when we lengthened the syllable by adding steady‐state formants. That is, adding formant transitions to the end of the syllable, thus creating [bad‐wad], had the same effect on the perception of syllable‐initial [b‐w] as subtracting a certain duration of steady‐state formants from [ba‐wa]. We suppose that in these cases the subsequent information in the syllable specifies the rate at which the speaker is talking, and we take the effect on syllable‐initial [b‐w] to be the listener's adjustment to that rate. Certain aspects of these data are in accord with findings of Minifie, Kuhl, and Stecher [J. Acoust. Soc. Am. 62, S15(A) (1977)] and Summerfield (in preparation). [Work supported by NIH.]

On the perception of nasal consonants (A)

Janette Henderson

J. Acoust. Soc. Am. Volume 63, Issue S1, pp. S21-S21 (1978); (1 page)

Online Publication Date: 11 Aug 2005

The problem of perceptual constancy was investigated with respect to the variation in acoustic cues for the place of articulation of nasal consonants in English as a function of the preceding vowel. The acoustic analysis of /Vm, Vn, Vŋ/ syllables indicates three potential cues: (1) transition, (2) murmur, and (3) final vocalic release. The relative effectiveness of these cues was assessed by removing each in turn from /m, n, ŋ/ spoken after seven different vowels by a male native speaker of English. The syllables were digitized and edited by computer and the different versions of each presented to subjects for identification. The results indicate that not only did the percentage of overall correct responses vary depending on which cue had been removed, but also that the substitution errors fell into well‐defined categories. [Work supported by NICHD.]