Journal of the Acoustical Society of America

Dec 1977

Volume 62, Issue S1, pp. S1-S102

Session OO. Speech Communication VIII: Infant Perception: Vowel Perception in Adults
Contributed Papers

Speech perception in early infancy: discrimination of fricative consonants (A)

Tristan L. Holmberg, Kathleen A. Morgan, and Patricia K. Kuhl

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S99-S99 (1977); (1 page) | Cited 2 times

Online Publication Date: 11 Aug 2005

Recent data strongly suggest that six‐month‐olds perceive a similarity between vowels produced by different sized vocal tracts [P. K. Kuhl, J. Acoust. Soc. Am. 61, S39(A) (1977)]. This study investigates the extent to which infants recognize phonetic similarity for a fricative consonant when it occurs in different vowel environments and is spoken by different talkers. Six‐month‐olds were tested in a discrimination task in which fricative contrasts (/f/ versus /θ/; /s/ versus /ʃ/) were examined in both the initial and final positions of naturally produced monosyllables. Infants were tested in a task in which a head turn was reinforced with a visual stimulus in the presence of an exemplar from one consonant category but not in the presence of an exemplar from the second consonant category. Infants were initially trained to differentiate a single exemplar from each of the two categories; then, the number of exemplars in each of the two categories was systematically increased until each category contained 12 different tokens (4 talkers × 3 vowel contexts). Results suggest that infants are capable of recognizing phonetic similarity for consonant categories. A video tape of the testing will be shown. [Research supported by NICHD.]

Perception of glides in multisyllabic utterances by infants (A)

P. W. Jusczyk, H. C. Copan, and E. J. Thompson

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S99-S99 (1977); (1 page)

Online Publication Date: 11 Aug 2005

Several aspects of two‐month‐olds' perception of a phonetic contrast between glides in multisyllabic utterances were explored with the high amplitude sucking paradigm. First, do infants perceive contrasts between glides in the initial (e.g., YADA versus WADA) and medial positions (e.g., DAYA versus DAWA) of multisyllabic utterances? Second, are infants more likely to perceive these contrasts between stressed or unstressed syllables? Our results suggest the following: (1) two‐month‐olds are sensitive to place‐of‐articulation differences between glides in both the initial and medial positions of multisyllabic utterances; (2) positioning stress on the syllables to be discriminated had little or no effect on the infants' ability to perceive the contrast; and (3) there was no indication that infants were any more sensitive to the contrast in the initial position than in the medial position. Our findings were in complete agreement with those observed previously for infants' perception of stop consonant differences in multisyllabic utterances [Jusczyk and Thompson, Soc. Res. Child Dev. Abstr. (1977)], suggesting that infants' detection of medial contrasts does not depend on the presence of abrupt formant transitions.

Perceptual analysis of speech sounds by prelinguistic infants: A first report (A)

R. N. Aslin, A. J. Perey, E. Hennesy, and D. B. Pisoni

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S99-S99 (1977); (1 page)

Online Publication Date: 11 Aug 2005

In this paper we report the results of an investigation that employed a two‐alternative forced‐choice head‐turning paradigm to study speech perception abilities in 5–6‐month‐old infants. Two synthetically produced speech sounds were presented as training stimuli during initial shaping and conditioning phases and the infant was reinforced with the presentation of a visual stimulus (i.e., an animated toy monkey) for an appropriate differential head‐turn response toward the right for one stimulus (S1) or toward the left for another stimulus (S2). Successful discrimination training was followed by a transfer phase in which generalization to a variety of novel speech sounds was measured by observing the direction of the infant's head turns. Implications of our findings and methodology for questions surrounding perceptual constancy, feature analysis and the role of early environmental experience in development of speech perception abilities in young infants will be discussed. [This work was supported by research grants from NIMH and NIH.]

Perception of voice‐onset time by Spanish‐learning infants (A)

Rebecca E. Eilers

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S99-S99 (1977); (1 page)

Online Publication Date: 11 Aug 2005

Individual data were obtained from normal six‐month‐old Spanish‐learning infants on four pairs of synthetic labial stops differentiable on the basis of voice onset time. All infants were presented with the pairs [pʰa] (+70) versus [pʰa] (+40), [pʰa] (+40) versus [pa] (+10), [pa] (+10) versus [ba] (−20), and [ba] (−20) versus [ba] (−50). Discrimination was assessed for each infant on each pair using the VRISD (visually reinforced infant speech discrimination) paradigm. Spanish‐speaking adult discrimination of these pairs was also assessed. Infants showed two main patterns of discrimination: some discriminated both the pairs +40 versus +10 and +10 versus −20 and failed to discriminate the other two stimulus pairs, and others discriminated only the pair +40 versus +10. The data will be discussed in terms of models of development of speech perception. [Work supported by NICHD HD09906‐03.]

Phonetic aspects of “devoiced” stop consonants in children's speech (A)

Bruce L. Smith

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S99-S100 (1977); (2 pages)

Online Publication Date: 11 Aug 2005

Word‐final “devoiced” obstruents have frequently been observed in children's speech, but very little descriptive information is available regarding acoustic phonetic properties of such consonants. In order to study developmental characteristics of voicing control during “voiced” and “voiceless” stop closure, five English‐speaking 2½–3 year olds, five 4–4½ year olds, and five adults recorded productions of the nonsense utterances: /bab, bábab, babáb, dad, dádad, dadád, tat, tátat, tatát/. Measurements were made from oscillographic traces, and the data were analyzed in terms of (1) the percentage of all stop productions evidencing “devoicing” during closure, and (2) in the case of devoiced stops, the proportion of closure duration evidencing voicing. Both groups of children revealed more frequent occurrence of devoicing than the adults, and the voiced proportion of devoiced stops was also less for the children than for the adults. It was also observed, however, that the children's “devoiced” stops revealed more voicing than their “voiceless” stop productions. Results are discussed in terms of aerodynamic principles of speech production.

Judgments of speech maturity: correlations with chronological age and acoustic measurements (A)

C. S. Hawkins and G. D. Allen

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S100-S100 (1977); (1 page)

Online Publication Date: 11 Aug 2005

An investigation of word‐initial consonant cluster durations spoken by six children aged from 4 to 8 revealed only a weak association between maturity of production and chronological age (CA). This experiment was designed to see whether the children's speech maturity as judged by adults corresponded more closely to the instrumentally measured maturity of their consonant durations than either measure corresponded to CA. Elicited English monosyllables recorded during two periods one year apart were presented so that each child was paired with himself across the two years, and with other children within the same year. Using a definition of speech maturity emphasizing gestural integration and fluency, with less weight on phonemic adequacy and none on obviously age‐related factors such as pitch and voice quality, 48 naive listeners judged which child of each pair spoke most maturely. Judgments of speech maturity correlated significantly both with predictions of the children's relative maturity derived from the similarity of their measured durations to adult norms, and with these predictions modified by impressions of the children's spontaneous speech. Neither the judgments nor the predictions correlated significantly with CA. Speech matured at very different rates for individual children, and actually decreased over the year for one.

Auditory perception, phonics and reading disabilities in children (A)

Paula Tallal

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S100-S100 (1977); (1 page)

Online Publication Date: 11 Aug 2005

In order to investigate the hypothesis that auditory perceptual disabilities may result in difficulty in learning phonics skills and hence in learning to read, reading impaired and normally developing children were given a battery of nonverbal auditory perceptual tests as well as a test of phonic skills. The auditory perceptual test battery examined (1) discrimination, (2) temporal order perception, (3) short‐term memory, and (4) serial memory. Stimulus tones were presented at various rates. There were no significant differences between groups on any of the subtests in which stimuli were presented at slow rates. However, when the same stimuli were presented more rapidly, the reading impaired group made significantly more errors than the controls. The reading impaired children's ability to use phonics skills (read nonsense words) was also examined. There was a high correlation between the number of errors made on the phonics reading test and the number of errors made in responding to the rapidly presented non‐verbal stimuli in the auditory perceptual tests. The hypothesis that some reading impairments are related to low level auditory perceptual dysfunction similar to that demonstrated previously for language impaired children [P. Tallal and M. Piercy, Neuropsychologia 11, 389–398 (1973)] is discussed.

Individual differences in vowel recognition (A)

Steven Greenberg

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S100-S100 (1977); (1 page)

Online Publication Date: 11 Aug 2005

Dorman et al. [Haskins Laboratories, Status Report on Speech Research SR 37/38 (1974)] observed that their experimental population could be divided into two groups on the basis of performance in a backward recognition masking paradigm. The experiments reported here attempt to resolve whether the pattern of individual differences originally reported by Dorman and colleagues derives principally from variability in (1) the rate of access to short‐term verbal memory, or (2) the precision of acoustic‐phonetic discrimination. Stimuli for Experiment I consisted of three synthetically generated, 24‐msec vowels ([I], [ɛ], [æ]) followed at a variable interval (0–320 msec) by a vowel‐like [εΓ] mask. Response patterns of individual subjects segregated into one of two discrete types, consistent with Dorman et al.'s “masker”/“nonmasker” dichotomy. However, Experiment II, using four target stimuli ([I], [ɛ], [æ], [a]) and two masks ([εΓ], [o]), showed a significant narrowing of the performance gap between the “masker” and “nonmasker” groups, raising certain doubts about the role played by short‐term verbal memory processes in the differentiation of “maskers” and “nonmaskers.” [Research supported by NSF and USPHS.]

Transconsonantal context effects on vowel recognition latency (A)

James G. Martin

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S100-S100 (1977); (1 page)

Online Publication Date: 11 Aug 2005

The stimuli were spoken vowel‐consonant‐vowel sequences in the frame “I say V1 C V2” in which V1 was /u, æ/, C was /p, t, k/, and V2 was /a, i/ in all 12 combinations. Reaction time (RT) of practiced listeners was observed to the assigned final stressed vowel target (V2). Experimental manipulations were: (a) pre‐stop‐consonant silence either was or was not extended by 100–200 msec and (b) the sequence either was or was not cross‐spliced at the silent interval. Extended silence alters the acoustic‐phonetic structure of the stimulus, and it possibly allows extra time to process appropriate anticipatory cues (if any), or inappropriate anticipatory cues (if any) when cross‐spliced, to the target vowel. Among the results discussed are both relatively faster and slower RT which were produced by these manipulations and their interactions with target. [Supported by ARIBSS.]

Anchoring effects and vowel discrimination (A)

J. R. Sawusch and H. C. Nusbaum

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S100-S101 (1977); (2 pages)

Online Publication Date: 11 Aug 2005

A series of synthetic, steady‐state vowels varying perceptually from [i] to [I] was used in both identification and discrimination tasks under two conditions. In the control identification condition, each stimulus occurred equally often. Similarly, in the control ABX discrimination condition, each 1‐step ABX trial occurred equally often. In the identification anchor condition, one of the endpoint stimuli occurred more often than the other stimuli. A similar change was made in the ABX anchor condition. In the identification anchor condition, the category boundary between [i] and [I] shifted toward the more frequently occurring endpoint (anchor), relative to the equiprobable control. The peak ABX discrimination also shifted with anchoring toward the more frequently occurring stimulus (anchor). This change in discrimination performance rules out a response bias interpretation of the anchoring effect in identification. Rather, perceptual factors involving auditory memory or the reevaluation of vowel prototypes may be involved in the anchoring effects and recently reported selective adaptation effects for steady‐state vowels. [Research supported by SUNY Research Foundation and University Funds grants.]

Perceptual assessment of vowel duration in consonantal context and its application to vowel identification (A)

Paul Mermelstein, A. M. Liberman, and A. Fowler

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S101-S101 (1977); (1 page)

Online Publication Date: 11 Aug 2005

A consonant‐vowel (CV) transition not only allows the listener to identify the consonant but it also affects the perception of the vowel. The transition influences both the perceived duration of the vowel and also its phonetic identity. Are these independent processes, or can the effect on phonetic identity be predicted from the duration effect? Synthetic vowels identifiable as /ɛ/ or /æ/ were preceded by 48‐msec‐long /bV/ transitions incorporating linear variations in the formant frequencies. In judging the duration of the vowel, listeners assigned to the vowel 27 msec of a 48‐msec‐long transition; the other 21 msec were presumably assigned to the consonant. As for the effect on the phonetic identity of the vowel, five of seven listeners showed a shift in the duration boundary that was essentially equal to the perceived increment in vowel duration contributed by the transition. Thus for them the contribution to perceived duration of the vowel predicts the shift in perceived phonetic identity. The other subjects' duration boundary shifts for vowel identification show greater scatter, possibly indicating additional context‐conditioned effects in vowel‐identification performance beyond a mere change in the perceived duration value. [Research supported by NICHD Grant HD‐01994.]

Near‐perfect identification of speaker‐randomized vowels without consonantal transitions (A)

Daniel Kahn

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S101-S101 (1977); (1 page)

Online Publication Date: 11 Aug 2005

While the two lowest formant frequencies, F1 and F2, are the most important cues in vowel identification, the correspondence between vowels and small regions in F1F2 space is far from one to one when data from many speakers is considered. This fact suggests that our vowel‐identification mechanism, quite reliable in ordinary speech situations, either (a) makes use of additional disambiguating cues, such as the frequencies of higher formants, vowel duration, degree of diphthongization, etc.; or (b) normalizes for individual‐speaker differences by deriving information on vocal‐tract geometry through exposure to consonantal formant transitions and known vowels; or both. The high vowel‐identification error rates observed when isolated vowels are speaker randomized (43% in a recently reported experiment) argue that (a) is relatively unimportant vis‐à‐vis (b), since in such experiments all the types of information referred to in (a) are retained, yet vowel identification is not reliable. In this paper I argue that there exists a systematic bias favoring hypothesis (b) over hypothesis (a) in past vowel‐identification tests and describe an experiment in which near‐perfect identification scores were achieved despite speaker randomization and the absence of consonantal transitions.

Selecting syllable nuclei in French (A)

Matthew Lennig

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S101-S101 (1977); (1 page)

Online Publication Date: 11 Aug 2005

Monophthongal vowels are often said to be specifiable in terms of their target positions [R. Houde, SCRL Monograph No. 2 (1968)]. In this paper criteria for selecting vowel nuclei within syllables of free‐running conversational French are described and evaluated for their ability to represent the impressionistic phonetic quality of measured vowels. Input to the nucleus selection algorithm is a syllable matrix whose rows contain the peak frequencies of the linear prediction model spectrum of the syllable at progressively later points in time. A fast method for eliminating gross errors (e.g., missing or extra formants) is proposed. This method is applied to each syllable matrix. Nuclei are then selected based on the steady‐state criterion: the point of minimum rate of change of the first and second formant frequencies. Other nucleus selection criteria, such as points of inflection and extrema of the formant frequencies, are compared to the steady‐state criterion with respect to their ability to represent perceived similarities and differences of vowel timbre. [Work supported by NSF.]
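
The steady‐state criterion described in this abstract can be illustrated with a minimal Python sketch. It is not the author's algorithm: the function name `select_nucleus`, the use of central differences, the combination of the F1 and F2 rates as a Euclidean norm, and the example formant values are all assumptions made for illustration, and the abstract's error‐elimination step is not modeled.

```python
import math


def select_nucleus(track):
    """Return the index of the frame where F1 and F2 change most slowly.

    track: list of (f1_hz, f2_hz) tuples, one per analysis frame, ordered
    in time (a simplified stand-in for the abstract's syllable matrix).
    """
    if len(track) < 3:
        raise ValueError("need at least three frames for central differences")
    best_index, best_rate = None, math.inf
    # Only interior frames have both a previous and a next neighbor.
    for i in range(1, len(track) - 1):
        d_f1 = (track[i + 1][0] - track[i - 1][0]) / 2.0
        d_f2 = (track[i + 1][1] - track[i - 1][1]) / 2.0
        # Assumption: combine the two rates of change as a Euclidean norm.
        rate = math.hypot(d_f1, d_f2)
        if rate < best_rate:
            best_index, best_rate = i, rate
    return best_index


# Made-up formant track (Hz): rising transition, plateau, then a fall.
example = [(300, 1200), (400, 1500), (480, 1750),
           (500, 1800), (505, 1810), (450, 1600)]
nucleus = select_nucleus(example)
```

With this made‐up track the selected frame falls on the plateau, where both formants are nearly flat.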

Perception of synthetic vowels in various formant and duration combinations (A)

Edward N. Cohill, J. L. Danhauer, and G. Herman

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S101-S101 (1977); (1 page)

Online Publication Date: 11 Aug 2005

The vowels /i/, /a/, and /u/ were synthesized using a Nova minicomputer described earlier [G. Herman, J. Acoust. Soc. Am. 61, S68(A) (1977)]. Six normally hearing subjects made similarity judgments of the vowel stimuli which were constructed with the three formant frequencies published earlier for each vowel. The vowels had a rise‐decay time of 20 msec. The formant structures of the vowels varied for four experimental conditions: (1) all three formants present; (2) F1 and F2 present, F3 absent; (3) F1 and F3 present, F2 absent; and (4) F2 and F3 present, F1 absent. Vowel durations of 200 and 300 msec were used for each condition. Twenty‐four experimental conditions resulted and 576 stimulus pairs were used. Subjects listened at MCL and used an equal‐appearing interval scale to make their similarity judgments. Ratings were converted to similarity matrices and submitted to INDSCAL to find perceptual dimensions. Features retrieved related to tongue height, tension, and advancement, with duration having less importance.
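
The counts in this abstract are consistent with a fully crossed design. The short Python sketch below enumerates it under two assumptions that the abstract does not state explicitly: that the 24 conditions are 3 vowels × 4 formant structures × 2 durations, and that the 576 pairs are all ordered pairs of those 24 stimuli, including each stimulus paired with itself. The labels are illustrative only.

```python
from itertools import product

vowels = ["i", "a", "u"]
formant_structures = ["F1+F2+F3", "F1+F2", "F1+F3", "F2+F3"]
durations_ms = [200, 300]

# Every vowel in every formant structure at every duration.
stimuli = list(product(vowels, formant_structures, durations_ms))

# Assumption: each stimulus is paired with every stimulus, in both orders.
pairs = list(product(stimuli, repeat=2))

print(len(stimuli), len(pairs))  # prints "24 576"
```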

A developmental study of speech production: Data on vowel imitation and sentence recitation (A)

R. D. Kent and L. Forner

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S101-S101 (1977); (1 page)

Online Publication Date: 11 Aug 2005

Ten subjects in each of five groups (4‐, 6‐, 10‐, and 12‐year‐old children as well as young adults) participated in a study involving the imitation of 15 synthesized vowels and the repetition of three simple sentences. This report emphasizes spectrographic data for the two youngest groups and the adult group. In the vowel imitation task, the subjects could recreate the relative formant structure of the targets, but the children in particular tended to produce alien (non‐English) vowels as the proximal English vowels. Plotting of the formant data for different age‐sex groups in the F1F2 plane revealed linear orientations, i.e., isovowel lines for unnormalized data. Standard deviations for the first two formant frequencies of the imitations were commensurate with published data on difference limens for formant frequency. The sentence repetitions were used primarily to evaluate the stability of temporal patterns. Both the inter‐ and intrasubject standard deviations of temporal segments tended to be larger for the children than for the adults, but even some 4‐year‐olds occasionally fell within the adult range of variability. [Work supported by NINCDS.]

A comparative study of the perception of species‐specific communication sounds by Japanese macaques (Macaca fuscata) and other Old World monkeys (A)

M. R. Petersen, M. D. Beecher, S. R. Zoloth, D. B. Moody, and W. C. Stebbins

J. Acoust. Soc. Am. Volume 62, Issue S1, pp. S101-S102 (1977); (2 pages)

Online Publication Date: 11 Aug 2005

An attempt is being made to psychophysically characterize the Japanese macaque's perception of sounds drawn from its vocal communication system. In the course of constructing a vocal ethogram for this species, Green [Primate Behavior 4, 1–101 (1975)] identified several distinct classes of vocalizations which differ in acoustic structure and form as well as the sociobehavioral context in which they are likely to occur. Six fuscata and three control (non‐fuscata) monkeys have been operantly conditioned, via positive reinforcement techniques, to discriminate between two different types of vocal sounds from Green's classification scheme: the smooth early (SE) and smooth late (SL) highs. While contacting a metal operandum with its hand an animal is presented a series of natural exemplars from the SL class. Occasionally, and randomly, an exemplar from the SE class is inserted into the series. If the animal correctly reports detection of the SE by breaking contact with the operandum it receives a food pellet. Results from several experiments employing this general procedure will be discussed including studies of perceptual constancy, selective attention, and the classification of synthetic versions of these communication signals. One important and consistent finding is that, although the fuscata and control animals have basically the same auditory capabilities, their perception of the communication sounds, at least during acquisition of the discrimination, differs: the fuscata seem to attend to the communication‐relevant features of the sounds while the controls appear to attend predominantly to the communication‐irrelevant aspects of the stimuli. [Supported by NSF Research Grants BMS 74‐20050 and 5‐27092, NINCDS Program Project Grant NS 5785, and NIGMS Training Grant GM‐01789.]