Journal of the Acoustical Society of America

Apr 1980

Volume 67, Issue S1, pp. S1-S103

Session Q. Speech IV: Segmental Features II: Duration, Pitch, and Intensity
Contributed Papers

Central representation of vowel duration (A)

J. R. Westbury and P. A. Keating

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S37-S37 (1980); (1 page)

Online Publication Date: 11 Aug 2005

For all languages for which there are data, the durations of vowels (of the same category) vary directly with the degree of tongue and mandible excursion specific to their articulation. This fact suggests that the durations of qualitatively different vowels can be explained in terms of peripheral constraints on articulatory movements. It is believed, therefore, that the underlying control specifications for vowels differing both temporally and in displacement require actual control only of the latter parameter. An explicit articulatory model of vowel production consistent with this view [B. E. F. Lindblom, STL‐QPSR‐4 (KTH), 1–29 (1967)] accounts for systematic variation in time and displacement by varying only the magnitude of force applied to articulators. New measures of acoustic vowel duration, mandible excursion, and EMG from the anterior belly of digastric are reported for two normal adult English speakers repeating isolated nonsense monosyllables. These results show, as expected, that vowel duration varies directly with mandible displacement. But these results also show that both quantities vary directly with the magnitude and duration of activity in the jaw‐lowering musculature. We interpret these data to indicate that the relative durations of vowels articulated with different mandible displacements are centrally represented. [Research supported by grants from NINCDS.]

A temporal model of speech production (A)

F. Bell‐Berti and B. Tuller

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S37-S37 (1980); (1 page)

Online Publication Date: 11 Aug 2005

Attempts to model speech production processes based on abstract static units, such as syllables, phonemes, or features, fail to account for a number of observations of natural speech. For example, a phonetic feature such as lip rounding has been described as being anticipated varying numbers of consonant segments before the vowel for which it is specified. Such data conform to predictions made with models assuming a CnV syllable structure. Such models, however, fail to predict nasality anticipated in vowels preceding nasal consonants. On the other hand, feature‐based models that apparently account for observations of anticipated nasality fail to predict other coarticulatory phenomena, such as the suppression of lip‐rounding or tongue‐fronting activity during the production of an intervening consonant presumably containing no conflicting articulatory gesture. A model assuming that speech is specified segmentally with fixed timing relationships between the end and beginning of successive segments (and variable durations of dynamic gestures) can eliminate these, and other, discrepancies between predicted and observed behavior. [Work supported by NINCDS and BRSG grants.]

A durational and spectral analysis of simple French words (A)

D. O'Shaughnessy

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S37-S37 (1980); (1 page)

Online Publication Date: 11 Aug 2005

Intelligible and natural‐sounding synthetic speech requires accurate specification of phoneme durations and spectra. Toward a goal of French synthetic speech‐by‐rule using a formant synthesizer, a set of 285 words was read in frame sentences by a French Canadian and analyzed via wide‐band spectrograms for durations, formants, and bandwidths. Phoneme durations were found to vary widely with primary dependence on the following factors: for vowels—height, nasality, and voicing and manner of articulation of ensuing consonants; for consonants—voicing, manner of articulation, and voicing and manner of articulation of adjacent consonants (in consonant clusters); for both vowels and consonants—position of syllable within the word and number of syllables in the word. A generative model of French phoneme durations in stressed words will be presented, along with a model for time trajectories of formants and bandwidths in terms of targets and smoothing rules.

Backward normalization in the perception of temporal speech cues (A)

Jorgen L. Pind

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S38-S38 (1980); (1 page)

Online Publication Date: 11 Aug 2005

Two experiments will be reported which have investigated, for Icelandic, the effects of number of subsequent syllables in a word on the perception of vowel quantity (cued by duration) and preaspiration (cued by voice‐offset time). The first experiment randomly varied the duration of the stressed (first syllable) vowel and the number of syllables (one, two, or three) in a word. Results showed that the vowel duration needed to cue a long vowel depended on the number of syllables in the test word. The second experiment investigated the value of voice‐offset time needed to cue preaspiration in the first syllable of one‐, two‐ or three‐syllable words. Preliminary analysis shows backward normalization similar to but smaller than that found for vowels. Both experiments demonstrate that the perceived category of a phone can depend on information occurring after the immediately following syllable. [Work supported by the Icelandic Science Foundation.]

Speaking clearly: Acoustic characteristics and intelligibility of stop consonants (A)

F. R. Chen, V. W. Zue, M. A. Picheny, N. I. Durlach, and L. D. Braida

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S38-S38 (1980); (1 page)

Online Publication Date: 11 Aug 2005

A current study [M. A. Picheny et al., J. Acoust. Soc. Am. Suppl. 1 67, S38 (1980)] has indicated that “clear speech,” defined as the speech spoken with specific instructions to enunciate clearly, may be substantially more intelligible when presented to hard of hearing listeners than “conversational speech.” The purpose of this study is to determine segmental differences in the acoustic properties of stop consonants enunciated “clearly” and “conversationally” and to correlate these differences with differences in intelligibility. The corpus consisted of 18 CV syllables (six stop consonants and three vowels) embedded in a carrier phrase spoken several times by three male speakers under the two conditions, yielding 540 tokens. The CV's were excised from the carrier phrase and presented to listeners with simulated hearing loss for the purpose of intelligibility testing. In addition, acoustic parameters, such as formant frequencies, burst intensity, CV ratios, and formant transition rates were measured. Preliminary results of acoustical analysis of the speech waveforms and of intelligibility tests will be presented.

Speaking clearly: Intelligibility and acoustic characteristics of sentences (A)

M. A. Picheny, N. I. Durlach, and L. D. Braida

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S38-S38 (1980); (1 page)

Online Publication Date: 11 Aug 2005

In a recent study [M. A. Picheny and N. I. Durlach, J. Acoust. Soc. Am. Suppl. 1 65, S135(A) (Spring 1979)], the intelligibility of speech spoken in a conversational manner versus speech spoken while consciously attempting to increase one's clarity was examined. Four hearing‐impaired listeners were tested on groups of 50 Harvard PB sentences enunciated by a single speaker and presented at listener‐selected levels. For all four listeners, the intelligibility of the clear speech was substantially higher than that for the conversational speech (average of 18 percentage points). This report will present preliminary results on the intelligibility of syntactically normal but semantically ambiguous sentences [P. W. Nye and J. H. Gaitenby, Haskins Laboratory Status Report on Speech Research SR‐37/38 (1974)] spoken clearly and conversationally to hearing‐impaired listeners and normal‐hearing listeners with simulated losses. In addition, preliminary acoustical analyses of the two types of speech materials will be presented.

A reaction‐time study of the production of [s] and [š] (A)

David Isenberg

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S38-S38 (1980); (1 page)

Online Publication Date: 11 Aug 2005

Talkers produced steady‐state vowels until they heard an auditory go‐signal, at which point they initiated articulation of [s] or [š] as rapidly as possible. Talkers knew the target fricative in advance. The reaction time, measured from the go‐signal to the cessation of glottal pulsing, was faster for [s] than for [š] by 11.5 ms following the vowels [i, a, u, ʌ]. This result can be interpreted as supporting the claim that [š] demands more precise control of the tongue tip than [s] [J. S. Perkell, S. E. Boyce, and K. N. Stevens, J. Acoust. Soc. Am. Suppl. 1 65, S24 (1979)]. Preliminary data suggest that [s] is not faster than [š] following the vowel [ɝ]. Since both [š] and [ɝ] are thought to involve similar types of anterior tongue movement, this observation is a further indication that tongue tip control is the crucial factor that slows [š] articulation following [i, a, u, ʌ]. Measures of spectral change were applied to these utterances to examine in detail the time course of events between the go‐signal and the fricative steady state. [Research supported by a grant from NINCDS.]

Imitation of /s/ duration in VCV's (A)

E. Karno and R. J. Porter, Jr.

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S38-S38 (1980); (1 page)

Online Publication Date: 11 Aug 2005

V1CV2 syllables were created by editing naturally produced segments: V1 was always a 412‐ms /a/; V2 was a 300‐ms /ε/, a 315‐ms /i/, or a 390‐ms /u/; C was an (unrounded) steady‐state /s/ segment with one of seven durations, from 116 to 416 ms, in 50‐ms steps. All V2 and /s/‐duration combinations were presented six times, randomly, to subjects who were instructed to listen carefully to each stimulus and then to imitate it, taking care to produce the same segment durations, vowel qualities, etc., as in the stimulus. Acoustic segment durations in the responses were measured using oscillographic displays. EMG activity for the orbicularis oris was also obtained to address the question of when (in time) coarticulatory activity began relative to acoustic onset of /u/. Subjects were remarkably sensitive to stimulus variation but showed some tendency to undershoot target durations. EMG onset varied with subjects and /s/ duration, suggesting variation in response strategies as a function of these variables. Results will be discussed in terms of their implications for the interrelation of perceptual and productive mechanisms. [Work supported in part by NIH BRSG RRO7196‐03 and The Kresge Foundation.]

Cue trade offs of spoken fricatives: Duration and intensity of the noise and vocalic formant transitions (A)

George P. McCasland

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S39-S39 (1980); (1 page)

Online Publication Date: 11 Aug 2005

Spoken /s/ was manipulated using both PCM and splicing techniques, and perceptual results were obtained from a group of native American subjects. Prevocalic /s/ noise as in /sa, si/ must have sufficient intensity and duration to override a /θ/ percept produced at weak noise levels by the conflicting place cue of the natural vocalic transitions. In one intervocalic context determined by a final /i/ vowel as in /sasi, sisi/, the /s/ requires even greater noise intensity and duration to override the conflicting labial cue of the combined arresting and releasing transitions, which produces an /f/ percept with weak noise. Postvocalic /s/ as in /has, his/ requires the least noise energy because the vocalic transitions contribute to the place impression of the /s/ percept. In all of these contexts, a shorter, more intense /s/ noise segment has the same cue strength as some longer, less intense segment in overriding conflicting transition cues. This investigation extends work on fricatives previously reported by the author [J. Acoust. Soc. Am. 63, S21(A) (1978); 65, S78(A) (1979); and 66, S88(A) (1979)]. [Haskins Laboratories has provided valuable assistance in the preparation of stimuli on Contract No. NIH‐71‐2420, National Institutes of Health, HEW.]

The duration of normally articulated and functionally misarticulated [s] in preschool children (A)

Gary Weismer and Mary Elbert

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S39-S39 (1980); (1 page)

Online Publication Date: 11 Aug 2005

The duration of [s] was measured in two groups of children aged 4–6 years; one of these groups contained children with “normal” [s] articulation whereas the other group contained children with functional misarticulation (i.e., misarticulation which has no known organic etiology) of the [s] sound. The speech sample consisted of [s] sounds embedded within 36 nonsense sequences the form of which was variously mono‐ or bisyllabic. Within these sequences [s] was varied with respect to position and phonetic context. Each one of the sequences was produced in the carrier phrase I like to say _____ and was replicated six times by each subject. Measurements of [s] duration were derived from oscilloscopic displays, and the resulting data were examined for trends of central tendency and dispersion. One interesting result was that average intrasubject variabilities were greater for the misarticulating than for the normally articulating children. This finding is considered relative to potential motor control deficits which may be associated with functional misarticulation. [Work supported by NINCDS Award No. PHS R01 15551‐01.]

Tongue contact duration for /t/ and /d/ (A)

Spencer Haynes and Sandra Hamlet

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S39-S39 (1980); (1 page)

Online Publication Date: 11 Aug 2005

Tongue contact durations for /t/ and /d/ were investigated in word‐initial position. Meaningful sentences containing the relevant phonetic contexts were spoken by four native speakers of American English. Tongue contact durations were measured by means of electropalatography. The microphone signal was also recorded. The signals were displayed together on two‐trace oscillograms run off at a paper speed such that each millisecond was represented by at least 1 mm of paper. Thus temporal measurements were highly accurate. Results indicated marked intersubject variability. For some subjects there was a strong relationship between consonant type and tongue contact duration, but the voiceless cognate was not always the longer. In other subjects the relationship was influenced by the following vowel. For one subject no significant difference was found between tongue contact duration for the two consonants regardless of the following vowel. A measurement of the period of voicelessness for /t/, from the cessation of voicing to the plosive burst, was also made. There were strong correlations between tongue contact durations and the period of voicelessness for only one subject. Although the burst was a good indicator of tongue release, as expected, there was no reliable spectral indicator of onset of tongue contact, particularly for /t/ if voicing ceased prior to contact. [Work supported by NIH Grant DE 03631.]

Acoustical analysis of 30 words judged to be different in pitch and corresponding to a pitch model (A)

Celia S. Bessel and Carl W. Asp

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S39-S39 (1980); (1 page)

Online Publication Date: 11 Aug 2005

In a previous study by Asp, Berry, and Bessel [J. Acoust. Soc. Am. Suppl. 1 64, S20(A) (1978)], listeners used a paired‐comparison procedure to judge the pitch of 30 monosyllabic words. These judgments were used to rank order the words from low to high pitch. The rank order agreed with the proposed pitch model used for selecting the words, which included low, middle, and high pitch categories. In the present study, these 30 words were analyzed with a spectrograph and a frequency analyzer for fundamental and formant frequencies, duration, and amplitude spectra. There was a significant correlation between the amplitude spectra and the rank order; visual inspection of the spectra revealed specific patterns associated with the three pitch categories. There was a significant correlation between formant two and the rank order; however, the fundamental frequency was not significantly correlated with the rank order. [Work supported by NIH Biomedical Science Support Grant.]

Acoustic parameters of plain and pharyngal fricatives in Libyan Arabic (A)

Laurence J. Krieg, Peter J. Benson, and J. C. Catford

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S39-S39 (1980); (1 page)

Online Publication Date: 11 Aug 2005

Arabic fricatives, particularly those differentiated by pharyngalization, still present unanswered questions to phonetic investigators: the acoustic cues differentiating the contrasts have only been partially specified [e.g., al‐Ani, Arabic Phonology, The Hague (1970)]; it has not been satisfactorily demonstrated whether fricative noise itself or only the surrounding vowels differentiate between corresponding pairs of pharyngalized (emphatic) and nonpharyngalized fricatives [e.g., Jakobson “‘Mufaxxama’…”, in Selected Writings, Vol. 1, pp. 510–522 (1962); Bonnot, “Recherche expérimentale sur la nature des consonnes emphatiques de l'arabe classique,” Travaux de l'Institut de Phonétique de Strasbourg 9, 47–88 (1977)]. A dialect of Libyan Arabic which, according to native linguists, contrasts an unusually large number of plain versus pharyngalized fricatives [s s̵ ʃ ⨏ χ χ̸ h; ℏ Ð Ð̸] is recorded; measurement is made of time‐domain variables such as VOT, rise time, and relative amplitude of the fricative. Spectral information is averaged together to provide representative spectra of each of the contrastive fricatives. These spectra are then transformed using various weighting functions; the original spectra, the transformed spectra, and the time‐domain measurements are then compared and grouped by multivariate analysis techniques to provide evidence for the general phonological groupings of fricatives in this dialect and the role of the pharyngalized/plain contrast.

Two‐moras‐cluster as a rhythm unit in spoken Japanese sentence or verse (A)

R. Teranishi

J. Acoust. Soc. Am. Volume 67, Issue S1, pp. S40-S40 (1980); (1 page)

Online Publication Date: 11 Aug 2005

In spoken Japanese sentences, the mora is regarded as a temporal unit because of its isochronous tendency. However, in slowly spoken Japanese sentences or verse, two moras tend to be read as one cluster and the isochrony of the mora seems to disappear, so the cluster emerges as a rhythm unit. This tendency is quite clear in the recitation of verse. Some people have taken notice of this phenomenon, but nobody has confirmed it by physical measurement. Here, durations of moras are measured systematically as a function of speaking speed, in order to study how the cluster is composed of two moras. It is found that a waiting interval appears between the second mora in the cluster and the first mora of the next cluster whenever the speed becomes fairly slow. During the waiting interval, voicing usually continues, so the apparent duration of the second mora looks longer than that of the first. This is why the waiting intervals are not revealed unless they are studied and measured systematically.