Sine-wave speech recognition in a tonal language

J. Acoust. Soc. Am. Volume 131, Issue 2, pp. EL133-EL138 (2012); (6 pages)

Yan-Mei Feng (1), Li Xu (2), Ning Zhou (3), Guang Yang (4), and Shan-Kai Yin (4)

(1) Department of Otolaryngology, Shanghai Sixth People’s Hospital, Institute of Otolaryngology, Shanghai Jiao Tong University, Shanghai 200233, People’s Republic of China; yanmeifeng2008@gmail.com
(2) School of Rehabilitation and Communication Sciences, Ohio University, Athens, Ohio 45701; xul@ohio.edu
(3) Kresge Hearing Research Institute, University of Michigan, Ann Arbor, Michigan 48109; nzhoujo@umich.edu
(4) Department of Otolaryngology, Shanghai Sixth People’s Hospital, Institute of Otolaryngology, Shanghai Jiao Tong University, Shanghai 200233, People’s Republic of China; gyang321@hotmail.com, yinshankai@china.com

It is hypothesized that in sine-wave replicas of natural speech, lexical tone recognition would be severely impaired due to the loss of F0 information, but the linguistic information at the sentence level could be retrieved even with limited tone information. Forty-one native Mandarin-Chinese-speaking listeners participated in the experiments. Results showed that sine-wave tone-recognition performance was on average only 32.7% correct. However, sine-wave sentence-recognition performance was very accurate, approximately 92% correct on average. Therefore the functional load of lexical tones on sentence recognition is limited, and the high-level recognition of sine-wave sentences is likely attributable to perceptual organization that is influenced by top-down processes.
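
How a sine-wave replica discards F0 can be illustrated with a minimal sketch. It assumes the formant center-frequency and amplitude tracks have already been estimated by some other tool. The function name, frame rate, sampling rate, and track values are all hypothetical and are not taken from the article. Each formant is replaced by one time-varying sinusoid, so the output has no harmonic structure and therefore no F0.

```python
import math

def synthesize_sine_wave_replica(freq_tracks, amp_tracks,
                                 frame_rate=100, sample_rate=16000):
    """Sum one sinusoid per formant track (hypothetical helper).

    freq_tracks: list of tracks, each a list of per-frame frequencies in Hz.
    amp_tracks:  list of tracks, each a list of per-frame linear amplitudes.
    Frequencies are held constant within a frame (an assumption made here
    for simplicity; smoother interpolation is equally possible).
    """
    n_frames = len(freq_tracks[0])
    samples_per_frame = sample_rate // frame_rate
    n_samples = n_frames * samples_per_frame
    signal = [0.0] * n_samples

    for freqs, amps in zip(freq_tracks, amp_tracks):
        phase = 0.0  # accumulated phase keeps the tone continuous across frames
        for n in range(n_samples):
            frame = n // samples_per_frame
            signal[n] += amps[frame] * math.sin(phase)
            phase += 2.0 * math.pi * freqs[frame] / sample_rate
    return signal

# Placeholder tracks: three steady "formants" lasting 10 frames (0.1 s).
replica = synthesize_sine_wave_replica(
    freq_tracks=[[500.0] * 10, [1500.0] * 10, [2500.0] * 10],
    amp_tracks=[[1.0] * 10, [0.5] * 10, [0.25] * 10],
)
print(len(replica))
```

Because only the formant tracks enter the synthesis, any pitch contour in the original utterance is absent from the replica unless it happens to be mirrored in the amplitude or formant trajectories.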

© 2012 Acoustical Society of America

Acknowledgments

The research was supported in part by Grant No. 81070778 from NSFC and NIH/NIDCD Grant No. R15-DC009504.

Article Outline

  1. Introduction
  2. Method
  3. Results and discussion

KEYWORDS and PACS

PACS

  • 43.71.Gv

    Measures of speech perception (intelligibility and quality)

ARTICLE DATA

History
Received 16 Aug 2011
Accepted 10 Nov 2011
Published online 18 Jan 2012

PUBLICATION DATA

ISSN

0001-4966 (print)  


Figures

FIG. 1. (Color online) Narrowband spectrograms of the natural speech materials (upper panels) and the sine-wave replicas of the same speech materials (lower panels). The left small panels show the spectrograms of four tone tokens (i.e., /ma1/, /ma2/, /ma3/, and /ma4/) of a male voice. The right panels show the spectrograms of an MHINT sentence (i.e., /wo3 men2 dou1 zai4 xue2 xi2 dian4 nao3 da3 zi4/ or “We are all learning how to type on a computer” in English) of a male voice.

FIG. 2. Recognition scores of sine-wave Mandarin-Chinese tones and sentences. The three horizontal solid lines of the box represent the 25th percentile, median, and 75th percentile of the data. The whiskers show the range. The filled symbols represent individual data. The dashed line on the left indicates the chance level (i.e., 25% correct) for tone recognition.
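
The box-plot statistics named in the Fig. 2 caption can be computed as in the following sketch. The scores are made-up placeholders, not data from the article, and the inclusive quartile method is an assumption, since the caption does not say which method was used.

```python
import statistics

CHANCE_LEVEL = 25.0  # percent correct expected from guessing among four tones

def box_summary(scores):
    """Return the range, quartiles, and median that a box plot displays."""
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {"min": min(scores), "q1": q1, "median": median,
            "q3": q3, "max": max(scores)}

def count_above_chance(scores, chance=CHANCE_LEVEL):
    """Count listeners whose score exceeds the chance level."""
    return sum(1 for s in scores if s > chance)

# Hypothetical percent-correct scores for five listeners (illustration only).
hypothetical_scores = [20, 30, 35, 40, 50]
summary = box_summary(hypothetical_scores)
n_above = count_above_chance(hypothetical_scores)
print(summary)
print(n_above)
```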

Tables

Table I. Confusion matrix for the sine-wave tone recognition. Data were pooled across 41 subjects.
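
As a rough illustration of how a pooled confusion matrix yields recognition scores, the sketch below computes overall and per-tone percent correct. The counts are hypothetical placeholders and do not reproduce Table I. The layout (rows for the presented tone, columns for the response) is also an assumption.

```python
def percent_correct(confusion):
    """Overall percent correct: diagonal trials divided by all trials."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return 100.0 * correct / total

def percent_correct_per_tone(confusion):
    """Percent correct for each presented tone (one value per row)."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(confusion)]

# Hypothetical counts: row = presented tone 1-4, column = response tone 1-4.
hypothetical_counts = [
    [5, 2, 2, 1],
    [3, 4, 2, 1],
    [2, 2, 5, 1],
    [1, 1, 2, 6],
]
overall = percent_correct(hypothetical_counts)
per_tone = percent_correct_per_tone(hypothetical_counts)
print(overall)
print(per_tone)
```

Off-diagonal cells show which tones were mistaken for which, something the single overall score hides.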
