The magnetic resonance imaging subset of the mngu0 articulatory corpus

J. Acoust. Soc. Am. Volume 131, Issue 2, pp. EL106-EL111 (2012); (6 pages)

Ingmar Steiner¹, Korin Richmond², Ian Marshall³, and Calum D. Gray⁴

¹ INRIA/LORIA Speech Group, Bat. C, 615 Rue du Jardin Botanique, 54600 Villers-lès-Nancy, France; ingmar.steiner@inria.fr
² Centre for Speech Technology Research, University of Edinburgh, Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, United Kingdom; korin@cstr.ed.ac.uk
³ Medical Physics & Medical Engineering, University of Edinburgh, Chancellor’s Building, 49 Little France Crescent, Edinburgh, EH16 4SB, United Kingdom; ian.marshall@ed.ac.uk
⁴ Clinical Research Imaging Centre, University of Edinburgh, Queen’s Medical Research Institute, 47 Little France Crescent, Edinburgh, EH16 4TJ, United Kingdom; calum.gray@ed.ac.uk

This paper announces the availability of the magnetic resonance imaging (MRI) subset of the mngu0 corpus, a collection of articulatory speech data from one speaker containing different modalities. This subset comprises volumetric MRI scans of the speaker’s vocal tract during sustained production of vowels and consonants, as well as dynamic mid-sagittal scans of repetitive consonant–vowel (CV) syllable production. For reference, high-quality acoustic recordings of the speech material are also available. The raw data are made freely available for research purposes.

© 2012 Acoustical Society of America

Acknowledgments

Imaging was carried out at the Brain Research Imaging Centre, Edinburgh (http://www.bric.ed.ac.uk), Division of Clinical Neurosciences, University of Edinburgh, Western General Hospital, Edinburgh, a core area of the Wellcome Trust Clinical Research Facility and part of the SINAPSE collaboration (http://www.sinapse.ac.uk). This work was supported by Marie Curie Early Stage Training Site “EdSST” (MEST-CT-2005-020568) and EPSRC Grant No. EP/E027741/1 (“ProbTTS”).

Article Outline

  1. Introduction
  2. MRI data
    1. Volumetric scans
    2. Dynamic mid-sagittal scans
    3. Dental reconstruction
  3. Acoustic reference recordings
  4. Distribution

KEYWORDS and PACS

PACS

  • 43.70.Aj

    Anatomy and physiology of the vocal tract, speech aerodynamics, auditory kinetics

  • 43.70.Jt

    Instrumentation and methodology for speech production research

ARTICLE DATA

History
Received 28 Nov 2011
Accepted 13 Dec 2011
Published online 13 Jan 2012

PUBLICATION DATA

ISSN

0001-4966 (print)  


Figures

FIG. 1. Cutaway volume rendering of raw volumetric data for [ɑ].

FIG. 2. (Color online) 3D vocal tract extracted from [ɑ] scan, shown with surface rendering.

FIG. 3. Overlay of 30 mid-sagittal frames of [m, n, ŋ] (rows) dynamically produced in vocalic context [i, ɑ, u] (columns). The critical articulators for each consonant (lips, tongue tip, and tongue dorsum, respectively) achieve occlusion, whereas the tongue body assumes a different target shape in each vocalic context. The velum is lowered in all conditions, allowing the speaker to sustain production of the nasal.

Tables

Table I. Prompt lists for the MRI scanning session. The orthographic prompts are emphasized, with the underlined letter(s) corresponding to the target phone; the target phone itself is given in IPA notation.
