Solving permutations in frequency-domain for blind separation of an arbitrary number of speech sources

J. Acoust. Soc. Am. Volume 131, Issue 2, pp. EL139–EL144 (2012) (6 pages)

Iván Durán-Díaz, Auxiliadora Sarmiento, Sergio Cruces, and Pablo Aguilera

Signal Theory and Communications Department, University of Seville, Camino de los Descubrimientos S/N, 41092 Seville, Spain; duran@us.es, sarmiento@us.es, sergio@us.es, paguilera@us.es

Blind separation of speech sources in reverberant environments is usually performed in the time-frequency domain, which gives rise to the permutation problem: the estimated sources may come out in a different order at each frequency component. A two-stage method to solve permutations with an arbitrary number of sources is proposed. The suggested procedure is based on the spectral consistency of the sources. In the first stage, frequency bins are compared with each other; in the second stage, the neighboring frequencies are emphasized. Experiments for perfect separation situations and for live recordings show that the proposed method improves on the results of existing approaches.
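
As a rough illustration of the permutation problem described above, the following Python sketch simulates a perfect-separation situation: each frequency bin holds the amplitude envelopes of the sources in a random order, and the order is then recovered by correlating envelopes across bins. It is not the authors' algorithm. The function names (`align_bins`, `best_order`, `simulate`, `remaining_permutations`), the synthetic on/off envelopes, and the exhaustive search over orderings are all assumptions made for this example. The two loops in `align_bins` only loosely mirror the idea of a global comparison followed by a neighbor-focused pass.

```python
import itertools
import random


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5 if var_a > 0 and var_b > 0 else 0.0


def best_order(envelopes, reference):
    """Ordering p that maximises sum_i corr(envelopes[p[i]], reference[i]).

    Exhaustive over all N! orderings, so only practical for small N.
    """
    n = len(reference)
    return max(
        itertools.permutations(range(n)),
        key=lambda p: sum(correlation(envelopes[p[i]], reference[i]) for i in range(n)),
    )


def mean_envelopes(bin_list):
    """Element-wise average of the envelopes of several already-aligned bins."""
    n, frames = len(bin_list[0]), len(bin_list[0][0])
    return [
        [sum(b[i][t] for b in bin_list) / len(bin_list) for t in range(frames)]
        for i in range(n)
    ]


def align_bins(bins):
    """Return orders so that bins[k][orders[k][i]] is the same source for every bin k."""
    n = len(bins[0])
    orders = [tuple(range(n))]
    aligned = [list(bins[0])]
    # Pass 1 (assumed): match each bin against the average of all bins aligned so far.
    for b in bins[1:]:
        p = best_order(b, mean_envelopes(aligned))
        orders.append(p)
        aligned.append([b[j] for j in p])
    # Pass 2 (assumed): re-check each bin against its immediate neighbours only.
    for k in range(1, len(bins)):
        neighbours = aligned[k - 1:k] + aligned[k + 1:k + 2]
        p = best_order(bins[k], mean_envelopes(neighbours))
        orders[k] = p
        aligned[k] = [bins[k][j] for j in p]
    return orders


def simulate(n_sources=3, n_bins=32, n_frames=120, seed=0):
    """Synthetic envelopes: bins[k][j] is the envelope of source truth[k][j] at bin k."""
    rng = random.Random(seed)
    # Hypothetical on/off activity pattern with a different period per source.
    activity = [
        [1.0 if (t // (i + 2)) % 2 == 0 else 0.0 for t in range(n_frames)]
        for i in range(n_sources)
    ]
    bins, truth = [], []
    for _ in range(n_bins):
        perm = list(range(n_sources))
        rng.shuffle(perm)  # random permutation applied to this bin
        rows = []
        for src in perm:
            gain = rng.uniform(0.5, 2.0)  # bin-dependent scaling of the source
            rows.append([gain * a + rng.uniform(-0.05, 0.05) for a in activity[src]])
        truth.append(perm)
        bins.append(rows)
    return bins, truth


def remaining_permutations(orders, truth):
    """Number of bins whose source order differs from that of the first bin."""
    labels = [tuple(perm[j] for j in order) for order, perm in zip(orders, truth)]
    return sum(1 for lab in labels if lab != labels[0])


if __name__ == "__main__":
    bins, truth = simulate()
    orders = align_bins(bins)
    print("remaining permutations:", remaining_permutations(orders, truth))
```

The error count is taken relative to the first bin because a single global reordering of the sources is an inherent ambiguity of blind separation and is not counted as an error here. The exhaustive search in `best_order` grows as N!, so a real implementation for many sources would need a cheaper assignment step.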

© 2012 Acoustical Society of America

Acknowledgments

This work was supported by the Ministry of Science and Innovation of Spain through Project No. TEC2011-23559. We thank Emmanuel Vincent for his collaboration in the evaluation of the results.

Article Outline

  1. Introduction
  2. Signal model and notation
  3. Proposed method
    1. Improvement of the results
    2. Summary of the proposed algorithm
    3. Computational cost
  4. Simulations
    1. Performance in a perfect separation situation
    2. Performance for live recording
  5. Conclusions

KEYWORDS and PACS

PACS

  • 43.60.Pt

    Signal processing techniques for acoustic inverse problems

  • 43.60.Gk

    Space-time signal processing, other than matched field processing

  • 43.60.Np

    Acoustic signal processing techniques for neural nets and learning systems

ARTICLE DATA

History
Received 04 Oct 2011
Accepted 16 Dec 2011
Published online 23 Jan 2012

PUBLICATION DATA

ISSN

0001-4966 (print)  

  1. M. S. Pedersen, J. Larsen, U. Kjems, and L. C. Parra, “A survey of convolutive blind source separation methods,” in Springer Handbook of Speech Processing (Springer, Berlin, 2008).
  2. J. Anemüller and B. Kollmeier, “Amplitude modulation decorrelation for convolutive blind source separation,” in Proceedings of the Second International Workshop on Independent Component Analysis and Blind Signal Separation (June, 2000), pp. 215–220.
  3. D. T. Pham, C. Servière, and H. Boumaraf, “Blind separation of convolutive audio mixtures using nonstationarity,” in Proceedings of ICA 2003 Conference, Nara, Japan (April, 2003).
  4. L. Rabiner and B. H. Juang, Fundamentals of Speech Recognition (Prentice Hall, Englewood Cliffs, NJ, 1993).
  5. S. Cruces, A. Cichocki, and S.-i. Amari, “From blind signal extraction to blind instantaneous signal separation,” IEEE Trans. Neural Networks 15(4), 859–873 (2004).
  6. P. Tichavsky and Z. Koldovsky, “Optimal pairing of signal components separated by blind techniques,” IEEE Signal Process. Lett. 11(2), 119–122 (2004).
  7. A. Sarmiento, I. Durán-Díaz, and S. Cruces, “Initialization method for speech separation algorithms that work in the time frequency domain,” J. Acoust. Soc. Am. 127(4), 121–126 (2010).
  8. K. Rahbar and J. P. Reilly, “A frequency domain method for blind source separation of convolutive audio mixtures,” IEEE Trans. Speech Audio Process. 13(5), 832–844 (2005).

FIG. 1. (Color online) Performance of the proposed algorithm in a situation of perfect separation when the number of sources is N = 3,…,7. The number of remaining permutations per simulation is shown for different cases.

Table I. Results for a perfect separation situation with N = 2 sources. Thirty simulations were run by applying a randomly selected permutation matrix to each frequency bin. The proposed algorithm was compared with two others (Refs. 3 and 8). The average number of remaining permutations (errors) per simulation and the number of simulations with at least one remaining permutation are shown.
