Publication details
Bibliographic description
Maximum a posteriori estimator for convolutive sound source separation with sub-source based NTF model and the localization probabilistic prior on the mixing matrix / Mieszko FRAŚ, Konrad KOWALCZYK // In: ICASSP 2021 [Electronic document] : 2021 IEEE International Conference on Acoustics, Speech and Signal Processing : June 6–11, 2021 virtual conference, Toronto, Ontario, Canada : proceedings. - Windows version. - Text data. - Piscataway : The Institute of Electrical and Electronics Engineers, cop. 2021. - (Proceedings of the ... IEEE International Conference on Acoustics, Speech, and Signal Processing ; ISSN 1520-6149). - e-ISBN: 978-1-7281-7605-5. - pp. 526–530. - System requirements: Adobe Reader. - Bibliography p. 530, abstract. - Publication available online since: 2021-05-13
Authors (2)
Keywords
Bibliometric data
BaDAP ID | 136320 |
---|---|
Date added to BaDAP | 2021-09-22 |
Source text | URL |
DOI | 10.1109/ICASSP39728.2021.9413863 |
Publication year | 2021 |
Publication type | conference proceedings (author) |
Open access | |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Conference | 2021 IEEE International Conference on Acoustics, Speech and Signal Processing |
Journal/series | Proceedings of the ... IEEE International Conference on Acoustics, Speech, and Signal Processing |
Abstract
In this paper we present a method for the separation of sound source signals recorded using multiple microphones in a reverberant room. In particular, we propose a maximum a posteriori (MAP) estimator based on the multichannel nonnegative tensor factorization (NTF) model with the localization prior distribution on the mixing matrix, in which the latent data consists of the so-called sub-sources for improved performance in a reverberant environment. For the proposed MAP estimator, we derive the sub-source based expectation maximization (EM) algorithm with the multiplicative update (MU) rules and the localization prior (LP) distribution on the mixing matrix (SSEM-MU-LP). We then perform several experiments for speech and instrumental sound sources recorded using two microphones, in determined and under-determined scenarios, and with different types of initialization of the model parameters. The results of these experiments clearly indicate that the proposed algorithm with the localization prior significantly outperforms the state-of-the-art NTF-based source separation algorithms, with an improvement of up to 50% in the signal-to-distortion ratio.
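
For readers unfamiliar with the multiplicative update (MU) rules mentioned in the abstract, the following is a minimal sketch of the general idea on a much simpler problem: single-channel nonnegative matrix factorization with the Kullback-Leibler divergence. It is not the SSEM-MU-LP algorithm of the paper; it has no multichannel mixing matrix, no sub-sources, no EM step and no localization prior. The function names `nmf_kl_mu` and `kl_divergence`, the rank, the iteration count and the random test matrix are all assumptions made for illustration.

```python
import numpy as np


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence between V and its model V_hat."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def nmf_kl_mu(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Factorize a nonnegative matrix V (F x N) as W @ H.

    Uses the classic multiplicative updates for the KL divergence.
    Illustrative stand-in only, not the estimator from the paper.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    history = []
    for _ in range(n_iter):
        # Update the activations H with W held fixed.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update the basis W with the new H held fixed.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        history.append(kl_divergence(V, W @ H, eps))
    return W, H, history


if __name__ == "__main__":
    # Hypothetical test data: an exactly rank-3 nonnegative matrix.
    rng = np.random.default_rng(1)
    V = rng.random((20, 3)) @ rng.random((3, 30))
    W, H, history = nmf_kl_mu(V, rank=3)
    print("first divergence:", history[0])
    print("last divergence: ", history[-1])
```

Because each update multiplies the current factor by a nonnegative ratio, the factors stay nonnegative without any explicit projection, which is the property that makes this family of rules attractive for NTF-style models.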