Publication details

Bibliographic description

Joint blind source separation and dereverberation for automatic speech recognition using delayed-subsource MNMF with localization prior / Mieszko FRAŚ, Marcin WITKOWSKI, Konrad KOWALCZYK // In: INTERSPEECH 2023 [Electronic document] : Dublin, Ireland, 20-24 August 2023 / [eds.] Naomi Harte, Julie Carson-Berndsen, Gareth Jones. — Windows version. — Text data. — [France : ISCA], [2023]. — pp. 3734–3738. — System requirements: Adobe Reader. — Access mode: https://www.isca-speech.org/archive/pdfs/interspeech_2023/fra... [2023-09-19]. — Bibliography p. 3738, abstract.


Authors (3)

Mieszko Fraś, Marcin Witkowski, Konrad Kowalczyk

Keywords

source separation, automatic speech recognition, dereverberation, non negative matrix factorization

Bibliometric data

BaDAP ID: 148719
Date added to BaDAP: 2023-09-20
DOI: 10.21437/Interspeech.2023-2520
Publication year: 2023
Publication type: conference proceedings (author)
Open access: yes
Conference: INTERSPEECH 2023

Abstract

Overlapping speech and high room reverberation deteriorate the accuracy of automatic speech recognition (ASR). This paper proposes a method for jointly optimal source separation and dereverberation using delayed subsource multichannel nonnegative matrix factorization (MNMF). We formulate a subsource-based signal model that accounts for late room reverberation using time-delayed microphone signals from several past time frames. We then propose a maximum a posteriori (MAP) estimator based on MNMF with a localization prior on the mixing matrix, suitable for estimating direct-path and reverberant signal components. Finally, two algorithms are derived, namely the original and simplified delayed subsource MNMF, which are shown to outperform many state-of-the-art approaches. The results of experimental evaluations, performed using real and simulated data, indicate superior performance of the proposed processing in terms of both word error rate (WER) and signal-to-distortion ratio (SDR).

Publications that may interest you

book chapter
Convolutive weighted multichannel Wiener filter front-end for distant automatic speech recognition in reverberant multispeaker scenarios / Mieszko FRAŚ, Marcin WITKOWSKI, Konrad KOWALCZYK // In: INTERSPEECH 2022 [Electronic document] : September 18–22, Incheon, Korea. — Windows version. — Text data. — [Seoul : The Acoustical Society of Korea], [2022]. — pp. 2943–2947. — System requirements: Adobe Reader. — Access mode: https://isca-speech.org/archive/pdfs/interspeech_2022/fras22_... [2022-09-03]. — Bibliography p. 2947, abstract.
book chapter
Combating reverberation in NTF-based speech separation using a sub-source weighted multichannel Wiener filter and linear prediction / Mieszko FRAŚ, Marcin WITKOWSKI, Konrad KOWALCZYK // In: INTERSPEECH 2021 [Electronic document] : proceedings of the 22nd annual conference of the International Speech Communication Association : 30 August–3 September 2021, Brno, Czechia. — Brno : Brno University of Technology, 2021. — (Interspeech : Proceedings of the ... Annual Conference of the International Speech Communication Association ; ISSN 1990-9772). — e-ISBN: 978-171383690-2. — pp. 3895–3899. — System requirements: Adobe Reader. — Access mode: https://www.isca-speech.org/archive/pdfs/interspeech_2021/fra... [2021-09-21]. — Bibliography p. 3899, abstract. — Page range in the Scopus database: 2403-2407