Publication details

Bibliographic description

Joint blind source separation and dereverberation for automatic speech recognition using delayed-subsource MNMF with localization prior / Mieszko FRAŚ, Marcin WITKOWSKI, Konrad KOWALCZYK // In: INTERSPEECH 2023 [Electronic document] : Dublin, Ireland, 20-24 August 2023 / [eds.] Naomi Harte, Julie Carson-Berndsen, Gareth Jones. - Windows version. - Text data. - [France : ISCA], [2023]. - pp. 3734–3738. - System requirements: Adobe Reader. - Access mode: https://www.isca-speech.org/archive/pdfs/interspeech_2023/fra... [2023-09-19]. - Bibliogr. p. 3738, Abstr.

Authors (3)

Keywords

source separation, automatic speech recognition, dereverberation, non negative matrix factorization

Bibliometric data

BaDAP ID	148719
Date added to BaDAP	2023-09-20
DOI	10.21437/Interspeech.2023-2520
Publication year	2023
Publication type	conference proceedings (author)
Open access
Conference	INTERSPEECH 2023

Abstract

Overlapping speech and high room reverberation deteriorate the accuracy of automatic speech recognition (ASR). This paper proposes a method for jointly optimal source separation and dereverberation using delayed subsource multichannel nonnegative matrix factorization (MNMF). We formulate a subsource-based signal model that accounts for late room reverberation using time-delayed microphone signals from several past time frames. We then propose a maximum a posteriori (MAP) estimator based on MNMF with a localization prior on the mixing matrix, suitable for estimating the direct-path and reverberant signal components. Finally, two algorithms are derived, namely the original and simplified delayed subsource MNMF, which are shown to outperform many state-of-the-art approaches. The results of experimental evaluations, performed using real and simulated data, indicate superior performance of the proposed processing in terms of the word error rate (WER) as well as the signal-to-distortion ratio (SDR).
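
The abstract mentions modelling late reverberation with time-delayed microphone signals from several past time frames. As a rough illustration of that one idea only, the sketch below stacks delayed copies of a multichannel STFT. It is not the paper's algorithm: the function name `stack_delayed_frames`, the parameters `num_delays` and `min_delay`, and the array layout (channels, frequencies, frames) are all assumptions made for this example, and the MNMF estimation itself is not shown.

```python
import numpy as np


def stack_delayed_frames(X, num_delays, min_delay=1):
    """Stack time-delayed copies of a multichannel STFT (hypothetical helper).

    X          : complex array, shape (channels, freqs, frames)
    num_delays : number of past frames to include (assumed parameter)
    min_delay  : smallest delay in frames (assumed parameter)

    Returns an array of shape (num_delays, channels, freqs, frames) with
    out[d, :, :, t] = X[:, :, t - (min_delay + d)], and zeros wherever the
    delayed frame index would fall before the start of the signal.
    """
    C, F, T = X.shape
    out = np.zeros((num_delays, C, F, T), dtype=X.dtype)
    for d in range(num_delays):
        delay = min_delay + d
        if delay < T:
            # Frame t of the delayed copy holds frame t - delay of X.
            out[d, :, :, delay:] = X[:, :, :T - delay]
    return out


# Toy input: 2 microphones, 1 frequency bin, 4 time frames.
X = np.arange(8).reshape(2, 1, 4).astype(complex)
D = stack_delayed_frames(X, num_delays=2)
```

A dereverberation front end would typically combine such delayed frames with the current frame in its signal model. How the paper weights or factorizes them is described in the publication itself, not here.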

Publications that may interest you

book chapter

Convolutive weighted multichannel Wiener filter front-end for distant automatic speech recognition in reverberant multispeaker scenarios / Mieszko FRAŚ, Marcin WITKOWSKI, Konrad KOWALCZYK // In: INTERSPEECH 2022 [Electronic document] : September 18–22, Incheon, Korea. - Windows version. - Text data. - [Seoul : The Acoustical Society of Korea], [2022]. - pp. 2943–2947. - System requirements: Adobe Reader. - Access mode: https://isca-speech.org/archive/pdfs/interspeech_2022/fras22_... [2022-09-03]. - Bibliogr. p. 2947, Abstr.

book chapter

Combating reverberation in NTF-based speech separation using a sub-source weighted multichannel Wiener filter and linear prediction / Mieszko FRAŚ, Marcin WITKOWSKI, Konrad KOWALCZYK // In: INTERSPEECH 2021 [Electronic document] : proceedings of the 22nd annual conference of the International Speech Communication Association : 30 August–3 September 2021, Brno, Czechia. - Brno : Brno University of Technology, 2021. - (Interspeech : Proceedings of the ... Annual Conference of the International Speech Communication Association ; ISSN 1990-9772). - e-ISBN: 978-171383690-2. - pp. 3895–3899. - System requirements: Adobe Reader. - Access mode: https://www.isca-speech.org/archive/pdfs/interspeech_2021/fra... [2021-09-21]. - Bibliogr. p. 3899. Abstr. - Page range in the Scopus database: 2403-2407
