Publication details

Bibliographic description

Convolutive weighted multichannel Wiener filter front-end for distant automatic speech recognition in reverberant multispeaker scenarios / Mieszko FRAŚ, Marcin WITKOWSKI, Konrad KOWALCZYK // In: INTERSPEECH 2022 [Electronic document] : September 18–22, Incheon, Korea. — Windows version. — Text data. — [Seoul : The Acoustical Society of Korea], [2022]. — pp. 2943–2947. — System requirements: Adobe Reader. — Access mode: https://isca-speech.org/archive/pdfs/interspeech_2022/fras22_... [2022-09-03]. — Bibliography: p. 2947, abstract.


Authors (3)

Mieszko Fraś, Marcin Witkowski, Konrad Kowalczyk


Keywords

source separation, speech enhancement, automatic speech recognition, dereverberation, optimum filters

Bibliometric data

BaDAP ID: 141901
Date added to BaDAP: 2022-09-06
DOI: 10.21437/Interspeech.2022-10780
Publication year: 2022
Publication type: conference proceedings (authorship)
Open access: yes
Creative Commons
Conference: INTERSPEECH 2022

Abstract

The performance of automatic speech recognition (ASR) systems strongly deteriorates when the desired speech signal is contaminated with room reverberation and when the speech of interfering speakers overlaps. To achieve acceptable word error rates (WER) by distant ASR in multispeaker reverberant scenarios, source separation and dereverberation can be performed as front-end processing. An existing optimum filter suitable for this task is the recently proposed weighted power minimization distortionless response convolutional beamformer (WPD). In this paper, we introduce a novel speech enhancement front-end for improving the accuracy of back-end ASR in scenarios with multiple reverberant overlapping speakers. The convolutional weighted multichannel Wiener filter (CW-MWF) is optimum for the joint separation and dereverberation task, and it is derived from the convolutional weighted minimum mean square error (CW-MMSE) optimization criterion, presented recently by the current authors. The WER results of the performed experiments indicate superior performance of the CW-MWF in real and simulated rooms, irrespective of the method used for filter parameter estimation and the DNN model used for back-end ASR.

Publications you may be interested in

article
Convolutional weighted parametric multichannel Wiener filter for reverberant source separation / Mieszko FRAŚ, Konrad KOWALCZYK // IEEE Signal Processing Letters ; ISSN 1070-9908. — 2022 — vol. 29, pp. 1928–1932. — Bibliography: p. 1932, abstract. — Publication available online since: 2022-09-01
book chapter
Joint blind source separation and dereverberation for automatic speech recognition using delayed-subsource MNMF with localization prior / Mieszko FRAŚ, Marcin WITKOWSKI, Konrad KOWALCZYK // In: INTERSPEECH 2023 [Electronic document] : Dublin, Ireland, 20-24 August 2023 / [eds.] Naomi Harte, Julie Carson-Berndsen, Gareth Jones. — Windows version. — Text data. — [France : ISCA], [2023]. — pp. 3734–3738. — System requirements: Adobe Reader. — Access mode: https://www.isca-speech.org/archive/pdfs/interspeech_2023/fra... [2023-09-19]. — Bibliography: p. 3738, abstract.