Publication details

Bibliographic description

An overview of machine learning and other data-based methods for spatial audio capture, processing, and reproduction / Maximo Cobos, Jens Ahrens, Konrad KOWALCZYK, Archontis Politis // EURASIP Journal on Audio, Speech and Music Processing ; ISSN 1687-4714. — 2022 — vol. 2022, iss. 1, art. no. 10, pp. 1–21. — Bibliography pp. 17–21, abstract. — Available online since: 2022-05-16

Authors (4)

Cobos Maximo
Ahrens Jens
Kowalczyk Konrad (AGH)
Politis Archontis

Keywords

machine learning, deep learning, audio coding, virtual reality, array processing, binaural audio, ambisonics, spatial audio, scene analysis

Bibliometric data

BaDAP ID	140197
Date added to BaDAP	2022-05-27
Source text	URL
DOI	10.1186/s13636-022-00242-x
Publication year	2022
Publication type	journal article
Open access
Creative Commons
Journal/series	EURASIP Journal on Audio, Speech and Music Processing

Abstract

The domain of spatial audio comprises methods for capturing, processing, and reproducing audio content that contains spatial information. Data-based methods are those that operate directly on the spatial information carried by audio signals. This is in contrast to model-based methods, which impose spatial information from, for example, metadata like the intended position of a source onto signals that are otherwise free of spatial information. Signal processing has traditionally been at the core of spatial audio systems, and it continues to play a very important role. The emergence of deep learning in many closely related fields has put the focus on the potential of learning-based approaches for the development of data-based spatial audio applications. This article reviews the most important application domains of data-based spatial audio, including well-established methods that employ conventional signal processing, while paying special attention to the most recent achievements that make use of machine learning. Our review is organized based on the topology of the spatial audio pipeline, which consists of capture, processing/manipulation, and reproduction. The literature on the three stages of the pipeline is discussed, as well as the literature on the spatial audio representations that are used to transmit the content between them, highlighting the key references and elaborating on the underlying concepts. We reflect on the literature by juxtaposing the prerequisites that made machine learning successful in domains other than spatial audio with those that are found in the domain of spatial audio as of today. Based on this, we identify routes that may facilitate future advancement.

Publications that may interest you

article

Data-based spatial audio processing / Maximo Cobos, Jens Ahrens, Konrad KOWALCZYK, Archontis Politis // EURASIP Journal on Audio, Speech and Music Processing ; ISSN 1687-4714. — 2022 — vol. 2022, iss. 1, art. no. 13, pp. 1–3. — Available online since: 2022-06-08
