Machine Learning for Spatial Audio Processing (MLSAP)

Project title: Machine Learning for Spatial Audio Processing (MLSAP)
Project type: Research project
Funding: OPUS Program, National Science Centre (NCN)
Duration: 2019 – 2022

Project team members:
dr hab. inż. Konrad Kowalczyk, prof. AGH – principal investigator
mgr inż. Daniel Krause – student member
mgr inż. Mateusz Guzik – PhD student
dr inż. Stanisław Kacprzak – postdoctoral researcher

Project goal
The aim of this project is to study how classical signal processing methods and machine learning techniques can be jointly applied to audio and speech processing, and to propose novel techniques that merge the benefits of both approaches to improve performance in sound event detection and localization, signal extraction, and classification of speech and audio signals. The proposed research should increase the understanding of what is achievable when combining these methods, which are typically used in isolation or, at best, as two consecutive blocks in the processing chain. The project will draw on knowledge from the fields of audio signal processing and speech technology, in particular machine learning as used in speaker recognition and source classification tasks.

Research topics of the project:
– Acoustic scene analysis using a microphone array
– Sound event detection and localization
– Speaker recognition with additional acoustic features
– Acoustic scene classification / acoustic source classification
– Spatial audio acquisition and reproduction over headphones and loudspeakers
– Incorporating machine learning into spatial audio processing

International project partner
University of Tampere, Finland

Publications
[C2] Daniel Krause, Archontis Politis, Konrad Kowalczyk, “Comparison of Convolution Types in CNN-based Feature Extraction for Sound Source Localization”, 28th European Signal Processing Conference (EUSIPCO 2020), ISBN: 978-9-0827-9705-3, pp. 31-35, 2020.
[C1] Daniel Krause, Archontis Politis, Konrad Kowalczyk, “Feature Overview for Joint Modeling of Sound Event Detection and Localization Using a Microphone Array”, 28th European Signal Processing Conference (EUSIPCO 2020), ISBN: 978-9-0827-9705-3, pp. 820-824, 2020.