AI-guided signal processing for efficient, controllable, and interpretable audio enhancement – International Workshop on Acoustic Enhancement 2024

Special Session on “AI-Guided Signal Processing for Efficient, Controllable, and Interpretable Audio Enhancement”

Tuesday 10/9 – 10:00-12:00

Organizers: Pejman Mowlaee, Jesper Rindom Jensen and Tim Fingscheidt

Short Description: The focus of this special session is to exploit domain expertise to break down audio enhancement problems and identify meaningful ways of using machine learning in combination with traditional audio signal processing. In addition to enabling the use of smaller and more efficient machine learning models, this combination may be key to bringing back the flexibility, controllability, and interpretability of signal processing approaches while leveraging the robustness of data-driven approaches.

Researchers in the field are invited to submit papers on topics including, but not limited to, the following:

Combination of optimal filtering and machine-learning-based statistics estimation
Data-driven beamformer designs
Hybrid methods involving statistical signal processing and machine-learning-based approaches
Blind source separation and extraction guided by machine-learned models
Enhancement/extraction methods guided by machine learning (e.g., for target selection)

Session Papers

1009: DSP-INFORMED BANDWIDTH EXTENSION USING LOCALLY-CONDITIONED EXCITATION AND LINEAR TIME-VARYING FILTER SUBNETWORKS

Shahan Nercessian, Alexey Lukin, Johannes Imort

1018: DYNAMIC AUDIO-VISUAL SPEECH ENHANCEMENT USING RECURRENT VARIATIONAL AUTOENCODERS

Zohre Foroushi, Richard Dansereau

1029: TINY NEURAL-NETWORK CONTROL OF FREQUENCY-DOMAIN ADAPTIVE FILTERING FOR LINEAR SYSTEM IDENTIFICATION IN ACOUSTIC ECHO CANCELLATION

Svantje Voit, Gerald Enzner

1032: WEAKLY DOA GUIDED SPEAKER SEPARATION WITH RANDOM LOOK DIRECTIONS AND ITERATIVELY REFINED TARGET AND INTERFERENCE PRIORS

Alexander Bohlender, Ann Spriet, Wouter Tirry, Nilesh Madhu

1055: E-URES: EFFICIENT USER-CENTRIC RESIDUAL-ECHO SUPPRESSION FRAMEWORK WITH A DATA-DRIVEN APPROACH TO REDUCING COMPUTATIONAL COSTS

Amir Ivry, Israel Cohen

1073: Informed FastICA: Semi-Blind Minimum Variance Distortionless Beamformer

Zbynek Koldovsky, Jiri Malek, Jaroslav Cmejla, Stephen O’Regan

1116: Learning-based Multi-Channel Speech Presence Probability Estimation Using a Low-Parameter Model and Integration With MVDR Beamforming for Multi-Channel Speech Enhancement

Shuai Tao, Pejman Mowlaee, Jesper Rindom Jensen, Mads Græsbøll Christensen