Special Session on “AI-Guided Signal Processing for Efficient, Controllable, and Interpretable Audio Enhancement”

Tuesday 10/9 – 10:00-12:00

Organizers: Pejman Mowlaee, Jesper Rindom Jensen and Tim Fingscheidt

Short Description: The focus of this special session is to exploit domain expertise to break down audio enhancement problems and to identify meaningful ways of using machine learning in combination with traditional audio signal processing. In addition to enabling the use of smaller and more efficient machine learning models, this combination may be key to bringing back the flexibility, controllability, and interpretability of signal processing approaches while leveraging the robustness of data-driven approaches.

Researchers in the field are invited to submit papers on the following, non-exhaustive list of topics:

  • Combination of optimal filtering and machine-learning-based statistics estimation
  • Data-driven beamformer designs
  • Hybrid methods involving statistical signal processing and machine-learning-based approaches
  • Blind source separation and extraction guided by machine-learned models
  • Enhancement/extraction methods guided by machine learning (e.g., for target selection)

Session Papers

1009: DSP-INFORMED BANDWIDTH EXTENSION USING LOCALLY-CONDITIONED EXCITATION AND LINEAR TIME-VARYING FILTER SUBNETWORKS
Shahan Nercessian, Alexey Lukin, Johannes Imort
1018: DYNAMIC AUDIO-VISUAL SPEECH ENHANCEMENT USING RECURRENT VARIATIONAL AUTOENCODERS
Zohre Foroushi, Richard Dansereau
1029: TINY NEURAL-NETWORK CONTROL OF FREQUENCY-DOMAIN ADAPTIVE FILTERING FOR LINEAR SYSTEM IDENTIFICATION IN ACOUSTIC ECHO CANCELLATION
Svantje Voit, Gerald Enzner
1032: WEAKLY DOA GUIDED SPEAKER SEPARATION WITH RANDOM LOOK DIRECTIONS AND ITERATIVELY REFINED TARGET AND INTERFERENCE PRIORS
Alexander Bohlender, Ann Spriet, Wouter Tirry, Nilesh Madhu
1055: E-URES: EFFICIENT USER-CENTRIC RESIDUAL-ECHO SUPPRESSION FRAMEWORK WITH A DATA-DRIVEN APPROACH TO REDUCING COMPUTATIONAL COSTS
Amir Ivry, Israel Cohen
1073: Informed FastICA: Semi-Blind Minimum Variance Distortionless Beamformer
Zbynek Koldovsky, Jiri Malek, Jaroslav Cmejla, Stephen O’Regan
1116: Learning-based Multi-Channel Speech Presence Probability Estimation Using a Low-Parameter Model and Integration With MVDR Beamforming for Multi-Channel Speech Enhancement
Shuai Tao, Pejman Mowlaee, Jesper Rindom Jensen, Mads Græsbøll Christensen