Special Session on “AI-Guided Signal Processing for Efficient, Controllable, and Interpretable Audio Enhancement”
Tuesday 10/9 – 10:00-12:00
Organizers: Pejman Mowlaee, Jesper Rindom Jensen and Tim Fingscheidt
Short Description: The focus of this special session is to exploit domain expertise to break down audio enhancement problems and to identify meaningful ways of using machine learning in combination with traditional audio signal processing. In addition to enabling the use of smaller and more efficient machine learning models, this combination may be key to bringing back the flexibility, controllability, and interpretability of signal processing approaches while leveraging the robustness of data-driven approaches.
Researchers in the field are invited to submit papers on the following, non-exhaustive list of topics:
- Combination of optimal filtering and machine-learning-based statistics estimation
- Data-driven beamformer designs
- Hybrid methods involving statistical signal processing and machine-learning-based approaches
- Blind source separation and extraction guided by machine-learned models
- Enhancement/extraction methods guided by machine learning (e.g., for target selection)
Session Papers
1009: DSP-INFORMED BANDWIDTH EXTENSION USING LOCALLY-CONDITIONED EXCITATION AND LINEAR TIME-VARYING FILTER SUBNETWORKS
Shahan Nercessian, Alexey Lukin, Johannes Imort
1018: DYNAMIC AUDIO-VISUAL SPEECH ENHANCEMENT USING RECURRENT VARIATIONAL AUTOENCODERS
Zohre Foroushi, Richard Dansereau
1029: TINY NEURAL-NETWORK CONTROL OF FREQUENCY-DOMAIN ADAPTIVE FILTERING FOR LINEAR SYSTEM IDENTIFICATION IN ACOUSTIC ECHO CANCELLATION
Svantje Voit, Gerald Enzner
1032: WEAKLY DOA GUIDED SPEAKER SEPARATION WITH RANDOM LOOK DIRECTIONS AND ITERATIVELY REFINED TARGET AND INTERFERENCE PRIORS
Alexander Bohlender, Ann Spriet, Wouter Tirry, Nilesh Madhu
1055: E-URES: EFFICIENT USER-CENTRIC RESIDUAL-ECHO SUPPRESSION FRAMEWORK WITH A DATA-DRIVEN APPROACH TO REDUCING COMPUTATIONAL COSTS
Amir Ivry, Israel Cohen
1073: Informed FastICA: Semi-Blind Minimum Variance Distortionless Beamformer
Zbynek Koldovsky, Jiri Malek, Jaroslav Cmejla, Stephen O’Regan
1116: Learning-based Multi-Channel Speech Presence Probability Estimation Using a Low-Parameter Model and Integration With MVDR Beamforming for Multi-Channel Speech Enhancement
Shuai Tao, Pejman Mowlaee, Jesper Rindom Jensen, Mads Græsbøll Christensen |