Personalized Speech Enhancement: New Models and Comprehensive Evaluation
We present audio samples from our personalized speech enhancement (PSE) models and an unconditional speech enhancement model (DCCRN). Personalized SE models can remove interfering speakers in addition to background noise.
We propose two causal models that can run in real time: personalized DCCRN (pDCCRN) and personalized deep convolutional attention U-Net (pDCATTUNET). Please see our paper for detailed information.
Synthetic Data
These synthetic overlapped-speech samples contain clean speech from the VCTK corpus.
No processing
Clean reference
DCCRN
pDCCRN
pDCATTUNET
Real Data
The following are real-world recordings, so no clean reference audio is available for them.