Speech Enhancement

Task

This task focuses on denoising and bandwidth extension (also known as audio super-resolution), both of which are needed to improve the quality of speech captured by body-conduction sensors. The model is presented with a pair of audio clips (one captured by a body-conduction sensor, the other being the corresponding clean, full-bandwidth airborne recording) and is asked to enhance the body-conducted clip by denoising it and regenerating mid and high frequencies from low-frequency content only.
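
To make the input/target relationship concrete, here is a toy sketch of what "regenerating mid and high frequencies from low-frequency content only" means. It is not the EBEN model and does not use Vibravox data. It assumes, purely for illustration, that a body-conducted capture can be approximated by an ideal low-pass version of the airborne reference; real body-conducted speech also contains noise and sensor-specific coloration. The sample rate, frame length, cutoff, and test tones are all hypothetical values chosen so that the tones fall on exact DFT bins.

```python
import cmath
import math

SAMPLE_RATE = 16_000  # Hz, assumed for this illustration
FRAME_LEN = 320       # 20 ms frame, so DFT bins are 50 Hz apart
CUTOFF_HZ = 1_000     # hypothetical bandwidth of the simulated sensor


def dft(signal):
    """Naive discrete Fourier transform (fine for a short toy frame)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def simulate_band_limited(signal, cutoff_hz):
    """Zero every DFT bin at or above cutoff_hz (ideal low-pass, an assumption)."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        freq = min(k, n - k) * SAMPLE_RATE / n  # frequency of bin k, mirrored
        if freq >= cutoff_hz:
            spectrum[k] = 0
    return idft(spectrum)


def band_energy(signal, low_hz, high_hz):
    """Sum of squared DFT magnitudes for bins with low_hz <= freq < high_hz."""
    n = len(signal)
    spectrum = dft(signal)
    total = 0.0
    for k in range(n // 2 + 1):
        freq = k * SAMPLE_RATE / n
        if low_hz <= freq < high_hz:
            total += abs(spectrum[k]) ** 2
    return total


# Stand-in for the airborne reference: one low tone plus one high tone.
reference = [
    math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 3000 * t / SAMPLE_RATE)
    for t in range(FRAME_LEN)
]
# Stand-in for the body-conducted input: same signal, high band removed.
degraded = simulate_band_limited(reference, CUTOFF_HZ)

nyquist = SAMPLE_RATE / 2
ref_low = band_energy(reference, 0, CUTOFF_HZ)
deg_low = band_energy(degraded, 0, CUTOFF_HZ)
ref_high = band_energy(reference, CUTOFF_HZ, nyquist + 1)
deg_high = band_energy(degraded, CUTOFF_HZ, nyquist + 1)

print("low-band energy  (reference, input):", ref_low, deg_low)
print("high-band energy (reference, input):", ref_high, deg_high)
```

In this toy setting the low band is shared by both signals while the high band exists only in the reference. An enhancement model has to infer that missing band from the low-frequency content, in addition to removing noise.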

Please refer to the Vibravox paper for more information.

Pre-trained models on HuggingFace

The model card for our EBEN models is available here: https://huggingface.co/Cnam-LMSSC/vibravox_EBEN_models

Training code

The training code for our models is available here: https://github.com/jhauret/vibravox

Audio Samples

[Audio sample table. Columns (sensors): Forehead, In-ear Rigid, In-ear Soft, Temple, Throat. Rows: Input, Enhanced by EBEN, Reference audio.]

Vibravox enhanced by EBEN

Explore the full test set enhanced by the EBEN models: