Speaker Verification

Task

Given an input audio clip and a reference audio clip of a known speaker, the model’s objective is to decide whether both clips were spoken by the same individual. This typically involves extracting a speaker embedding from each clip with a deep neural network trained on a large dataset of voices. The two embeddings are then compared using cosine similarity or a learned distance metric, and the resulting score is checked against a decision threshold. This task is crucial in applications requiring secure access control, such as biometric authentication systems, where a person’s voice acts as a unique identifier.
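
The comparison step can be illustrated with a minimal sketch in plain Python. It is not the Vibravox implementation: the function names, the toy 3-dimensional embeddings, and the threshold of 0.5 are all assumptions made for illustration. In practice the embeddings would come from a trained speaker encoder and the threshold would be tuned on held-out trials.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(
    test_emb: Sequence[float],
    ref_emb: Sequence[float],
    threshold: float = 0.5,  # hypothetical value, not taken from the paper
) -> bool:
    """Accept the trial when the similarity score reaches the threshold."""
    return cosine_similarity(test_emb, ref_emb) >= threshold


# Toy embeddings (made-up numbers, not model output).
reference = [0.9, 0.1, 0.3]
genuine = [0.8, 0.2, 0.35]
impostor = [-0.2, 0.9, -0.4]

print(same_speaker(genuine, reference))
print(same_speaker(impostor, reference))
```

Raising the threshold makes the system reject more impostors at the cost of rejecting more genuine speakers, so the value is usually chosen to balance these two error types for the target application.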

Please refer to the Vibravox paper for more information.

Testing code

The testing code for our model is available at: https://github.com/jhauret/vibravox