Voice Identification System

This project is a Voice Identification System written in Python. It uses SpeechBrain with an ECAPA-TDNN model to extract a speaker embedding from a user's audio, then identifies the user by comparing that embedding with the embeddings stored for enrolled users.

Overview

The Voice Identification System recognizes users from their voice data. It uses the SpeechBrain toolkit with an ECAPA-TDNN model to extract speaker embeddings and perform speaker verification. A Streamlit-based web app lets users upload audio files, verify their identity, and view the results in real time.

Project Goals

The main goals of this project are:

- Accurate Speaker Recognition: Build a system that accurately identifies users based on their voice data.
- Efficient Storage and Retrieval: Store and retrieve speaker embeddings efficiently for quick verification.
- User-Friendly Interface: Create an intuitive web application using Streamlit for easy interaction.

Key Technologies

Python

Python serves as the core language for this project, offering flexibility and a rich ecosystem of libraries for machine learning, audio processing, and web development.

SpeechBrain

SpeechBrain is an open-source toolkit for speech processing. This project leverages SpeechBrain for feature extraction and speaker verification tasks.

ECAPA-TDNN

ECAPA-TDNN (Emphasized Channel Attention, Propagation, and Aggregation Time-Delay Neural Network) is a state-of-the-art neural network architecture for speaker recognition. It generates robust and discriminative speaker embeddings, which are used as unique voiceprints for each user.

Streamlit

Streamlit provides the web interface. Through it, users upload audio files, view verification results, and navigate between the sections of the app.

Project Workflow

1. Audio Upload: Users upload or record audio directly in the Streamlit app.
2. Embedding Extraction: SpeechBrain, powered by the ECAPA-TDNN model, extracts speaker embeddings from the uploaded audio file.
3. Database Matching:
   - The extracted embeddings are compared against stored user embeddings in the database.
   - The system retrieves and displays the user’s profile details if a match is found.
4. Verification Results: Verification results and user details are displayed on a dedicated page.
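
The matching step can be sketched in plain Python. This is a minimal illustration, not the project's code: the `identify` function, the `enrolled` dictionary, and the `0.5` threshold are all hypothetical, and a real threshold would need to be tuned on held-out recordings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as no match
    return dot / (norm_a * norm_b)


def identify(probe, enrolled, threshold=0.5):
    """Return (user_id, score) for the closest enrolled speaker.

    `enrolled` maps a user id to that user's stored embedding.
    If the best score is below `threshold` (an assumed value),
    the user id is None, meaning no match was found.
    """
    best_user, best_score = None, -1.0
    for user_id, embedding in enrolled.items():
        score = cosine_similarity(probe, embedding)
        if score > best_score:
            best_user, best_score = user_id, score
    if best_score < threshold:
        return None, best_score
    return best_user, best_score
```

Calling `identify(new_embedding, enrolled)` returns the best-scoring user id, or `None` when no stored embedding is similar enough, which corresponds to the "if a match is found" condition in the workflow.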

Implementation Details

Audio Processing

The system processes audio files uploaded through the Streamlit interface. Each file is normalized and prepared before SpeechBrain extracts an embedding from it.
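
The README does not say which normalization is applied, so the following is only a sketch of one common preparation step: downmixing to mono and peak-normalizing. It assumes 16-bit PCM WAV input and uses only the standard library; the function name `load_normalized_mono` is hypothetical.

```python
import array
import sys
import wave


def load_normalized_mono(source):
    """Read a 16-bit PCM WAV and return (samples, sample_rate).

    `source` is a file path or a binary file-like object.
    Samples are mono floats, peak-normalized to the range [-1, 1].
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    pcm = array.array("h")
    pcm.frombytes(raw)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian

    # Average the interleaved channels into one stream and scale to [-1, 1).
    mono = [
        sum(pcm[i:i + channels]) / (channels * 32768.0)
        for i in range(0, len(pcm), channels)
    ]

    # Peak-normalize so the loudest sample has magnitude 1.0.
    peak = max((abs(s) for s in mono), default=0.0)
    if peak > 0.0:
        mono = [s / peak for s in mono]
    return mono, sample_rate
```

Because Streamlit hands uploaded files to the app as file-like objects, accepting either a path or a binary stream keeps a helper like this usable in both the app and offline scripts.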

Speaker Embeddings

The core of the verification process lies in generating reliable speaker embeddings using the ECAPA-TDNN model. These embeddings serve as unique voiceprints for each user, enabling accurate speaker verification.
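
As a rough sketch of this step, the code below loads a pretrained ECAPA-TDNN encoder through SpeechBrain and turns one audio file into an embedding. It assumes the publicly available `speechbrain/spkrec-ecapa-voxceleb` checkpoint, which expects 16 kHz mono audio and produces 192-dimensional embeddings; the project may use a different checkpoint. The helper names `load_encoder` and `extract_embedding` and the `savedir` path are hypothetical.

```python
def load_encoder(savedir="pretrained_models/spkrec-ecapa-voxceleb"):
    """Load a pretrained ECAPA-TDNN speaker encoder (assumed checkpoint)."""
    try:
        # SpeechBrain 1.0 and later
        from speechbrain.inference.speaker import EncoderClassifier
    except ImportError:
        # Older SpeechBrain releases
        from speechbrain.pretrained import EncoderClassifier
    return EncoderClassifier.from_hparams(
        source="speechbrain/spkrec-ecapa-voxceleb",
        savedir=savedir,
    )


def extract_embedding(encoder, wav_path):
    """Return the speaker embedding for one audio file as a list of floats."""
    import torchaudio

    signal, sample_rate = torchaudio.load(wav_path)  # shape: (channels, samples)
    if signal.shape[0] > 1:
        # Downmix to mono so the file is treated as a single utterance.
        signal = signal.mean(dim=0, keepdim=True)
    if sample_rate != 16000:
        # The assumed checkpoint was trained on 16 kHz audio.
        signal = torchaudio.functional.resample(signal, sample_rate, 16000)
    embedding = encoder.encode_batch(signal)  # shape: (1, 1, embedding_dim)
    return embedding.squeeze().tolist()
```

The imports sit inside the functions so the module can be imported without pulling in the heavy dependencies; loading the encoder once and reusing it across requests avoids reloading the model for every upload.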

Database Storage and Retrieval

Extracted embeddings are stored in a SQLite database as binary blobs.
During verification, embeddings are retrieved and compared to new input embeddings using cosine similarity to confirm the user’s identity.
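
The storage round trip could look something like the sketch below, which packs each embedding as float32 bytes and writes it to a BLOB column. The table name `speakers`, its columns, and the helper names are assumptions for illustration; the project's actual schema is not shown in this README.

```python
import array
import sqlite3


def to_blob(embedding):
    """Pack a sequence of floats as float32 bytes (native byte order)."""
    return array.array("f", embedding).tobytes()


def from_blob(blob):
    """Unpack float32 bytes back into a list of floats."""
    values = array.array("f")
    values.frombytes(blob)
    return values.tolist()


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS speakers ("
        "user_id TEXT PRIMARY KEY, embedding BLOB NOT NULL)"
    )
    return conn


def save_embedding(conn, user_id, embedding):
    """Insert or overwrite the stored embedding for one user."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO speakers (user_id, embedding) VALUES (?, ?)",
            (user_id, to_blob(embedding)),
        )


def load_embeddings(conn):
    """Return every stored embedding as a {user_id: list of floats} dict."""
    rows = conn.execute("SELECT user_id, embedding FROM speakers")
    return {user_id: from_blob(blob) for user_id, blob in rows}
```

Storing float32 keeps each 192-dimensional embedding at 768 bytes, at the cost of a small rounding error relative to Python's 64-bit floats. Native byte order is used here, so a database written on one architecture should be read on one with the same endianness.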

