ARoyyanF/deepface-models

Notebook explanation

1. The Core Task: Face Verification

The notebook evaluates face verification, which answers the question: "Do these two images show the same person?" Evaluating this requires pairs of images whose correct answer (the ground truth) is known in advance.

The notebook uses the "Asian Celebrity" dataset, which is structured with one folder for each person. From this, two types of test pairs are dynamically created:

Positive Pairs (Match): Two different images are randomly selected from the same person's folder. The ground truth label for this pair is 1 (or "Same").

Negative Pairs (No-Match): Two images are randomly selected from two different people's folders. The ground truth label for this pair is 0 (or "Different").

In this notebook, 2,000 such pairs (1,000 positive and 1,000 negative) were generated to form a balanced test set.
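The pair generation described above can be sketched as follows. This is a minimal illustration and not the notebook's actual code: the function name `build_pairs`, its parameters, and the fixed seed are assumptions, and the only thing taken from the description is the one-folder-per-person layout.

```python
import os
import random


def build_pairs(dataset_dir, n_pos=1000, n_neg=1000, seed=0):
    """Return a shuffled list of (image_a, image_b, label) tuples.

    Hypothetical helper: label 1 means "same person", 0 means "different".
    """
    rng = random.Random(seed)  # fixed seed so the test set is reproducible

    # Map each person (sub-folder name) to the image paths it contains.
    people = {}
    for name in sorted(os.listdir(dataset_dir)):
        folder = os.path.join(dataset_dir, name)
        if os.path.isdir(folder):
            files = sorted(os.path.join(folder, f) for f in os.listdir(folder))
            if files:
                people[name] = files

    names = list(people)
    # A positive pair needs a person with at least two distinct images.
    multi = [n for n in names if len(people[n]) >= 2]
    if not multi or len(names) < 2:
        raise ValueError("need >= 2 people and one person with >= 2 images")

    pairs = []
    for _ in range(n_pos):
        img_a, img_b = rng.sample(people[rng.choice(multi)], 2)
        pairs.append((img_a, img_b, 1))
    for _ in range(n_neg):
        person_a, person_b = rng.sample(names, 2)  # two different folders
        pairs.append((rng.choice(people[person_a]),
                      rng.choice(people[person_b]), 0))

    rng.shuffle(pairs)
    return pairs
```

Pairs are drawn with replacement in this sketch, so the same pair can occur more than once; deduplicating would be a small extension if that matters for the evaluation.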

2. The Evaluation Process

For every model being tested (such as VGG-Face, ArcFace, and Facenet), the notebook performs the following evaluation loop:

Iterate Through Pairs: The model processes each of the 2,000 image pairs one by one.

Verification Function: For each pair, the notebook calls the DeepFace.verify() function. It computes an embedding for the face in each image, measures the distance between the two embeddings (cosine distance by default, where a smaller value means more similar faces), and compares that distance with a threshold pre-defined for the specific model and distance metric. The returned verified field is True when the distance falls within the threshold and False otherwise. This True/False output is the model's prediction.

Measure Speed: The time it takes for the model to process each pair is recorded. This is later averaged to determine the model's processing speed.
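The loop above might look roughly like the sketch below. `DeepFace.verify()` and its `img1_path`, `img2_path`, `model_name`, and `distance_metric` parameters are part of the DeepFace library; the wrapper `evaluate_model`, the injectable `verify_fn` argument, and the choice to count an undetectable face as a "no match" are assumptions made for illustration and may differ from what the notebook does.

```python
import time


def evaluate_model(pairs, model_name, verify_fn=None):
    """Run one model over (img_a, img_b, label) pairs (hypothetical helper)."""
    if verify_fn is None:
        # Imported lazily so the loop can be tried with a stub verifier.
        from deepface import DeepFace
        verify_fn = DeepFace.verify

    y_true, y_pred, seconds = [], [], []
    for img_a, img_b, label in pairs:
        start = time.perf_counter()
        try:
            result = verify_fn(
                img1_path=img_a,
                img2_path=img_b,
                model_name=model_name,
                distance_metric="cosine",
            )
            prediction = int(result["verified"])
        except ValueError:
            # Assumption: a pair where no face is detected counts as
            # "different" rather than aborting the whole run.
            prediction = 0
        seconds.append(time.perf_counter() - start)
        y_true.append(label)
        y_pred.append(prediction)

    avg = sum(seconds) / len(seconds) if seconds else 0.0
    return {"y_true": y_true, "y_pred": y_pred, "avg_seconds_per_pair": avg}
```

Timing with `time.perf_counter()` includes face detection and model inference for both images. The first call for a model typically also includes loading its weights, so it can be noticeably slower than the rest.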

3. Calculating Performance Metrics

After all 2,000 pairs have been processed by a model, its performance is calculated by comparing its predictions against the ground truth labels created in the first step. This results in the following key metrics:

Accuracy: What percentage of the 2,000 pairs did the model classify correctly (both matches and non-matches)? This gives a general score of its overall performance.

Precision: Of all the pairs the model predicted as a "match," what percentage was correct? This measures the reliability of the model's positive predictions, answering "When it says it's a match, how often is it right?"

Recall: Of all the actual "match" pairs that existed in the test set, what percentage did the model successfully find? This measures the model's ability to identify all true matches.

F1-Score: The harmonic mean of Precision and Recall, computed as 2 × Precision × Recall / (Precision + Recall). It provides a single, balanced score that is especially useful when the class distribution is uneven or when both false positives and false negatives matter.
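The four metrics above follow directly from the counts of true/false positives and negatives. The sketch below computes them in plain Python to make the definitions concrete; the function name `verification_metrics` is an assumption, and the notebook may well use a library such as scikit-learn instead.

```python
def verification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # matches found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-matches rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed matches

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Returning 0.0 when a denominator is zero (for example, a model that never predicts a match) is a convention chosen here to keep the function total.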

4. Compiling the Final Results

This entire process is repeated for every model listed in the notebook's configuration. The calculated metrics (Accuracy, Precision, Recall, F1-Score) and the average processing time for each model are compiled into a single table (a pandas DataFrame).
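One way to compile the per-model results is sketched below. The names `summary_rows` and `to_dataframe`, the column names, and the sort by F1 are assumptions; only the use of a pandas DataFrame comes from the description above.

```python
METRIC_COLUMNS = ["accuracy", "precision", "recall", "f1",
                  "avg_seconds_per_pair"]


def summary_rows(per_model):
    """Flatten {model_name: stats} into one row per model, best F1 first.

    `per_model` is a hypothetical dict whose values hold the metric and
    timing entries listed in METRIC_COLUMNS.
    """
    rows = [
        {"model": name, **{col: stats[col] for col in METRIC_COLUMNS}}
        for name, stats in per_model.items()
    ]
    return sorted(rows, key=lambda row: row["f1"], reverse=True)


def to_dataframe(rows):
    """Turn the rows into a pandas DataFrame indexed by model name."""
    import pandas as pd  # imported lazily; only needed for the final table
    return pd.DataFrame(rows).set_index("model")
```

Keeping the row-building step separate from the DataFrame conversion means the same rows can also be written to JSON or CSV without pandas.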

This final table is the direct input for Section 6: Results Comparison and Visualization of the notebook. The charts and graphs in that section are simply visual representations of this compiled data, making it easy to compare the models' trade-offs between accuracy, reliability, and speed.

