A facial recognition system identifies or verifies a person’s identity by analyzing their face. The camera’s raw input is the person’s face, recorded in real time. The AI-enabled face recognition system captures the person’s image from the recorded video, analyzes it, and compares it with the images stored in its database. Facial recognition combined with a biometric fingerprint is useful for access control, preventing entry by unauthorized persons.
The first step is to install OpenCV. Run the following command in your terminal:
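The OpenCV Python bindings are typically installed from PyPI; `opencv-python` is the standard community package name:

```shell
# Install the OpenCV Python bindings from PyPI
pip install opencv-python
```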
We’ll create a new Jupyter notebook / Python file and start off with:
Let's explore Cascade Classifiers.
Theory: a cascade classifier, in full a cascade of boosted classifiers working with Haar-like features, is a special case of ensemble learning called boosting. It typically relies on AdaBoost classifiers (and variants such as Real AdaBoost, Gentle AdaBoost, or LogitBoost). Cascade classifiers are trained on a few hundred positive sample images containing the object we want to detect, and on negative images that do not contain it.
How can we detect whether a face is present? The Viola–Jones object detection framework includes all the steps required for live face detection:
- Haar Feature Selection, features derived from Haar wavelets
- Create integral image
- Adaboost Training
- Cascading Classifiers
The original paper was published in 2001.
There are some features common to most human faces:
- a dark eye region compared to the upper cheeks
- a bright nose-bridge region compared to the eyes
- specific relative locations of the eyes, mouth and nose
In this example, the first feature measures the difference in intensity between the eye region and a region across the upper cheeks. The feature value is computed by summing the pixels in the black area and subtracting the sum of the pixels in the white area.
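To make that computation concrete, here is a toy two-rectangle feature in NumPy; the function name, the region encoding and the 4×4 patch are illustrative, not from the paper:

```python
import numpy as np

def haar_two_rect(window, dark, bright):
    """Two-rectangle Haar feature: sum of the dark region minus the
    sum of the bright region. Regions are (row0, row1, col0, col1)."""
    r0, r1, c0, c1 = dark
    s_dark = int(window[r0:r1, c0:c1].sum())
    r0, r1, c0, c1 = bright
    s_bright = int(window[r0:r1, c0:c1].sum())
    return s_dark - s_bright

# Toy patch: dark upper half (eyes), bright lower half (upper cheeks)
patch = np.array([[10, 10, 10, 10],
                  [10, 10, 10, 10],
                  [200, 200, 200, 200],
                  [200, 200, 200, 200]], dtype=np.uint8)

print(haar_two_rect(patch, (0, 2, 0, 4), (2, 4, 0, 4)))  # 80 - 1600 = -1520
```

A strongly negative value on this patch signals exactly the dark-eyes / bright-cheeks contrast described above.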
Then, we apply this rectangle as a convolutional kernel over the whole image. To be exhaustive, we would have to apply every possible dimension and position of each kernel. A simple 24×24 image would typically yield over 160’000 features, each a sum/subtraction of pixel values. That is computationally prohibitive for live face detection. So, how do we speed up this process?
- keep only the few features that actually discriminate faces from non-faces, instead of evaluating all of them; this feature selection is performed by AdaBoost.
- compute the rectangle features using the integral image principle, which is way faster. We’ll cover this in the next section.
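As a preview of that principle, the integral image can be sketched in NumPy: we precompute cumulative sums once, after which any rectangle sum costs just four lookups, regardless of the rectangle’s size (the helper name `rect_sum` is illustrative):

```python
import numpy as np

img = np.arange(16, dtype=np.int64).reshape(4, 4)  # toy 4x4 "image"

# Integral image, padded with a zero row/column on top and left,
# so that ii[r, c] == img[:r, :c].sum()
ii = np.zeros((5, 5), dtype=np.int64)
ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] using only four corner lookups."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

print(rect_sum(ii, 1, 1, 3, 3))    # 5 + 6 + 9 + 10 = 30
print(img[1:3, 1:3].sum())         # 30, same value computed directly
```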
There are several types of rectangles that can be applied for Haar Features extraction. According to the original paper :
- the two-rectangle feature is the difference between the sum of the pixels within two rectangular regions, used mainly for detecting edges (a,b)
- the three-rectangle feature computes the sum within two outside rectangles subtracted from the sum in a center rectangle, used mainly for detecting lines (c,d)
- the four-rectangle feature computes the difference between diagonal pairs of rectangles (e)
Now that the features have been defined, we apply them to the set of training images using AdaBoost classification, which combines a set of weak classifiers into an accurate ensemble model. With 200 features (instead of the initial 160’000), an accuracy of 95% is achieved. The authors of the paper ultimately selected 6’000 features.