The most commonly used face recognition systems are easy to attack. As listed in the National Vulnerability Database of the US National Institute of Standards and Technology (NIST) and as mentioned in [1], one can spoof a face recognition system by presenting a photograph, a video, a mask, or a 3D model of a targeted person in front of the camera. Although make-up or plastic surgery can also be used for spoofing, photographs are probably the most common source of spoofing attacks because facial images are easy to download or capture. Liveness detection research therefore has a vital role in anti-spoofing for face recognition systems and for biometric systems in general. It determines whether the subject in front of the camera is a live person or a spoof, that is, someone trying to pass as another person in order to access data.
Because spoofing attacks vary widely, many liveness detection algorithms have been proposed, as mentioned in [17-18]. These methods can be divided into four groups:
1. User behavior modeling, which detects movements of the user such as mouth movement and eye blinking.
2. User cooperation, which asks the user to perform some movement.
3. Additional hardware beyond a generic camera, which is expensive and hard to change once deployed.
4. Data-driven characterization, on which our work depends.
The first method, user behavior modeling, captures the user's behavior with respect to the acquisition sensor (e.g., eye blinking or small head and face movements) to determine whether a captured biometric sample is synthetic. For example, an attack can be detected by modeling eye blinking, under the assumption that a photograph-based spoofing attack differs from valid access by the absence of movement.
The second method, user cooperation, detects spoofing by asking challenge questions or by asking the user to perform specific movements. This adds extra time and removes the naturalness inherent to biometric systems.
The third method requires additional hardware (e.g., infrared cameras or motion and depth sensors) and uses the extra information generated by these sensors to detect possible clues of an attempted attack.
The final method, data-driven characterization, exploits only the data captured by the acquisition sensor, looking for clues and artifacts that may reveal an attempted attack. By contrast, methods that require additional hardware cannot be implemented on computational devices that lack such hardware, such as smartphones and tablets. Data-driven characterization methods can be further subdivided into three approaches: frequency-based, texture-based, and motion-based.
1.1 Frequency-Based Approaches
1.1.1 Video-Based Spoofing Detection
Pinto et al. [19] proposed a method for detecting video-based spoofing attacks using visual rhythm analysis. In the authors' view, a noise signature is added to the biometric samples during the recapture process, so the approach works on the noise in the video. It uses a low-pass filter to isolate the noise signal and the visual rhythm technique to capture the temporal information of the video.
This method is considered the first proposed for detecting video-based spoofing attacks.
The authors observe that presenting a spoofed video or image to the camera turns the capture process into a recapture, which adds noise and artifacts to the biometric samples. They assume that this noise and these artifacts are sufficient to detect the spoofing attack.
The block diagram illustrating this approach is shown in Fig. 3-1.
As shown in the figure, the approach consists of five steps:
1- The noise residual video is calculated for every video in the training set.
2- The Fourier spectrum is calculated for each noise residual video.
3- The visual rhythms of each Fourier spectrum video are calculated.
4- Each visual rhythm is arranged as a texture map.
5- Finally, classification is performed using Partial Least Squares (PLS) and a Support Vector Machine (SVM).
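Steps 1 and 2 can be illustrated with a minimal sketch. It assumes a frame is a 2D list of grayscale values and uses a 3x3 mean filter as a stand-in for the low-pass filter, whose exact form is not specified here. The function names are hypothetical and the naive DFT is only practical for very small frames.

```python
import cmath
import math

def mean_filter_3x3(frame):
    """Low-pass filter (assumed): mean of each pixel's 3x3 neighbourhood, edges clamped."""
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += frame[yy][xx]
            out[y][x] = total / 9.0
    return out

def noise_residual(frame):
    """Step 1: noise residual = original frame minus its low-pass filtered version."""
    smooth = mean_filter_3x3(frame)
    return [[frame[y][x] - smooth[y][x] for x in range(len(frame[0]))]
            for y in range(len(frame))]

def log_spectrum(frame):
    """Step 2: log-magnitude of the 2D discrete Fourier transform (naive version)."""
    h, w = len(frame), len(frame[0])
    spec = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2j * math.pi * (u * y / h + v * x / w)
                    acc += frame[y][x] * cmath.exp(angle)
            spec[u][v] = math.log(1.0 + abs(acc))
    return spec
```

Applying `noise_residual` and then `log_spectrum` to every frame of a video would give the Fourier spectrum video from which the visual rhythms are built. A perfectly flat frame has a zero residual, so only texture and noise survive this step.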
The visual rhythm captures the temporal information of a video and summarizes the video's contents in a single image. Two types of visual rhythm are generated for each video:
a) Vertical visual rhythm, formed from the central vertical lines of the frames.
b) Horizontal visual rhythm, formed from the central horizontal lines of the frames.
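The two visual rhythms can be sketched as follows. The sketch assumes a video is a list of equally sized frames, each a 2D list of values, and takes a single central line per frame. The width of the sampled band and the function names are assumptions made for illustration.

```python
def vertical_visual_rhythm(video):
    """Place the central column of each frame side by side.

    Result has one row per image row and one column per frame.
    """
    height = len(video[0])
    centre_col = len(video[0][0]) // 2
    return [[frame[y][centre_col] for frame in video] for y in range(height)]

def horizontal_visual_rhythm(video):
    """Stack the central row of each frame.

    Result has one row per frame and one column per image column.
    """
    centre_row = len(video[0]) // 2
    return [list(frame[centre_row]) for frame in video]

# Two tiny 3x3 "frames" standing in for a video.
video = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[10, 11, 12], [13, 14, 15], [16, 17, 18]],
]
vertical = vertical_visual_rhythm(video)      # [[2, 11], [5, 14], [8, 17]]
horizontal = horizontal_visual_rhythm(video)  # [[4, 5, 6], [13, 14, 15]]
```

Each result is a single 2D image in which one axis is time, which is how the whole video is summarized in one picture.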
1.1.2 Liveness Detection Using Frequency Entropy
Lee et al. [20] proposed a method based on the frequency entropy of image sequences.
Fig. 3-2 shows the block diagram of this method.
As shown in the figure, the steps of this approach are as follows:
1- Detect the face region.
2- Extract the RGB channels from the detected face to obtain a time sequence for each color channel.
3- Eliminate the cross-channel noise caused by interference from the environment by analyzing the three RGB channels with Independent Component Analysis (ICA).
4- Calculate the power spectrum of each resulting signal. The entropy of these power spectra is then computed and compared against a threshold to decide whether the biometric sample is real or synthetic, that is, live or a spoofing attack.
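Step 4 can be illustrated with a minimal sketch that computes the entropy of the power spectrum of one channel's time sequence. The ICA step is omitted, and the function names, the removal of the mean, and the use of a one-sided spectrum are assumptions. The threshold value and the direction of the final comparison are not given in the text, so they are left out.

```python
import cmath
import math

def power_spectrum(signal):
    """One-sided power spectrum (DC excluded) via a naive discrete Fourier transform."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def spectral_entropy(signal):
    """Shannon entropy (in bits) of the normalized power spectrum."""
    power = power_spectrum(signal)
    total = sum(power)
    if total == 0:
        return 0.0  # constant signal: no spectral content
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A pure sinusoid concentrates its power in one frequency bin (entropy near 0),
# while an impulse spreads power evenly over all bins (maximum entropy).
tone = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
impulse = [1.0] + [0.0] * 31
tone_entropy = spectral_entropy(tone)
impulse_entropy = spectral_entropy(impulse)
```

In a full pipeline the input signal would be the per-frame mean of one color channel over the face region, after the ICA step has separated out the cross-channel noise.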
1.1.3 Liveness Detection in Face Recognition Systems
Nalinakshi et al. [21] proposed a method for detecting the liveness of the user from local facial features, such as eye blinking, lip movement, and forehead and chin movement patterns, of a face captured in real time with a generic web camera.
Fig. 3-3 shows the block diagram of this approach. The approach proceeds in the following steps:
1- Detect the face.
2- Extract texture features using the local binary pattern (LBP).
3- Securely store the extracted feature vectors in a database.
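The feature extraction in step 2 can be sketched with the basic 3x3 LBP operator. The text does not say which LBP variant is used, so the 8-neighbor operator, the neighbor ordering, and the `>=` comparison below are assumptions, and the function names are hypothetical.

```python
def lbp_code(img, y, x):
    """Basic LBP: threshold the 8 neighbours against the centre pixel, pack into one byte."""
    centre = img[y][x]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# On a perfectly flat patch every neighbour equals the centre, so the code is 255.
flat_patch = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
flat_code = lbp_code(flat_patch, 1, 1)
```

The normalized histogram returned by `lbp_histogram` is the kind of feature vector that would be stored as a template in step 3.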
During the identification step, the feature vector generated for the user is compared with the templates stored in the database using template matching. This is a one-to-many matching carried out with the Manhattan distance, and the best-matching facial image is the one with the minimum Manhattan distance. If the matching is successful, the liveness check is performed using variations in local regions of the face: the eyes, lips, forehead, and chin. If any of these local features varies, the user is concluded to be live; otherwise, the user is not live. Liveness is calculated from the mean and standard deviation of each local region. The method thus provides security in two phases: authentication and liveness checking.
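The two phases can be illustrated with a minimal sketch. The one-to-many matching follows the description directly. The liveness rule is an assumption: the text only says that the mean and standard deviation of each local region are used, so the sketch declares the user live when either statistic of any region changes by more than a hypothetical tolerance `tol` across frames.

```python
import statistics

def manhattan(a, b):
    """Manhattan (L1) distance between two feature vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))

def best_match(probe, templates):
    """One-to-many matching: return (user_id, distance) of the closest stored template."""
    return min(((uid, manhattan(probe, vec)) for uid, vec in templates.items()),
               key=lambda pair: pair[1])

def region_stats(region):
    """Mean and standard deviation of the pixel values in one local region."""
    pixels = [p for row in region for p in row]
    return statistics.mean(pixels), statistics.pstdev(pixels)

def is_alive(regions_over_time, tol=1.0):
    """Assumed rule: live if any region's mean or std varies by more than tol."""
    for patches in regions_over_time.values():
        stats = [region_stats(patch) for patch in patches]
        means = [m for m, _ in stats]
        stds = [s for _, s in stats]
        if max(means) - min(means) > tol or max(stds) - min(stds) > tol:
            return True
    return False

templates = {"alice": [1, 2, 3], "bob": [10, 10, 10]}
match = best_match([1, 2, 4], templates)  # ("alice", 1)
```

In use, `is_alive` would be called only when the distance returned by `best_match` is small enough to count as a successful match; that acceptance threshold is another parameter the text does not specify.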
1.2 Motion-Based Approaches
1.2.1 Fusion of Multiple Clues
Tronci et al. [22] proposed a method based on motion information and clues extracted from the scene by combining two types of processes, referred to as static and video-based analysis. The static analysis combines different visual features such as color, edge, and Gabor textures, and is used to find abnormalities in the input samples during verification. The hypothesis is that visual differences exist between images captured from a real scene and images captured from a photograph. These differences can be found directly from a single image, or frame by frame in a video. The video-based analysis combines simple motion-related measures such as eye blink, mouth movement, and facial expression change. Fusion is carried out at score level using a weighted sum, in which photo detection receives the higher weight and the movement measures contribute very little.
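Score-level fusion by weighted sum can be sketched as follows. The score names, the score values, and the weights are hypothetical; they only illustrate a configuration in which photo detection dominates and the movement measures contribute little.

```python
def fuse_scores(scores, weights):
    """Weighted sum of per-detector scores, normalized by the total weight used."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight

# Hypothetical liveness scores in [0, 1] from each analysis.
scores = {"photo_detection": 0.9, "eye_blink": 0.2, "mouth_movement": 0.4}
# Hypothetical weights: photo detection carries most of the decision.
weights = {"photo_detection": 0.8, "eye_blink": 0.1, "mouth_movement": 0.1}

fused = fuse_scores(scores, weights)
```

With these numbers the fused score stays close to the photo detection score, which shows how a dominant weight lets one analysis drive the final decision.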
1.3 Texture-Based Approaches
1.3.1 Face Spoofing Detection Using Micro-Texture Analysis
This approach detects face spoofing using the Local Binary Pattern (LBP). The authors make use of different LBP operators such as tLBP, dLBP and mLBP. The histograms obtained from these descriptors are then classified using χ² histogram comparison, Linear Discriminant Analysis, and a Support Vector Machine. Fig. 2.4 illustrates face spoofing detection from single images using micro-texture analysis.
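The χ² histogram comparison can be sketched as a distance between two histograms. Several conventions exist (some include a factor of 1/2); the form below is one common choice. Classifying a sample by its nearest reference histogram is an assumption made for illustration, and the function names are hypothetical.

```python
def chi_square_distance(h1, h2):
    """Chi-square distance between two histograms; bins empty in both are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def classify_by_chi_square(hist, live_reference, spoof_reference):
    """Assumed rule: label the sample by whichever reference histogram is closer."""
    d_live = chi_square_distance(hist, live_reference)
    d_spoof = chi_square_distance(hist, spoof_reference)
    return "live" if d_live <= d_spoof else "spoof"

# Toy two-bin histograms standing in for LBP histograms.
label = classify_by_chi_square([0.9, 0.1], [1.0, 0.0], [0.0, 1.0])
```

In practice the reference histograms would be built from training samples of real and spoofed faces, for example by averaging their LBP histograms.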
1.3.2 Face Spoofing Detection from Single Images Using Texture and Local Shape Analysis
The extended approach from the University of Oulu builds on the LBP-based micro-texture analysis [2] by introducing two complementary low-level features to the face description: Gabor wavelets and HOG. The method applies two powerful texture features, LBPs and Gabor wavelets, the latter capturing more macroscopic information, and HOG to describe local shape. The output of each descriptor is fed to a linear SVM [6] classifier after applying a homogeneous kernel map. Score-level fusion of the three individual SVM outputs then yields the final decision on whether the input is live. Applied to the Print-Attack dataset [12], the approach achieves an AUC of 0.999 and an EER of 1.1%.
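For readers unfamiliar with the EER metric, the following minimal sketch shows one way to estimate it from classifier scores. It assumes higher scores mean "more likely live" and scans only the observed scores as candidate thresholds. The scores below are made up for illustration and are unrelated to the reported results.

```python
def error_rates(live_scores, spoof_scores, threshold):
    """Return (FAR, FRR) when samples scoring >= threshold are accepted as live."""
    far = sum(s >= threshold for s in spoof_scores) / len(spoof_scores)
    frr = sum(s < threshold for s in live_scores) / len(live_scores)
    return far, frr

def equal_error_rate(live_scores, spoof_scores):
    """Approximate EER: error at the threshold where FAR and FRR are closest."""
    candidates = sorted(set(live_scores) | set(spoof_scores))

    def gap(threshold):
        far, frr = error_rates(live_scores, spoof_scores, threshold)
        return abs(far - frr)

    best = min(candidates, key=gap)
    far, frr = error_rates(live_scores, spoof_scores, best)
    return (far + frr) / 2

# Made-up fused scores: one spoof (0.6) scores higher than one live sample (0.4).
live = [0.9, 0.8, 0.7, 0.4]
spoof = [0.1, 0.2, 0.3, 0.6]
eer = equal_error_rate(live, spoof)
```

The false acceptance rate (FAR) counts spoofs accepted as live and the false rejection rate (FRR) counts live samples rejected; the EER is the operating point where the two are equal, so a lower EER indicates better separation.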
This chapter highlighted the work conducted in this area through an extensive literature survey of existing liveness detection methods. The following chapter presents experiments carried out on three datasets, comparing the OULU approach with our enhanced approach.