Single-Frame Facial Liveness Detection

How It Works to Reduce User Friction and Abandonment

This article summarizes key points from a white paper.

The ability to ascertain facial liveness from a single image frame enables high-performance presentation attack detection without adding friction to the user experience. A “passive” approach to liveness that is frictionless for the user avoids the customer abandonment caused by the complications of an “active” approach. The result is fewer lost customers and more revenue.

What is passive facial liveness?

In remote identity verification and authentication processes that use biometrics, “liveness detection” is critical in preventing presentation attacks, or “spoofs”. Approaches to liveness detection can be categorized as either active or passive. An active approach relies on interaction with the user, such as by instructing them to blink, smile, turn their head, or move their device while on camera and then detecting their reaction. In contrast, a passive liveness technique requires no instruction, commands, or response from the user.

How single-frame liveness works

The first thing to understand about single-frame liveness is that a digital image contains features that are not easily visible to the naked eye but can be detected and measured using computer vision. Artificial intelligence encompasses several machine learning techniques, many of which involve training a deep neural network to act as a classifier. The training process uses ground-truth pairs of inputs and outputs to teach the network what the correct output is for a given input.

The more ground-truth data that can be used for training, the better. But a second thing that helps in understanding single-frame liveness is that the accuracy of the neural network relies not just on the quantity of training data, but on its quality. In other words, garbage in, garbage out; training with low-quality data can actually degrade the network's performance. What determines the quality of ground-truth data? It should be correctly tagged. It should be representative of cases found in the real world. It should cover corner cases, and it should avoid introducing biases into the network's results.

Figure: Detecting "Printed Copy" Presentation Attacks

Training neural networks is art as much as science; there are always techniques data scientists can discover and apply to further optimize performance. For liveness detection, this might mean training the network to answer not just “live or spoof?”, but also narrower questions such as “printed on paper?”, “displayed on screen?”, or “edges detected?”. To make these assessments, subtle image features such as the moiré effect and color spectrum can be used as input to the networks, whose multiple results are then weighed to reach a final conclusion. ID R&D has invested many person-years of research to determine which image features should be examined and how they should be combined, and is continuously making improvements. IDLive Face can be tuned to optimize performance for different environments using calibration, which does not require modifications to the neural network design.
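The idea of weighing multiple results can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the detector names echo the questions above, but the scores, weights, weighted-average rule, and 0.5 threshold are all hypothetical and are not taken from IDLive Face or the white paper.

```python
from dataclasses import dataclass


@dataclass
class DetectorScore:
    name: str                 # hypothetical sub-classifier name
    spoof_probability: float  # 0.0 = looks live, 1.0 = looks like a spoof
    weight: float             # hypothetical relative importance


def fuse_scores(scores):
    """Combine per-detector spoof probabilities with a weighted average."""
    total_weight = sum(s.weight for s in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s.spoof_probability * s.weight for s in scores) / total_weight


def is_live(scores, threshold=0.5):
    """Accept the image as live when the fused spoof score is below the threshold."""
    return fuse_scores(scores) < threshold


# Made-up scores for a genuine selfie and for a photographed printout.
genuine = [
    DetectorScore("printed_on_paper", 0.05, 2.0),
    DetectorScore("displayed_on_screen", 0.15, 2.0),
    DetectorScore("edges_detected", 0.10, 1.0),
]
printout = [
    DetectorScore("printed_on_paper", 0.90, 2.0),
    DetectorScore("displayed_on_screen", 0.20, 2.0),
    DetectorScore("edges_detected", 0.70, 1.0),
]

print(is_live(genuine))   # True
print(is_live(printout))  # False
```

A weighted average is just one possible fusion rule. A stricter design could reject the image whenever any single detector exceeds its own threshold.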

How eliminating friction adds new customers and value

In liveness detection, the Attack Presentation Classification Error Rate (APCER) is the rate at which presentation attacks are wrongly accepted as bona fide. But given the inherent tradeoff between false-negative and false-positive errors, a low APCER can come at the cost of a high Bona Fide Presentation Classification Error Rate (BPCER), the rate at which bona fide customers are wrongly classified as attacks.
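As a rough illustration of how these two rates pull against each other, the sketch below computes both from a handful of labeled scores at two decision thresholds. The scores are made up, and the calculation is simplified: it pools all attacks together rather than reporting a rate per attack type.

```python
def error_rates(samples, threshold):
    """Return (APCER, BPCER) for a list of (spoof_score, is_attack) pairs.

    A sample is classified as an attack when spoof_score >= threshold.
    """
    attacks = [score for score, is_attack in samples if is_attack]
    bona_fide = [score for score, is_attack in samples if not is_attack]
    if not attacks or not bona_fide:
        raise ValueError("need at least one attack and one bona fide sample")
    missed_attacks = sum(1 for score in attacks if score < threshold)
    rejected_users = sum(1 for score in bona_fide if score >= threshold)
    return missed_attacks / len(attacks), rejected_users / len(bona_fide)


# Hypothetical scores: higher means "more likely a spoof".
samples = [
    (0.9, True), (0.8, True), (0.6, True), (0.3, True),
    (0.1, False), (0.2, False), (0.4, False), (0.7, False),
]

print(error_rates(samples, threshold=0.5))   # (0.25, 0.25)
print(error_rates(samples, threshold=0.25))  # (0.0, 0.5)
```

Lowering the threshold catches every attack in this toy set, but doubles the share of genuine users who are rejected.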

As discussed, an “active” liveness detection approach relies upon interactions with the user to help assess liveness, while a “passive” approach is transparent to the user and typically uses only the same images used for biometric comparison. BPCER can be made worse by the friction introduced by an active liveness technique. Frustration, distraction, and errors in interpreting or carrying out instructions can all increase the frequency of interruptions and failures. These effects can be particularly pronounced in a digital onboarding process, where users are new and performing tasks for the first time. Furthermore, user friction introduces variables of human behavior that are difficult to anticipate and measure, so the BPCER observed in a real-world deployment of a high-friction solution can be significantly higher than planned for.

A case study illustrates the ROI of upgrading from active liveness to passive, single-frame, frictionless liveness. The completion rate for new customer applications rose from 60% to over 95%, a gain of more than 35 percentage points. This means that over a third of all applicants went from being interrupted in their applications to completing them without interruption. The change was implemented without degrading spoof detection performance, that is, without an increase in the APCER.
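The arithmetic behind the case study can be worked through directly. The 60% and 95% figures come from the case study; the cohort of 1,000 applicants is a hypothetical number chosen only to make the result concrete. Because the improved rate is reported as "over 95%", the outputs are lower bounds.

```python
def additional_completions(applicants: int, before_pct: int, after_pct: int) -> int:
    """Extra completed applications when the completion rate rises."""
    return applicants * (after_pct - before_pct) // 100


applicants = 1000  # hypothetical cohort size, not from the case study
extra = additional_completions(applicants, before_pct=60, after_pct=95)
relative_lift = (95 - 60) / 60  # growth relative to the original completions

print(extra)                   # 350
print(f"{relative_lift:.0%}")  # 58%
```

In other words, at least 350 more applicants per 1,000 finish the process, which is roughly 58% more completed applications than before.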

Single-frame is a preferred approach to liveness detection to improve security while avoiding user friction, abandonments, uncertainty, and lost revenue

It’s generally understood that biometrics come with an inherent tradeoff between false negatives and false positives, which stakeholders need to factor in when designing a system. The friction introduced by an active approach will tend to result in a higher BPCER for a given target APCER. Furthermore, the unpredictability of human behavior makes it difficult to extrapolate from performance in a controlled setting to real-world operations. With each new banking customer adding thousands of dollars of value to a bank, the difference made by friction can have a big financial impact.
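One way to picture "BPCER for a given target APCER" is to fix the APCER target first, derive the decision threshold from attack scores, and then read off the BPCER that results. The sketch below does this on made-up scores. The function names and data are hypothetical, and a sample is treated as an attack when its score is at or above the threshold.

```python
import math


def threshold_for_target_apcer(attack_scores, target_apcer):
    """Highest threshold at which the share of missed attacks stays within target."""
    ordered = sorted(attack_scores)
    allowed_misses = int(target_apcer * len(ordered))  # rounds down, so conservative
    if allowed_misses >= len(ordered):
        return math.inf  # every attack may be missed; nothing is rejected
    return ordered[allowed_misses]


def bpcer_at(bona_fide_scores, threshold):
    """Share of genuine users whose score is at or above the threshold."""
    rejected = sum(1 for score in bona_fide_scores if score >= threshold)
    return rejected / len(bona_fide_scores)


# Hypothetical spoof scores: higher means "more likely a spoof".
attack_scores = [0.95, 0.9, 0.85, 0.8, 0.7, 0.65, 0.6, 0.55, 0.5, 0.3]
bona_fide_scores = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.55]

for target in (0.10, 0.0):
    threshold = threshold_for_target_apcer(attack_scores, target)
    print(target, threshold, bpcer_at(bona_fide_scores, threshold))
# 0.1 0.5 0.1
# 0.0 0.3 0.5
```

On this toy data, tightening the APCER target from 10% to 0% raises the BPCER from 10% to 50%. Anything that pushes genuine users' scores upward would raise the BPCER further at the same target.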
