What is Voice Biometrics

How to leverage voice authentication and verification

We are often asked how voice biometrics works. Voice biometrics is the science of using a person’s voice as a uniquely identifying biological characteristic in order to authenticate them. Also referred to as voice verification or speaker recognition, voice biometrics enables fast, frictionless and highly secure access for a range of use cases, from call center, mobile and online applications to chatbots, IoT devices and physical access.

Massive advances in neural networks over the past 2-3 years have enabled the development of voice biometric algorithms that are faster, more accurate, and can authenticate users with a smaller amount of speech. In fact, ID R&D is now able to exceed the accuracy of a 4-digit PIN in many use cases.

Like other biometric modalities, voice biometrics offers significant security advantages over authentication methods that are based on something you know (like a password or the answer to a “secret” question) or something you have (like your mobile phone). Voice biometrics also improves the customer experience by removing the frustration associated with cumbersome login processes and lost or stolen credentials.

Voice Biometric Advantages

Enhance the customer experience with fast, frictionless authentication.

Improve security and minimize breaches due to compromised passwords, phishing, etc.

Reduce threats by identifying known fraudsters.

Instantly identify users and personalize the interaction.

Free agents from time spent verifying users and resetting passwords.

Enable natural login for digital channels, including chatbots and virtual assistants.

Use as part of a two-factor authentication process to increase security without adding effort.

How does Voice Authentication work?

More than 70 body parts, each with a unique size and shape, contribute to how a person speaks. Voice biometrics relies on the fact that human voice characteristics correlate strongly with the physiological qualities of how a person creates speech. Unlike other methods of authentication, voice biometrics does not rely on a secret, such as the person remembering a passphrase. It isn’t what the person says that is being authenticated; it’s who is speaking.

Voice biometric systems work by extracting the characteristics that distinguish one person’s speech from everyone else’s. The result is a “voiceprint,” analogous to a fingerprint. A voiceprint is also called a “voice template.”

Voice recognition systems enroll a known person by creating an initial template, often merging several templates from samples of that person’s speech for higher accuracy. The initial template is called the enrollment template or enrollment voiceprint.
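As a rough illustration of merging several templates into one enrollment template, the sketch below treats each template as a fixed-length list of numbers (an embedding) and averages the length-normalized vectors. The function names, the three-dimensional example vectors and the choice of simple averaging are assumptions made for illustration; they are not a description of how IDVoice or any particular product builds its voiceprints.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so loud and quiet samples weigh the same."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def merge_templates(templates):
    """Hypothetical merge step: average unit-length templates, then renormalize."""
    if not templates:
        raise ValueError("need at least one template")
    dim = len(templates[0])
    if any(len(t) != dim for t in templates):
        raise ValueError("all templates must have the same dimension")
    unit = [l2_normalize(t) for t in templates]
    mean = [sum(t[i] for t in unit) / len(unit) for i in range(dim)]
    return l2_normalize(mean)

# Toy templates standing in for three speech samples from the same person.
samples = [[0.9, 0.1, 0.4], [1.0, 0.0, 0.5], [0.8, 0.2, 0.3]]
enrollment_template = merge_templates(samples)
```

Averaging is only one possible merge strategy. It tends to smooth out sample-to-sample variation, which is one reason multiple enrollment samples can improve accuracy.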

How does Voice Biometric enrollment work?

To verify an enrolled person’s identity, the biometric voice recognition system captures a new speech sample, creates a template from the sample, and compares it against the enrollment template. A strong match between templates indicates that the same person spoke both samples, thus verifying the person’s identity. This manner of using voice recognition is called Speaker Verification. It is a one-to-one match between the enrollment template and someone claiming to be the enrolled person.
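The one-to-one comparison can be sketched in a few lines. This example assumes templates are plain numeric vectors, uses cosine similarity as the match score, and accepts the speaker when the score reaches a threshold. The function names, the example vectors and the 0.7 threshold are all hypothetical; real systems use their own scoring and calibrate thresholds on real data.

```python
import math

def cosine_similarity(a, b):
    """Similarity in [-1, 1]; higher means the two templates are closer."""
    if len(a) != len(b):
        raise ValueError("templates must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("templates must be non-zero")
    return dot / (norm_a * norm_b)

def verify_speaker(enrollment_template, probe_template, threshold=0.7):
    """One-to-one check: does the new sample match the claimed identity?"""
    score = cosine_similarity(enrollment_template, probe_template)
    return score >= threshold, score

# Toy vectors: the enrolled voiceprint, a new sample from the same speaker,
# and a sample from a different speaker.
enrolled = [0.9, 0.1, 0.4]
same_speaker = [0.85, 0.15, 0.45]
other_speaker = [0.1, 0.9, 0.2]

accepted, score = verify_speaker(enrolled, same_speaker)
rejected, low_score = verify_speaker(enrolled, other_speaker)
```

Raising the threshold makes it harder for an impostor to be accepted but also rejects more genuine speakers, so the value is a business decision as much as a technical one.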

How does Voice Biometric matching work?

Another way to use voice recognition is to compare a voice sample from an unknown identity against multiple enrollment templates. The goal is to find the person within the set of enrollment templates. This manner of using voice biometrics is called Speaker Identification. There are significant limits to accuracy for Speaker Identification, so businesses should consult with an expert to understand if a one-to-many use case with voice will be practical.
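A one-to-many search can be sketched as scoring the unknown sample against every enrollment template and keeping the best match, provided it clears a threshold. As before, the vectors, speaker names, cosine scoring and the 0.7 threshold are illustrative assumptions rather than any vendor's method.

```python
import math

def cosine_similarity(a, b):
    """Similarity in [-1, 1]; higher means the two templates are closer."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("templates must be non-zero")
    return dot / (norm_a * norm_b)

def identify_speaker(probe_template, enrolled_templates, threshold=0.7):
    """One-to-many search: return the best-matching enrolled ID, or None."""
    best_id, best_score = None, -1.0
    for speaker_id, template in enrolled_templates.items():
        score = cosine_similarity(probe_template, template)
        if score > best_score:
            best_id, best_score = speaker_id, score
    if best_score < threshold:
        return None, best_score  # nobody in the set matches well enough
    return best_id, best_score

# Hypothetical enrollment database with three speakers.
enrolled = {
    "alice": [0.9, 0.1, 0.4],
    "bob": [0.1, 0.9, 0.2],
    "carol": [0.3, 0.3, 0.9],
}

match_id, match_score = identify_speaker([0.2, 0.8, 0.25], enrolled)
unknown_id, unknown_score = identify_speaker([1.0, -1.0, 0.0], enrolled)
```

The sketch also hints at why accuracy is harder in this mode. If each one-to-one comparison has a false accept rate p, and the comparisons are treated as independent (a simplification), the chance that an unenrolled speaker falsely matches at least one of N templates is 1 - (1 - p)^N, which grows quickly as N increases.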

The use of voice biometrics for authentication is increasing in popularity due to improvements in accuracy, fueled largely by advances in AI, and heightened customer expectations for easy and fast access to information. Frequent password-associated data breaches are another reason for broader adoption as companies look for ways to better protect customer data.

When it comes to accuracy, it’s not just about keeping the wrong person out. Companies also have to minimize “false rejects” that cause headaches for existing customers and agents. The “Equal Error Rate” (EER) is the operating point at which the false accept rate and the false reject rate are equal. Of course, the goal is to make both of these error types extremely small, ideally letting no impostors through while rejecting only a negligible number of valid users.
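To make the EER concrete, the sketch below estimates it from two lists of match scores: “genuine” scores from the correct speaker and “impostor” scores from other speakers. It sweeps every observed score as a candidate threshold and reports the point where the two error rates are closest. The scores are made-up toy values and the function names are hypothetical; real evaluations use far larger test sets.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """False accept rate and false reject rate at a given threshold."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr

def estimate_eer(genuine_scores, impostor_scores):
    """Return (eer, threshold) where FAR and FRR are closest to equal."""
    if not genuine_scores or not impostor_scores:
        raise ValueError("need both genuine and impostor scores")
    best = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        far, frr = error_rates(genuine_scores, impostor_scores, threshold)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]

# Toy scores: higher means a stronger match.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.12, 0.35, 0.48, 0.70, 0.22]

eer, eer_threshold = estimate_eer(genuine, impostor)
```

With these toy scores, a threshold of 0.70 lets one of five impostors in and rejects one of five genuine speakers, so both error rates are 20%. A lower EER means the genuine and impostor score distributions overlap less.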

Types of Voice Authentication

Voice authentication can be accomplished using text-dependent speaker recognition or text-independent voice recognition biometrics.

Text-dependent voice verification is where a person speaks a specific passphrase, usually only a few words long, like “My voice is my passphrase.”

Text-independent voice verification is a passive voice biometric approach in which the user can say anything, enabling authentication to happen quickly in the background during their normal interaction with an agent, IVR, or application.

About IDVoice

IDVoice by ID R&D is a robust AI-driven biometric voice recognition engine that provides both text-dependent and text-independent voice verification for mobile, web and telephone channels, as well as physical access and IoT device integration. The product is built on an innovative Convolutional Neural Network and an advanced, modified x-vector approach to feature extraction for unmatched accuracy, and is ranked #1 in the industry’s leading benchmark challenge.

IDVoice is language independent, works with ultra-short utterances and has the smallest footprint available. Download the IDVoice product collateral or visit our IDVoice Text Dependent Verification and IDVoice Text Independent Verification pages to learn more.

Threats to Voice Verification systems

While voice biometrics offers a secure way to authenticate users, it is not immune to threats. Advances in machine learning, recording technology and synthetic speech are enabling high-quality voice spoofing, or voice “deepfakes,” capable of tricking both humans and voice biometric systems into thinking they are hearing a real person. These attacks can be used to gain unauthorized access to accounts.

Combatting voice spoofing requires liveness detection technology capable of distinguishing between a live voice and a recorded, synthetic or computer-generated version of the voice.

Want to learn more?

Unlike other solutions, ID R&D’s core voice authentication technology works in any language without retraining, works across channels with a calibration setting, and is designed from the beginning to be noise-tolerant. Ready to learn more about our voice authentication solutions?

Core Voice Biometrics

Packaged Voice Biometric Solutions
