Text Dependent and Text Independent Voice Verification

Voice biometrics is the science of using a person’s voice as a uniquely identifying characteristic. One of the two ways we put voice biometrics into practice is called Text Independent Voice Verification. The other is Text Dependent Voice Verification. IDVoice offers both options depending on your use cases. Sometimes, they are even used together.

Text independent speaker identification technology does not depend on the person speaking a particular passphrase. This mode of voice biometrics uses conversational speech. A typical text independent speaker identification use case is authenticating a caller while they are speaking with a call center agent. Another emerging use case is authenticating a user in a voice chat.

About IDVoice™

ID R&D’s core voice verification capability, IDVoice™, draws on industry-leading research and deep expertise in speech biometrics and artificial intelligence to deliver a robust voice biometrics engine with unmatched functionality and accuracy. Features include advanced voice activity detection for improved speech processing and speaker diarization for isolating a specific speaker’s audio stream.

The science behind IDVoice

IDVoice combines proprietary x-vector technology, deep neural networks (DNN), convolutional neural networks (CNN), and ID R&D’s patented p-vector technology to achieve exceptional performance. The text independent speaker identification product supports ultra short utterances, requiring just 1.5 to 2 seconds of speech. Low error rates are consistently achieved even with far-field microphones and in noisy environments.
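To make the idea of vector-based matching concrete, here is a minimal sketch of how two fixed-length speaker embeddings (such as x-vectors) are commonly compared. It is an illustration only: IDVoice's actual scoring is proprietary and not described on this page, cosine similarity is a generic stand-in, and the names `cosine_similarity`, `verify`, and the `threshold` value of 0.7 are assumptions made for this example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def verify(enrolled: Sequence[float], probe: Sequence[float],
           threshold: float = 0.7) -> bool:
    """Accept the probe when it is similar enough to the enrolled template.

    The threshold is a hypothetical value; in practice it is tuned on
    held-out data to balance false accepts against false rejects.
    """
    return cosine_similarity(enrolled, probe) >= threshold
```

In a real deployment the embeddings would come from a neural network applied to the audio, and the threshold would be chosen per channel rather than fixed in code.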

The Equal Error Rate (EER) for IDVoice™ 2.0 in text independent (free-form speech) mode is 0.30% on an industry-standard microphone database and 0.68% on a telephone channel.
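For readers unfamiliar with the metric, the Equal Error Rate is the operating point at which the false acceptance rate (impostors accepted) equals the false rejection rate (genuine speakers rejected). The sketch below shows one simple way to estimate it from two lists of trial scores. It is not how the figures above were produced; the function name `equal_error_rate` and the sample scores are assumptions for illustration.

```python
from typing import Sequence, Tuple


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> Tuple[float, float]:
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    genuine:  scores from same-speaker trials (higher means more similar)
    impostor: scores from different-speaker trials
    """
    if not genuine or not impostor:
        raise ValueError("need at least one genuine and one impostor score")
    best_gap = float("inf")
    best = (1.0, 0.0)
    # Every observed score is a candidate decision threshold.
    for threshold in sorted(set(genuine) | set(impostor)):
        # A trial is accepted when its score is >= threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            best = ((far + frr) / 2, threshold)
    return best


# Toy data: one genuine score (0.4) falls below one impostor score (0.6).
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

An EER of 0.30% therefore means that, at the threshold where both error types are equal, roughly 3 in 1,000 trials of each kind are misclassified.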

IDVoice™ Features

  • Use with any language – no retraining necessary
  • Text-dependent and text-independent modes
  • Fast voice activity detection and speech endpoint detection for higher accuracy
  • Industry-leading matching algorithms for enrollment, verification and identification
  • Accurate signal quality estimation including signal-to-noise ratio and net speech length estimation to ensure valid speech for enrollment and verification
  • Configurable software footprint – from 2 MB on embedded systems to more accurate, faster mobile and server-based packages
  • Cross-channel compatibility through a calibration setting, with no retraining or re-enrollment needed across applications and devices
  • FIDO-compatible SDK for mobile devices
  • Robust documentation and code examples
  • Python and Java desktop interfaces for server deployments
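The signal quality feature listed above (signal-to-noise ratio and net speech length) can be illustrated with a deliberately simple sketch. It uses a fixed power threshold as a toy voice activity detector, which is far cruder than a production engine and is not IDVoice's method. The names `frame_powers` and `quality_check`, the 20 ms frame size, and the threshold value are all assumptions for this example.

```python
import math
from typing import List, Sequence, Tuple


def frame_powers(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean power of each full, non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def quality_check(samples: Sequence[float], sample_rate: int,
                  frame_ms: int = 20,
                  power_threshold: float = 1e-3) -> Tuple[float, float]:
    """Return (net_speech_seconds, snr_db).

    Frames at or above power_threshold count as speech, the rest as noise.
    snr_db is NaN when either class is empty or the noise power is zero.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    powers = frame_powers(samples, frame_len)
    speech = [p for p in powers if p >= power_threshold]
    noise = [p for p in powers if p < power_threshold]
    net_speech = len(speech) * frame_ms / 1000
    if not speech or not noise or sum(noise) == 0:
        return net_speech, float("nan")
    speech_power = sum(speech) / len(speech)
    noise_power = sum(noise) / len(noise)
    return net_speech, 10 * math.log10(speech_power / noise_power)
```

An application could use values like these to reject an enrollment or verification attempt that contains too little speech or too much background noise before it ever reaches the matcher.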

Voice Biometrics Benefits

  • Minimize the use of weak passwords and strengthen security
  • Reduce user frustration with password resets, obscure security questions and security codes
  • Improve contact center efficiency and reduce handling times; reduce fraud
  • Use with passive voice liveness to combat spoofing attempts
  • Add facial or behavioral biometrics to further improve security and minimize false acceptance and rejection rates
  • Add authentication to conversational interfaces like chatbots and home assistants

Getting started with IDVoice

IDVoice is delivered as an SDK for easy integration with your existing applications or product offerings. The text independent speaker identification software can be combined with our voice liveness technology, behavioral biometrics and facial liveness to address a wide range of use cases. Deploy IDVoice on mobile clients, servers, private clouds, and embedded systems. IDVoice supports Linux, Windows, iOS, and Android platforms, and is also available as a Docker image.

Want to learn more?

Unlike other solutions, ID R&D’s core voice verification technology works in any language without retraining, works across channels with a calibration setting, and is designed from the beginning to be noise-tolerant. Ready to learn about voice call authentication and more?