What Is Document Spoofing?

The integrity of digital onboarding processes is threatened when bad actors can easily use spoofed identity documents.

The advancement of mobile computing has had a remarkable impact on many aspects of our lives, with convenience being a common theme. One area that has seen great advances is the vetting and onboarding of new customers; our devices now allow us to apply for accounts remotely by leveraging the high-quality cameras and powerful CPUs that are ubiquitous in our mobile phones and laptops. It’s yet another case of technology, and AI in particular, helping to automate business functions that are otherwise prone to human error.

Digital banking means fewer branch visits, if any

Not long ago, prospective banking customers needed to visit a local branch to apply for an account. For some this is a mild inconvenience, but it can be a more onerous limitation when a trip to the nearest bank branch is an all-day affair, which is not uncommon in many parts of the world. Plus, an in-person application requirement limits the addressable market to those consumers with access to a local, physical retail presence. Making financial services more accessible is a win-win for everyone: a broader variety of financial services and the innovations that come from competition, bank accounts for people who need them regardless of where they live, and more customers for banks.

But there will always be bad actors looking to exploit perceived security gaps. Intentionally misrepresenting identity is one way that fraudsters attempt to steal from banks and their customers with impunity, that is, without the accountability that identity proofing brings to bear. The application and onboarding process is a particularly vulnerable moment in the customer lifecycle because there is no history of the applicant to inform their risk profile. Their identity needs to be presented and verified before any information about their financial or criminal history can be used to assess what levels of service are appropriate.

In the case of a visit to a retail branch, an applicant presents a government-issued identity document, which a bank employee then uses to visually confirm that 1) the person standing before them is in fact the person to whom the ID was issued, and that 2) the ID is genuine and has not been altered in any way. The process relies on three things: a live person with their ID in their possession, the biographic information on their ID, and the facial portrait on their ID.

Digital onboarding requires digital presentation of documents

Remote onboarding lets the applicant skip the visit to the branch, but requires them to participate in this identity verification process using their mobile device or computer. It relies on the same three things: 1) a live person in possession of their genuine ID, 2) the biographic information on their ID, and 3) the portrait of their face on the ID. But in a remote process, these items must be presented, captured, and then submitted as digital photographs taken using their devices. 

This is where “document spoof detection” comes into play. In a bank branch, a human bank teller can easily recognize the “liveness” of a person and their documents, that is, whether a person or an object is actually present. They might also be able to recognize that a person matches the portrait on their ID, though success rates here can vary widely depending on a number of factors. A more highly skilled person can also recognize a fake ID.
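The checks described so far can be sketched as a simple checklist that a remote process has to reproduce. This is a minimal illustration, not a description of any particular product: the class name, field names, and helper functions are all hypothetical, and in practice each flag would come from its own detection or matching component.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RemoteVerificationResult:
    """Hypothetical summary of the checks a teller performs in person."""
    person_is_live: bool         # a real person was in front of the camera
    document_is_live: bool       # a physical ID was presented, not a screen or printout
    document_is_genuine: bool    # the ID shows no sign of forgery or alteration
    face_matches_portrait: bool  # the captured face matches the portrait on the ID


def failed_checks(result: RemoteVerificationResult) -> List[str]:
    """Return the names of the checks that did not pass, in field order."""
    return [name for name, passed in vars(result).items() if not passed]


def is_verified(result: RemoteVerificationResult) -> bool:
    """An applicant is verified only when every check passes."""
    return not failed_checks(result)
```

For example, an application submitted with a photo of an ID shown on a screen would fail only the `document_is_live` check, which is the case document spoof detection is meant to catch.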

Just as with biometric facial images, identity documents can be spoofed

Consider the process without spoof detection. A bad actor could simply present images of faces and IDs, displayed on a screen or printed on paper, instead of a live person and a live document. These are called “presentation attacks”. It would be relatively easy to conduct large-scale fraud, submitting hundreds or even thousands of account applications using photos of a person and their purported ID, instead of a live person with a genuine ID in their possession.

Replicating millions of years of human brain evolution is a complicated task for software and digital cameras, particularly given the powerful tools that a fraudster might have at hand to try to spoof the system. But AI has proven a powerful approach to classifying when people and their documents are live and present and when they’re not. Ideally, the algorithms and software do this in a way that is fully transparent to the user and, importantly, reveals nothing about which countermeasures are in place, how they work, or how they might be defeated.

In the case of documents, there are a number of ways that a fraudster might try to present a spoof, and the sophistication of their approach can vary widely. While it is virtually impossible to prevent 100% of the most advanced spoof attempts, the more practical goal is to make it prohibitively risky, tedious, and expensive for career criminals. The same is true of supervised in-person processes, which, while not foolproof, make it difficult and risky for bad actors to stage scalable and repeatable attacks.

Defining and classifying some document spoof attacks

Document-based presentation attacks come in a variety of sophistication levels based on the quality of imaging and printing equipment and fabrication time. The following diagram illustrates a selection of attack methods, and it’s followed by some definitions to help sort out and classify the different approaches.

Figure: Categorization of document spoof attacks

Digital screen replay – A document presented on a digital screen, e.g. a mobile phone, an LCD computer monitor, or another high-resolution display

Paper-printed copy – A reproduction of a document printed on paper, captured with any of a variety of imaging technologies, e.g. a copier, scanner, or digital camera

Portrait substitution – A spoof method where the facial portrait is altered and the data region is not; i.e., the data region belongs to an original, genuine document

Overlay – A portrait alteration that uses a photo pasted over the genuine portrait on the genuine data region

Full-page copy – A spoof of a document that includes a reproduction of the entire page of the document or card, including the portrait and the biographic data section 

Low scan/print quality – A spoof of a document scanned and printed using a low-fidelity device such as a photocopier

High scan/print quality – A spoof of a document generated using a high-resolution image scanner or digital camera and printed with a high-quality color printer

Laminated – A paper-printed document reproduction that uses a laminated portrait and/or data region over the genuine, original document, with a surface that more closely resembles a genuine document and is potentially harder to detect

Cutout – A copied or printed image spoof of a full page of a document that is cut and pasted into a genuine document
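One way to make these definitions easier to work with, for example when labelling test samples, is to encode them as a small data structure. The sketch below is only an illustration: the figure is not reproduced here, so grouping the terms into four axes (medium, scope, quality, fabrication) is an assumption, and all class and field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Medium(Enum):
    DIGITAL_SCREEN_REPLAY = "digital screen replay"
    PAPER_PRINTED_COPY = "paper-printed copy"


class Scope(Enum):
    PORTRAIT_SUBSTITUTION = "portrait substitution"
    FULL_PAGE_COPY = "full-page copy"


class Quality(Enum):
    LOW = "low scan/print quality"
    HIGH = "high scan/print quality"


class Fabrication(Enum):
    OVERLAY = "overlay"
    LAMINATED = "laminated"
    CUTOUT = "cutout"


@dataclass(frozen=True)
class DocumentSpoof:
    """Describes one attack using the terms defined above (assumed grouping)."""
    medium: Medium
    scope: Optional[Scope] = None
    quality: Optional[Quality] = None
    fabrication: Optional[Fabrication] = None

    def labels(self) -> List[str]:
        """Return the terms that apply to this attack, skipping unset axes."""
        parts = (self.medium, self.scope, self.quality, self.fabrication)
        return [part.value for part in parts if part is not None]


# Example: a high-quality printed portrait pasted over a genuine document.
attack = DocumentSpoof(
    medium=Medium.PAPER_PRINTED_COPY,
    scope=Scope.PORTRAIT_SUBSTITUTION,
    quality=Quality.HIGH,
    fabrication=Fabrication.OVERLAY,
)
print(", ".join(attack.labels()))
```

Leaving every axis except the medium optional reflects that some terms, such as print quality, do not apply to a screen replay.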