Deep learning stack used in production to prevent exfiltration of image-borne identification documents

US11574151B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11574151-B2
Application number	US-202117229768-A
Country	US
Kind code	B2
Filing date	Apr 13, 2021
Priority date	Jun 3, 2020
Publication date	Feb 7, 2023
Grant date	Feb 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed is detecting identification documents in image-borne identification documents and protecting against loss of the image-borne identification documents. A trained deep learning (DL) stack is used to classify production images by inference as containing a sensitive image-borne identification document, with the trained stack configured with parameters determined using labelled ground truth data for the identification documents and examples of other image documents. The trained DL stack is configured to include a first set of layers closer to an input layer and a second set of layers further from the input layer, with the first set pre-trained to perform image recognition before exposing the second set of layers of the stack to the labelled ground truth data for the image-borne identification documents and examples of other image documents, and using the inferred classification of the sensitive image-borne identification document in a DLP system to protect against loss by image exfiltration.
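The split described in the abstract (a pre-trained first set of layers, and a second set trained on labelled ground truth) can be illustrated with a toy sketch. This is not the patented implementation: `frozen_features` is a hypothetical stand-in for the pre-trained layers, the logistic "head" stands in for the second set of layers, and the striped and flat synthetic images are invented purely so the example runs without any image data. Only the head's parameters are updated, and only extracted features and labels are kept for training.

```python
import math
import random


def frozen_features(image):
    """Hypothetical stand-in for the pre-trained first set of layers.

    Maps a 2D grayscale image (rows of floats in [0, 1]) to two fixed
    features. Nothing in this function is updated during training.
    """
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    # Mean absolute difference between horizontal neighbours.
    diffs = [abs(row[i + 1] - row[i]) for row in image for i in range(len(row) - 1)]
    return [mean, sum(diffs) / len(diffs)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    """Second-stage output: probability that the features come from an ID image."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_head(cached, epochs=2000, lr=1.0):
    """Gradient descent on the head only; the feature extractor stays frozen."""
    w = [0.0] * len(cached[0][0])
    b = 0.0
    n = len(cached)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in cached:
            err = predict(x, w, b) - y  # gradient of log loss w.r.t. the logit
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def toy_image(is_id, rng, size=8):
    """Synthetic data (an assumption): striped images play the role of IDs."""
    base = rng.uniform(0.2, 0.8)
    image = []
    for _ in range(size):
        row = []
        for c in range(size):
            value = (1.0 if c % 2 == 0 else 0.0) if is_id else base
            row.append(min(1.0, max(0.0, value + rng.uniform(-0.02, 0.02))))
        image.append(row)
    return image


def is_id_document(image, w, b, threshold=0.5):
    """Inference: frozen features first, then the trained head."""
    return predict(frozen_features(image), w, b) >= threshold


rng = random.Random(0)
labelled = [(toy_image(i % 2 == 0, rng), 1 if i % 2 == 0 else 0) for i in range(20)]
# Keep only features and labels; the images themselves are not retained.
cached = [(frozen_features(img), label) for img, label in labelled]
del labelled
w, b = train_head(cached)
```

A downstream DLP policy would then act on the boolean returned by `is_id_document`; how it does so is outside this sketch.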

First claim

Opening claim text (preview).

What is claimed is:

1. A method of detecting identification documents in images, referred to as image-borne identification documents, and protecting against loss by image exfiltration of the identification documents, including: using a trained deep learning (abbreviated DL) stack to classify at least one production image by inference as containing a sensitive image-borne identification document; wherein the trained production DL stack is configured with parameters determined using labelled ground truth data for the image-borne identification documents and examples of other image documents; and wherein the trained DL stack is further configured to include at least a first set of layers closer to an input layer and a second set of layers further from the input layer, wherein the first set of layers were pre-trained to perform image recognition before exposing the second set of layers of the DL stack to the labelled ground truth data for the image-borne identification documents and examples of other image documents; and using the inferred classification of the sensitive image-borne identification document in a document loss prevention system to protect against loss by image exfiltration.

2. The method of claim 1, further including for private image-borne identification documents, capturing features produced as output from the first set of layers and retaining the captured features together with respective ground truth labels, thereby eliminating any need to retain images of the private image-borne identification documents.

3. The method of claim 1, further including restricting training by backward propagation using the labelled ground truth data for the image-borne identification documents and the examples of other image documents to training of parameters in the second set of layers.

4. The method of claim 1, wherein optical character recognition (abbreviated OCR) analysis of images is applied to label the images as identification documents or non-identification documents.

5. The method of claim 1, wherein a first set of the image-borne identification documents is distorted in perspective to produce a second set of the image-borne identification documents and combining the first and second sets with the labelled ground truth data when training the DL stack by back propagation.

6. The method of claim 1, wherein a first set of the image-borne identification documents is distorted by noise to produce a third set of the image-borne identification documents and combining the first and third sets of the image-borne identification documents with the labelled ground truth data when training the DL stack by back propagation.

7. The method of claim 1, wherein a first set of the image-borne identification documents is distorted in focus to produce a fourth set of the image-borne identification documents and combining the first and fourth sets of the image-borne identification documents with the labelled ground truth data when training the DL stack by back propagation.

8. A tangible non-transitory computer readable storage media, including program instructions loaded into memory that, when executed on processors, cause the processors to implement a method of detecting identification documents in images, referred to as image-borne identification documents, and protecting against loss of the image-borne identification documents, the method including: using a trained deep learning (abbreviated DL) stack to classify at least one production image by inference as containing a sensitive image-borne identification document; wherein the trained production DL stack is configured with parameters determined using labelled ground truth data for the image-borne identification documents and examples of other image documents; and wherein the trained DL stack is further configured to include at least a first set of layers closer to an input layer and a second set of layers further from the input layer, wherein the first set of layers were pre-trained to perform image recognition before exposing the second set of layers of the DL stack to the labelled ground truth data for the image-borne identification documents and examples of other image documents; and using the inferred classification of the sensitive image-borne identification document in a document loss prevention system to protect against loss by image exfiltration.

9. The tangible non-transitory computer readable storage media of claim 8, further including for private image-borne identification documents, capturing features produced as output from the first set of layers and retaining the captured features together with respective ground truth labels, thereby eliminating any need to retain images of the private image-borne identification documents.

10. The tangible non-transitory computer readable storage media of claim 8, further including restricting training by backward propagation using the labelled ground truth data for the image-borne identification documents and the examples of other image documents to training of parameters in the second set of layers.

11. The tangible non-transitory computer readable storage media of claim 8, wherein optical character recognition (abbreviated OCR) analysis of images is applied to label the images as identification documents or non-identification documents.

12. The tangible non-transitory computer readable storage media of claim 8, wherein a first set of the image-borne identification documents is distorted in perspective to produce a second set of the image-borne identification documents and combining the first and second sets with the labelled ground truth data when training the DL stack by back propagation.

13. The tangible non-transitory computer readable storage media of claim 8, wherein a first set of the image-borne identification documents is distorted by noise to produce a third set of the image-borne identification documents and combining the first and third sets of the image-borne identification documents with the labelled ground truth data when training the DL stack by back propagation.

14. A system for detecting identification documents in images, referred to as image-borne identification documents, and protecting against loss of the image-borne identification documents, the system including a processor, memory coupled to the processor, and computer instructions from the non-transitory computer readable storage media of claim 8 loaded into the memory.

15. The system of claim 14, further including for private image-borne identification documents, capturing features produced as output from the first set of layers and retaining the captured features together with respective ground truth labels, thereby eliminating any need to retain images of the private image-borne identification documents.

16. The system of claim 14, further including restricting training by backward propagation using the labelled ground truth data for the image-borne identification documents and the examples of other image documents to training of parameters in the second set of layers.

17. The system of claim 14, wherein optical character recognition (abbreviated OCR) analysis of images is applied to label the images as identification documents or non-identification documents.

18. The system of claim 14, wherein a first set of the image-borne identification documents is distorted in perspective to produce a second set of the image-borne identification documents and combining the first and second sets with the labelled ground truth data when training the DL stack by back propagation.
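The dependent claims on distorted training sets (perspective, noise, focus) describe a form of data augmentation. The sketch below is one hedged way to picture it on tiny grayscale images stored as lists of rows. The function names are hypothetical, and the distortions are deliberately crude approximations chosen for brevity: a row-narrowing "keystone" warp instead of a true perspective transform, Gaussian pixel noise, and a 3x3 box blur in place of an optical defocus model. The patent text does not specify these particular operations.

```python
import math
import random


def keystone(image, squeeze=0.5, fill=0.0):
    """Perspective-like distortion: rows get narrower toward the top.

    Uses nearest-neighbour sampling; pixels mapped from outside the
    source row are set to `fill`.
    """
    h, w = len(image), len(image[0])
    cx = (w - 1) / 2.0
    out = []
    for r, row in enumerate(image):
        t = r / (h - 1) if h > 1 else 1.0
        scale = squeeze + (1.0 - squeeze) * t  # `squeeze` at top, 1.0 at bottom
        new_row = []
        for c in range(w):
            src = int(math.floor(cx + (c - cx) / scale + 0.5))
            new_row.append(row[src] if 0 <= src < w else fill)
        out.append(new_row)
    return out


def add_noise(image, rng, sigma=0.05):
    """Noise distortion: Gaussian noise per pixel, clamped to [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in image]


def defocus(image):
    """Focus distortion, approximated by a 3x3 box blur with edge clipping."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(h):
        new_row = []
        for c in range(w):
            window = [image[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            new_row.append(sum(window) / len(window))
        out.append(new_row)
    return out


def augment(labelled, rng):
    """Combine each original with its three distorted copies, keeping labels."""
    combined = []
    for image, label in labelled:
        combined.append((image, label))                  # first set
        combined.append((keystone(image), label))        # perspective
        combined.append((add_noise(image, rng), label))  # noise
        combined.append((defocus(image), label))         # focus
    return combined


rng = random.Random(0)
original = [[float((r + c) % 2) for c in range(6)] for r in range(6)]
training_set = augment([(original, 1)], rng)
```

Each distorted copy inherits the ground-truth label of its source image, so the combined set can be fed to whatever training routine is in use without relabelling.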

Assignees

Netskope Inc

Inventors

Classifications

Code	CPC title
G06N3/0464	Convolutional networks [CNN, ConvNet]
G06N3/09	Supervised learning
G06N3/096	Transfer learning
G06N3/098	Distributed learning, e.g. federated learning
G06V30/40	Document-oriented image-based pattern recognition

Patent family

Related publications grouped by family.

View patent family 75587549

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11574151B2 cover?: Disclosed is detecting identification documents in image-borne identification documents and protecting against loss of the image-borne identification documents. A trained deep learning (DL) stack is used to classify production images by inference as containing a sensitive image-borne identification document, with the trained stack configured with parameters determined using labelled ground trut…
Who is the assignee on this patent?: Netskope Inc
What technology area does this patent fall under?: Primary CPC classification G06N3/084. Mapped technology areas include Physics.
When was this patent published?: Published Feb 7, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).