Generating a document assembly object and derived checks

US12499703B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12499703-B2
Application number: US-202318193669-A
Country: US
Kind code: B2
Filing date: Mar 31, 2023
Priority date: Dec 30, 2022
Publication date: Dec 16, 2025
Grant date: Dec 16, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosure includes a system and method for obtaining a document specification in an electronic format, wherein the document specification is associated with a first document, and describes features present in valid instances of the first document; determining a set of labels describing the first document from the document specification; obtaining one or more digital images of at least one valid instance of the first document from the document specification; obtaining information describing a set of bounding boxes resulting from application, to the one or more images of the at least one valid instance of the first document, of one or more of optical character recognition and object detection; generating a set of derived checks based on the set of bounding boxes; and generating a document assembly object describing valid instances of the document and the set of derived checks usable to determine validity of a document under test.
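
The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `BoundingBox` fields, the label names, the `box_location` check type, the tolerance value, and the JSON layout are all assumptions made for the example, and the bounding boxes are hard-coded rather than produced by real optical character recognition or object detection.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class BoundingBox:
    label: str  # hypothetical label name, e.g. "date_of_birth"
    x: float    # left edge, normalized to document width (0..1)
    y: float    # top edge, normalized to document height (0..1)
    w: float    # width, normalized
    h: float    # height, normalized


def derive_checks(boxes, tolerance=0.02):
    """Derive one presence-and-location check per reference bounding box."""
    return [
        {"check": "box_location", "label": b.label,
         "expected": asdict(b), "tolerance": tolerance}
        for b in boxes
    ]


def build_assembly_object(document_id, labels, reference_boxes):
    """Bundle labels and derived checks into one plain, serializable object."""
    return {
        "document": document_id,
        "labels": sorted(labels),
        "derived_checks": derive_checks(reference_boxes),
    }


def run_checks(assembly, observed_boxes):
    """Return {label: passed} for a document under test."""
    observed = {b.label: b for b in observed_boxes}
    results = {}
    for check in assembly["derived_checks"]:
        expected, tol = check["expected"], check["tolerance"]
        got = observed.get(check["label"])
        # A check passes only if the box exists and every coordinate is within tolerance.
        results[check["label"]] = got is not None and all(
            abs(getattr(got, key) - expected[key]) <= tol
            for key in ("x", "y", "w", "h")
        )
    return results


# Boxes that OCR / object detection might report for a valid reference image (made up).
reference = [
    BoundingBox("date_of_birth", 0.40, 0.55, 0.20, 0.05),
    BoundingBox("ghost_image", 0.75, 0.60, 0.15, 0.25),
]
assembly = build_assembly_object(
    "example_license_v1", {"date_of_birth", "ghost_image"}, reference
)
print(json.dumps(assembly, indent=2))

# A document under test that is missing the ghost image.
under_test = [BoundingBox("date_of_birth", 0.41, 0.55, 0.20, 0.05)]
print(run_checks(assembly, under_test))
# → {'date_of_birth': True, 'ghost_image': False}
```

Serializing the object as JSON is one way to keep it readable by both people and programs; the page does not say which serialization the patent uses.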

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: obtaining, using one or more processors, a document specification in an electronic format, wherein the document specification is associated with a first document, and describes features present in valid instances of the first document; determining, using the one or more processors, a set of labels describing the first document from the document specification; obtaining, using the one or more processors, one or more digital images of at least one valid instance of the first document from the document specification; obtaining, using the one or more processors, information describing a set of bounding boxes, the set of bounding boxes resulting from one or more of optical character recognition and object detection, the one or more of the optical character recognition and the object detection applied to the one or more images of the at least one valid instance of the first document; generating, using the one or more processors, a set of derived validity checks based on the set of bounding boxes; and generating, using the one or more processors, a document assembly object describing valid instances of the first document and the set of derived validity checks usable to determine whether an image of a document under test represents a valid instance of the first document.

2. The method of claim 1, the method further comprising: obtaining a set of test images representing multiple instances of the first document, the set of test images including a first test image; determining, based on a first derived validity check in the document assembly object, whether each test image in the set of test images is valid with respect to the first derived validity check or invalid with respect to the first derived validity check; and adjusting how subsequent determinations are made based on a presence of a false positive or false negative in a determination of the first test image with respect to the first derived validity check.

3. The method of claim 1, wherein adjusting how subsequent determinations are made includes one or more of: retraining a machine learning model associated with a first derived validity check to reduce an instance of a false positive or a false negative; and adjusting a tolerance.

4. The method of claim 1, the method further comprising: obtaining a set of valid document images, wherein each image in the set of valid document images represents a valid instance of the first document; applying pattern recognition to the set of valid document images; generating, based on a first detected pattern, a newly derived validity check; and adding the newly derived validity check to the document assembly object.

5. The method of claim 4, wherein the newly derived validity check is associated with an unpublished security feature present in the first document.

6. The method of claim 4, wherein the pattern recognition identifies a repetition in at least a portion of personally identifiable information (PII) text between two or more bounding boxes associated with a common, valid document instance in the set of valid document images, and wherein the newly derived validity check, when applied to the image of the document under test, checks for one or more of: whether a bounding box, which is associated with at least a partial repetition of PII in valid instances of the first document, is present in the document under test; whether the bounding box, which is associated with at least a partial repetition of PII in valid instances of the first document, in the document under test is in a location consistent with valid instances of the first document; and whether text content of the bounding box repeats a portion of PII text found elsewhere in the document under test that is consistent with valid instances of the first document.

7. The method of claim 1, wherein the set of bounding boxes includes a first bounding box that is associated with a ghost image.

8. The method of claim 1, wherein the set of bounding boxes includes a first bounding box that is associated with at least a partial repetition of PII in valid instances of the first document, is undiscernible to an average human eye absent magnification.

9. The method of claim 1, wherein the electronic format is one of hypertext markup language and printable document format and published by a trusted source.

10. The method of claim 1, wherein the document assembly object is human and machine readable.

11. A system comprising: a processor; and a memory, the memory storing instructions that, when executed by the processor, cause the system to: obtain a document specification in an electronic format, wherein the document specification is associated with a first document, and describes features present in valid instances of the first document; determine a set of labels describing the first document from the document specification; obtain one or more digital images of at least one valid instance of the first document from the document specification; obtain information describing a set of bounding boxes, the set of bounding boxes resulting from one or more of optical character recognition and object detection, the one or more of the optical character recognition and the object detection applied to the one or more images of the at least one valid instance of the first document; generate a set of derived validity checks based on the set of bounding boxes; and generate a document assembly object describing valid instances of the first document and the set of derived validity checks usable to determine whether an image of a document under test represents a valid instance of the first document.

12. The system of claim 11, wherein the instructions, when executed, cause the system to: obtain a set of test images representing multiple instances of the first document, the set of test images including a first test image; determine, based on a first derived validity check in the document assembly object, whether each image in the set of test images is valid with respect to the first derived validity check or invalid with respect to the first derived validity check; and adjust how subsequent determinations are made based on a presence of a false positive or false negative in a determination of the first test image with respect to the first derived validity check.

13. The system of claim 11, wherein adjusting how subsequent determinations are made includes one or more of: retraining a machine learning model associated with a first derived validity check to reduce an instance of a false positive or a false negative; and adjusting a tolerance.

14. The system of claim 11, wherein the instructions, when executed, cause the system to: obtain a set of valid document images, wherein each image in the set of valid document images represents a valid instance of the first document; apply pattern recognition to the set of valid document images; generate, based on a first detected pattern, a newly derived validity check; and add the newly derived validity check to the document assembly object.

15. The system of claim 14, wherein the newly derived validity check is associated with an unpublished security feature present in the first document.

16. The system of claim 14, wherein the pattern recognition identifies a repetition in at least a portion of personally identifiable information (PII) text between two or more bounding boxes associated with a common, valid document instance in the set of valid document images, and wherein the newly derived v
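
Claims 2 and 3 (and their system counterparts 12 and 13) describe evaluating a derived validity check against test images and adjusting a tolerance when false positives or false negatives appear. The sketch below shows one simple way that loop could look. It is an assumption-laden illustration: the `LabeledImage` type, the pixel-offset measurement, the ground-truth labels, and the step-by-one adjustment rule are all hypothetical, and the alternative named in the claims (retraining a machine learning model) is not shown.

```python
from dataclasses import dataclass


@dataclass
class LabeledImage:
    offset_px: int   # measured deviation of a box from its expected location
    is_valid: bool   # ground truth for this test image (assumed to be known)


def evaluate(cases, tolerance_px):
    """Count false positives (invalid but accepted) and false negatives (valid but rejected)."""
    false_pos = sum(1 for c in cases if c.offset_px <= tolerance_px and not c.is_valid)
    false_neg = sum(1 for c in cases if c.offset_px > tolerance_px and c.is_valid)
    return false_pos, false_neg


def adjust_tolerance(cases, tolerance_px, step=1, max_rounds=20):
    """Widen the tolerance when false negatives dominate, tighten it when false positives do."""
    for _ in range(max_rounds):
        false_pos, false_neg = evaluate(cases, tolerance_px)
        if false_pos == false_neg:
            # Either both are zero or neither direction is clearly better; stop here.
            break
        tolerance_px += step if false_neg > false_pos else -step
    return tolerance_px


cases = [
    LabeledImage(1, True),
    LabeledImage(3, True),
    LabeledImage(6, True),    # valid, but rejected at a 4 px tolerance
    LabeledImage(9, False),
    LabeledImage(15, False),
]

print(evaluate(cases, 4))          # → (0, 1)
print(adjust_tolerance(cases, 4))  # → 6
```

The rule stops as soon as the two error counts balance, which keeps the example short; a production system would more likely weigh the two error types differently, since accepting a forged document and rejecting a genuine one rarely cost the same.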

Assignees

Jumio Corp

Inventors

Classifications

CPC titles:

  • Validation; Performance evaluation

  • Classification, e.g. identification

  • Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables

  • Classification techniques

  • Determination of region of interest

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12499703B2 cover?
The disclosure includes a system and method for obtaining a document specification in an electronic format, wherein the document specification is associated with a first document, and describes features present in valid instances of the first document; determining a set of labels describing the first document from the document specification; obtaining one or more digital images of at least one …
Who is the assignee on this patent?
Jumio Corp
What technology area does this patent fall under?
Primary CPC classification G06V30/414. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 16, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).