Machine-learning models for image processing

US12456322B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12456322-B2
Application number: US-202519218062-A
Country: US
Kind code: B2
Filing date: May 23, 2025
Priority date: Apr 8, 2024
Publication date: Oct 28, 2025
Grant date: Oct 28, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Presented herein are systems and methods for the employment of machine learning models for image processing as may be performed by computing devices associated with an end user. A method may include obtaining video data comprising a plurality of frames including a document of a document type. The method may include executing an object recognition engine of a machine-learning architecture using image data of the plurality of frames, the object recognition engine trained to detect edges of documents. The method may include identifying, based on the edge detection, a plurality of boundaries for the document. The method may include validating, based on the plurality of boundaries, the document as the document type. The method may include transmitting via one or more networks, to a computer remote from the computing device, responsive to the validation of the type of document, the image data for the plurality of frames depicting the document.
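
The boundary-based validation step described in the abstract can be pictured with a minimal sketch. The abstract does not disclose a specific formula, so the code below is only one plausible reading: it assumes the detected boundaries are reduced to four ordered corner points and compares their aspect ratio with an expected ratio per document type. The names `EXPECTED_ASPECT_RATIOS`, `aspect_ratio`, and `validate_document_type`, the tolerance, and the ratio for a check are all illustrative assumptions, not taken from the patent.

```python
import math

# Hypothetical expected long-side / short-side ratios per document type.
EXPECTED_ASPECT_RATIOS = {
    "check": 2.18,     # assumed value, roughly a 6.0 in x 2.75 in personal check
    "id_card": 1.586,  # ISO/IEC 7810 ID-1 card, 85.60 mm x 53.98 mm
}


def side_lengths(corners):
    """Lengths of the four sides of a quadrilateral with ordered corners."""
    return [math.dist(corners[i], corners[(i + 1) % 4]) for i in range(4)]


def aspect_ratio(corners):
    """Long side over short side, averaging each pair of opposite sides."""
    top, right, bottom, left = side_lengths(corners)
    width = (top + bottom) / 2
    height = (left + right) / 2
    return max(width, height) / min(width, height)


def validate_document_type(corners, doc_type, tolerance=0.08):
    """True when the observed ratio is within a relative tolerance of the expected one."""
    expected = EXPECTED_ASPECT_RATIOS[doc_type]
    observed = aspect_ratio(corners)
    return abs(observed - expected) / expected <= tolerance


# Corner points (in pixels) as an edge detector might report them.
corners = [(0, 0), (856, 0), (856, 540), (0, 540)]
print(validate_document_type(corners, "id_card"))
print(validate_document_type(corners, "check"))
```

In a video setting, a client could run this check per frame and transmit only the frames that pass. A real implementation would also need to correct for perspective distortion before comparing dimensions, which this sketch ignores.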

First claim

Opening claim text (preview).

What is claimed is:

1. A method of client-side operation validations for remote processing of document imagery, the method comprising: obtaining, by a mobile client device associated with an end-user, an operation request via a user interface of the mobile client device; obtaining, by a camera of the mobile client device, image data including a document and environment imagery around the document; executing, the mobile client device, an object recognition engine to extract a set of environment features from the environment imagery, the object recognition engine trained for detecting one or more edges of the document and detecting the set of environment features using corresponding training labels indicating expected training environment imagery; generating, by the mobile client device, an operation validation score based upon an image similarity between the set of environment features and expected environment imagery, the operation validation score indicating a likelihood that the operation request is a valid operation request according to the image similarity between the set of environment features and the expected environment imagery; and in response to determining that the operation validation score satisfies an operation validation threshold, transmitting, by the mobile client device, an operation instruction for performing the operation request to a backend server.

2. The method according to claim 1, further comprising generating, by the mobile client device, the operation instruction including operation information and the image data having the document.

3. The method according to claim 1, further comprising obtaining, by the camera of the mobile client device, video data comprising a plurality of frames, including a frame having the image data including the environment imagery around the document.

4. The method according to claim 3, wherein the image data of a preceding frame of the video data indicates the expected environment imagery.

5. The method according to claim 3, further comprising executing, the mobile client device, the object recognition engine using the video data as input to identify each frame of the plurality of frames having a portion of the image data containing the document.

6. The method according to claim 1, further comprising identifying, by the mobile client device, a dimension similarity between the document and a document type of the document, based upon comparing the one or more edges of the document in a set of document dimension features against a predefined set of dimension features for the document type of the document.

7. The method according to claim 1, further comprising executing, by the mobile client device, the object recognition engine to extract a set of content features representing content data of the document from the image data.

8. The method according to claim 7, further comprising identifying, by the mobile client device, a content similarity between the content data of the document as extracted from the image data and expected content data of a predefined set of content features for a document type of the document.

9. The method according to claim 7, further comprising identifying, by the mobile client device, a content similarity between the content data of the document as extracted from the image data and expected content data received via the user interface of the mobile client device.

10. The method according to claim 1, further comprising: generating, by the mobile client device, a quality score for the document using at least the set of environment features of the image data; and generating, by the mobile client device, an output indicator for display at the user interface of the mobile client device based upon based upon comparing the quality score against a quality threshold.

11. A system for client-side operation validations for remote processing document imagery, the system comprising: a mobile client device associated with an end-user comprising at least one processor and a camera, configured to: obtain an operation request via a user interface of the mobile client device; obtain, by the camera, image data including a document and environment imagery around the document; execute an object recognition engine to extract a set of environment features from the environment imagery, the object recognition engine trained for detecting one or more edges of the document and detecting the set of environment features using corresponding training labels indicating expected training environment imagery; generate an operation validation score based upon an image similarity between the set of environment features and expected environment imagery, the operation validation score indicating a likelihood that the operation request is a valid operation request according to the image similarity between the set of environment features and the expected environment imagery; and in response to determining that the operation validation score satisfies an operation validation threshold, transmitting an operation instruction for performing the operation request to a backend server.

12. The system according to claim 11, wherein the mobile device is further configured to generate the operation instruction including operation information and the image data having the document.

13. The system according to claim 11, wherein the mobile device is further configured to obtain, by the camera, video data comprising a plurality of frames, including a frame having the image data including the environment imagery around the document.

14. The system according to claim 13, wherein the image data of a preceding frame of the video data indicates the expected environment imagery.

15. The system according to claim 13, wherein the mobile device is further configured to execute the object recognition engine using the video data as input to identify each frame of the plurality of frames having a portion of the image data containing the document.

16. The system according to claim 11, wherein the mobile device is further configured to identify a dimension similarity between the document and a document type of the document, based upon comparing the one or more edges of the document in a set of document dimension features against a predefined set of dimension features for the document type of the document.

17. The system according to claim 11, wherein the mobile device is further configured to execute the object recognition engine to extract a set of content features representing content data of the document from the image data.

18. The system according to claim 17, wherein the mobile device is further configured to identify a content similarity between the content data of the document as extracted from the image data and expected content data of a predefined set of content features for a document type of the document.

19. The system according to claim 17, wherein the mobile device is further configured to identify a content similarity between the content data of the document as extracted from the image data and expected content data received via the user interface of the mobile client device.

20. The system according to claim 11, wherein the mobile device is further configured to: generate a quality score for the document using at least the set of environment features of the image data; and generate an output indicator for display at the user interface of the mobile client device based upon based upon comparing the quality sc
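
The score-and-threshold gating in claims 1 and 11 can be illustrated with a short sketch. The claims do not say how the image similarity is computed, so this example swaps in cosine similarity over plain feature vectors as a stand-in. The function names, the rescaling to the range 0 to 1, the threshold value, the reading of "satisfies" as greater than or equal, and the shape of the returned instruction are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def operation_validation_score(env_features, expected_features):
    """Rescale cosine similarity from [-1, 1] to [0, 1] so it reads as a likelihood-style score."""
    return (cosine_similarity(env_features, expected_features) + 1) / 2


def build_instruction_if_valid(operation_request, env_features, expected_features,
                               threshold=0.9):
    """Return a hypothetical operation instruction only when the score meets the threshold."""
    score = operation_validation_score(env_features, expected_features)
    if score < threshold:
        return None  # nothing is sent to the backend server
    return {"operation": operation_request, "validation_score": score}


# Environment features from the current frame vs. the expected environment.
current = [0.8, 0.1, 0.6]
expected = [0.8, 0.1, 0.6]
print(build_instruction_if_valid("deposit", current, expected) is not None)
print(build_instruction_if_valid("deposit", [1.0, 0.0], [0.0, 1.0]))
```

Under claims 4 and 14, the expected vector could be derived from a preceding video frame rather than from a stored reference. Because the check runs on the client, a rejected request never leaves the device.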

Assignees

Citibank Na

Inventors

Classifications

  • Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

  • Target detection

  • Proximity, similarity or dissimilarity measures

  • Image quality inspection

  • Inspection of images, e.g. flaw detection

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12456322B2 cover?
Presented herein are systems and methods for the employment of machine learning models for image processing as may be performed by computing devices associated with an end user. A method may include obtaining video data comprising a plurality of frames including a document of a document type. The method may include executing an object recognition engine of a machine-learning architecture using …
Who is the assignee on this patent?
Citibank Na
What technology area does this patent fall under?
Primary CPC classification G06V30/414. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 28, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).