What technology area does this patent fall under?

Primary CPC classification G06T11/00. Mapped technology areas include Physics.

When was this patent published?

Publication date Jul 29, 2025 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Methods and systems for using compact object image data to construct a machine learning model for pose estimation of an object

US12373995B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12373995-B2
Application number	US-202318221032-A
Country	US
Kind code	B2
Filing date	Jul 12, 2023
Priority date	Jul 12, 2023
Publication date	Jul 29, 2025
Grant date	Jul 29, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An illustrative model construction system may access object image data representative of one or more images depicting an object having a plurality of labeled keypoint features. Based on the object image dataset, the model construction system may generate a training target dataset including a plurality of training target images. Each training target image may be generated by selecting a background image distinct from the object image data, manipulating a depiction of the object represented within the object image data, and overlaying the manipulated depiction of the object onto the selected background image with the labeled keypoint features. Based on this training target dataset, the model construction system may train a machine learning model to recognize and estimate a pose of the object when the object is depicted in input images analyzed using the trained machine learning model. Corresponding methods and systems are also disclosed.
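
The generation step described in the abstract can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the patented implementation: images are small grayscale grids stored as nested lists, `None` marks a transparent object pixel, rotation is limited to quarter turns, scaling to integer factors, and every function name (`rotate90`, `scale`, `make_training_target`) is hypothetical.

```python
import random


def rotate90(img, pts):
    """Rotate a pixel grid 90 degrees clockwise and move (x, y) keypoints with it."""
    h = len(img)
    rotated = [list(row) for row in zip(*img[::-1])]
    # A pixel at (x, y) lands at (h - 1 - y, x) after a clockwise quarter turn.
    return rotated, [(h - 1 - y, x) for x, y in pts]


def scale(img, pts, k):
    """Nearest-neighbour upscale by an integer factor k (assumed simplification)."""
    scaled = [[px for px in row for _ in range(k)] for row in img for _ in range(k)]
    # Each keypoint is mapped to the top-left pixel of its enlarged block.
    return scaled, [(x * k, y * k) for x, y in pts]


def make_training_target(obj, keypoints, backgrounds, rng):
    """Return (image, keypoints) for one randomized training target image."""
    # 1. Select a background image (copied so the library is not modified).
    image = [row[:] for row in rng.choice(backgrounds)]
    # 2. Manipulate the object depiction: random quarter turns, then scaling.
    for _ in range(rng.randrange(4)):
        obj, keypoints = rotate90(obj, keypoints)
    obj, keypoints = scale(obj, keypoints, rng.choice([1, 2]))
    # 3. Overlay the manipulated depiction at a random offset that fits.
    oh, ow = len(obj), len(obj[0])
    ox = rng.randrange(len(image[0]) - ow + 1)
    oy = rng.randrange(len(image) - oh + 1)
    for y in range(oh):
        for x in range(ow):
            if obj[y][x] is not None:  # None = transparent pixel
                image[oy + y][ox + x] = obj[y][x]
    # The keypoint labels follow the object into background coordinates.
    return image, [(x + ox, y + oy) for x, y in keypoints]


# Usage: a 3x2 "object" with two labeled keypoints on an 8x8 empty background.
obj = [[1, 2, 3], [4, 5, 6]]
kps = [(0, 0), (2, 1)]
bg = [[0] * 8 for _ in range(8)]
image, labels = make_training_target(obj, kps, [bg], random.Random(0))
```

Because the keypoints pass through the same transforms as the pixels, each generated image carries its labels with it and needs no further annotation.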

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: accessing, by a model construction system, object image data representative of one or more images depicting an object having a plurality of labeled keypoint features; generating, by the model construction system and based on the object image data, a training target dataset including a plurality of training target images, wherein a particular training target image of the plurality of training target images is generated by: selecting a background image from a set of background images distinct from the object image data, manipulating a depiction of the object represented within the object image data, and overlaying the manipulated depiction of the object onto the selected background image together with an indication of labeled keypoint features for the manipulated depiction; and training, by the model construction system based on the training target dataset, a machine learning model to recognize and estimate a pose of the object when the object is depicted in input images analyzed using the trained machine learning model.
2. The method of claim 1, wherein the manipulating of the depiction of the object includes performing at least one of: a rotation operation to rotate the depiction of the object with respect to two or three spatial dimensions; or a scaling operation to change a size of the depiction of the object.
3. The method of claim 1, wherein the manipulating of the depiction of the object includes performing a cropping operation to simulate an occlusion of a portion of the depiction of the object.
4. The method of claim 1, wherein: the set of background images corresponds to a background image library configured for use in generating training target datasets for a variety of objects including the object and other objects; and the background image library comprises background images that are captured independently from image capture associated with the object.
5. The method of claim 1, wherein: the particular training target image is further generated by applying, subsequent to the overlaying of the manipulated depiction of the object onto the selected background image, an image processing operation to the particular training target image; and the image processing operation changes at least one of a contrast attribute, a color attribute, or a saturation attribute of the particular training target image.
6. The method of claim 1, further comprising: recognizing, based on the machine learning model, the object as depicted within a particular input image captured by an augmented reality presentation device; and providing, based on the machine learning model, an estimate of the pose of the object for use by the augmented reality presentation device in an augmented reality application associated with the object.
7. The method of claim 6, wherein: the augmented reality application is configured to assist a user in performing a particular action with respect to the object; and the estimate of the pose of the object is used by the augmented reality application to augment the particular input image with integrated guidance information for the performing of the particular action.
8. The method of claim 1, wherein the accessing of the object image data includes: receiving a set of preliminary images depicting the object from a plurality of different vantage points; and generating the object image data based on the set of preliminary images by: cropping each preliminary image around the object, manipulating each preliminary image to simulate a straight-on vantage point of the object, and automatically labeling, or providing a user interface for manually labeling, keypoint features of the object within each preliminary image subsequent to the cropping and manipulating of the preliminary image.
9. The method of claim 1, wherein: the plurality of training target images are randomized training target images; and the particular training target image is generated using one or more random or pseudorandom values to perform: the selecting of the background image from the set of background images; the manipulating of the depiction of the object; and the overlaying of the manipulated depiction of the object onto the selected background image.
10. The method of claim 1, wherein: the particular training target image is further generated by manipulating an additional depiction of the object represented within the object image data; and overlaying the additional manipulated depiction of the object onto the selected background image together with the manipulated depiction of the object.
11. The method of claim 1, wherein the machine learning model is implemented by a convolutional neural network that includes: a backbone component configured to progressively process the input images using a series of convolutional layers; and an anchor-based model head component configured to designate a plurality of anchor areas within the input images and to search each of the plurality of anchor areas for an instance of the object.
12. The method of claim 1, wherein the machine learning model is implemented by a convolutional neural network that includes: a backbone component configured to progressively process the input images using a series of convolutional layers; and a segmentation-based model head component configured to semantically segment the input images to differentiate instances of the object from other image content depicted in the input images.
13. The method of claim 1, wherein: the object is a switchboard that includes a plurality of switchboard panels; and the machine learning model is trained to recognize the object in two phases including: a first phase in which a prospective recognition of the switchboard is performed based on a subset of keypoint features identified within the input images, and a second phase in which a confirmed recognition of the switchboard is performed based on the prospective recognition of the first phase and based on respective full sets of keypoint features identified within the input images for each of the plurality of switchboard panels.
14. A system comprising: a memory storing instructions; and one or more processors communicatively coupled to the memory and configured to execute the instructions to perform a process comprising: accessing object image data representative of one or more images depicting an object having a plurality of labeled keypoint features; generating, based on the object image data, a training target dataset including a plurality of training target images, wherein a particular training target image of the plurality of training target images is generated by: selecting a background image from a set of background images distinct from the object image data, manipulating a depiction of the object represented within the object image data, and overlaying the manipulated depiction of the object onto the selected background image together with an indication of labeled keypoint features for the manipulated depiction; and training, based on the training target dataset, a machine learning model to recognize and estimate a pose of the object when the object is depicted in input images analyzed using the trained machine learning model.
15. The system of claim 14, wherein the manipulating of the depiction of the object includes performing at least one of: a rotation operation to rotate the depiction of the object with respect to two or three spatial dimensions; or a scaling operation to change a size of the depiction of the object.
16. The system…
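
Two of the dependent claims lend themselves to a short illustration: the cropping operation that simulates occlusion (claim 3) and the post-overlay image processing operation that changes contrast (claim 5). The Python below is a minimal sketch under stated assumptions: images are 8-bit grayscale grids stored as nested lists, the crop keeps only the leftmost columns, and the per-keypoint visibility flag is an assumption of this sketch, since the claims do not say how labels for cropped-away keypoints are handled. The names `crop_to_occlude` and `adjust_contrast` are hypothetical.

```python
def crop_to_occlude(obj, keypoints, keep_cols):
    """Keep the leftmost keep_cols columns of the object to mimic an occlusion.

    Returns the cropped grid and (x, y, visible) labels; the visibility flag
    is an assumption of this sketch, not something stated in the claims.
    """
    cropped = [row[:keep_cols] for row in obj]
    labels = [(x, y, x < keep_cols) for x, y in keypoints]
    return cropped, labels


def adjust_contrast(image, factor, mid=128):
    """Scale each pixel's distance from mid by factor, clamped to 0..255."""
    return [
        [max(0, min(255, round(mid + (px - mid) * factor))) for px in row]
        for row in image
    ]


# Usage: hide the right-hand column of a 3x2 object, then boost contrast.
cropped, labels = crop_to_occlude([[10, 20, 30], [40, 50, 60]], [(0, 0), (2, 1)], 2)
boosted = adjust_contrast(cropped, 2.0)
```

Applying the contrast change after compositing, as claim 5 specifies, means the object and the background are altered together rather than separately.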

Assignees

Verizon Patent & Licensing Inc

Inventors

Classifications

G06T3/60
Rotation of whole images or parts thereof · CPC title
G06T2207/20132
Image cropping · CPC title
G06T2207/20081
Training; Learning · CPC title
G06T19/006
Mixed reality (object pose determination, tracking or camera calibration for mixed reality G06T7/00) · CPC title
G06T2207/20084
Artificial neural networks [ANN] · CPC title

Patent family

Related publications grouped by family.

View patent family 94211498

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12373995B2 cover?: An illustrative model construction system may access object image data representative of one or more images depicting an object having a plurality of labeled keypoint features. Based on the object image dataset, the model construction system may generate a training target dataset including a plurality of training target images. Each training target image may be generated by selecting a backgrou…
Who is the assignee on this patent?: Verizon Patent & Licensing Inc

Related patents

Generating and visualizing planar surfaces within a three-dimensional space for modifying objects in a two-dimensional editing interface

Tracking and 3d reconstruction of unknown objects

3d pose estimation in robotics

Systems and methods for refined object estimation from image data

Techniques to place objects using neural networks

Segmenting objects in digital images utilizing a multi-object segmentation model framework

Method and apparatus with object pose estimation
