Generating and visualizing planar surfaces within a three-dimensional space for modifying objects in a two-dimensional editing interface
US-2024378832-A1 · Nov 14, 2024 · US
US12373995B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12373995-B2 |
| Application number | US-202318221032-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 12, 2023 |
| Priority date | Jul 12, 2023 |
| Publication date | Jul 29, 2025 |
| Grant date | Jul 29, 2025 |
Official abstract text for this publication.
An illustrative model construction system may access object image data representative of one or more images depicting an object having a plurality of labeled keypoint features. Based on the object image dataset, the model construction system may generate a training target dataset including a plurality of training target images. Each training target image may be generated by selecting a background image distinct from the object image data, manipulating a depiction of the object represented within the object image data, and overlaying the manipulated depiction of the object onto the selected background image with the labeled keypoint features. Based on this training target dataset, the model construction system may train a machine learning model to recognize and estimate a pose of the object when the object is depicted in input images analyzed using the trained machine learning model. Corresponding methods and systems are also disclosed.
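The generation step in the abstract (select a background, manipulate the object depiction, overlay it together with its labeled keypoints) can be illustrated with a minimal sketch. This is not the patented implementation: the function name `make_training_target`, the nested-list image representation, and the nearest-neighbour scaling are all assumptions chosen to keep the example dependency-free.

```python
def make_training_target(background, obj, keypoints, scale, top, left):
    """Overlay a scaled copy of `obj` onto a copy of `background`.

    background, obj: 2D lists of pixel values (hypothetical image format).
    keypoints: (x, y) labels in the object's own pixel coordinates.
    scale: positive scaling factor (the "manipulation" in this sketch).
    top, left: where the object's top-left corner lands in the background.
    Returns the composite image and the keypoints in composite coordinates.
    """
    obj_h, obj_w = len(obj), len(obj[0])
    new_h = max(1, round(obj_h * scale))
    new_w = max(1, round(obj_w * scale))
    image = [row[:] for row in background]  # leave the background library untouched
    for r in range(new_h):
        for c in range(new_w):
            y, x = top + r, left + c
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                # Nearest-neighbour lookup into the original object pixels.
                src_r = min(obj_h - 1, int(r / scale))
                src_c = min(obj_w - 1, int(c / scale))
                image[y][x] = obj[src_r][src_c]
    # The labels follow the same scale and offset as the pixels.
    moved = [(left + kx * scale, top + ky * scale) for kx, ky in keypoints]
    return image, moved


background = [[0] * 4 for _ in range(4)]
obj = [[1, 2], [3, 4]]
image, labels = make_training_target(background, obj, [(0, 0), (1, 1)], 1, 1, 2)
```

The point of the sketch is that the keypoint labels pass through the same transform as the pixels, so every synthetic image comes with labels at no extra annotation cost.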
Opening claim text (preview).
What is claimed is:

1. A method comprising: accessing, by a model construction system, object image data representative of one or more images depicting an object having a plurality of labeled keypoint features; generating, by the model construction system and based on the object image data, a training target dataset including a plurality of training target images, wherein a particular training target image of the plurality of training target images is generated by: selecting a background image from a set of background images distinct from the object image data, manipulating a depiction of the object represented within the object image data, and overlaying the manipulated depiction of the object onto the selected background image together with an indication of labeled keypoint features for the manipulated depiction; and training, by the model construction system based on the training target dataset, a machine learning model to recognize and estimate a pose of the object when the object is depicted in input images analyzed using the trained machine learning model.

2. The method of claim 1, wherein the manipulating of the depiction of the object includes performing at least one of: a rotation operation to rotate the depiction of the object with respect to two or three spatial dimensions; or a scaling operation to change a size of the depiction of the object.

3. The method of claim 1, wherein the manipulating of the depiction of the object includes performing a cropping operation to simulate an occlusion of a portion of the depiction of the object.

4. The method of claim 1, wherein: the set of background images corresponds to a background image library configured for use in generating training target datasets for a variety of objects including the object and other objects; and the background image library comprises background images that are captured independently from image capture associated with the object.

5. The method of claim 1, wherein: the particular training target image is further generated by applying, subsequent to the overlaying of the manipulated depiction of the object onto the selected background image, an image processing operation to the particular training target image; and the image processing operation changes at least one of a contrast attribute, a color attribute, or a saturation attribute of the particular training target image.

6. The method of claim 1, further comprising: recognizing, based on the machine learning model, the object as depicted within a particular input image captured by an augmented reality presentation device; and providing, based on the machine learning model, an estimate of the pose of the object for use by the augmented reality presentation device in an augmented reality application associated with the object.

7. The method of claim 6, wherein: the augmented reality application is configured to assist a user in performing a particular action with respect to the object; and the estimate of the pose of the object is used by the augmented reality application to augment the particular input image with integrated guidance information for the performing of the particular action.

8. The method of claim 1, wherein the accessing of the object image data includes: receiving a set of preliminary images depicting the object from a plurality of different vantage points; and generating the object image data based on the set of preliminary images by: cropping each preliminary image around the object, manipulating each preliminary image to simulate a straight-on vantage point of the object, and automatically labeling, or providing a user interface for manually labeling, keypoint features of the object within each preliminary image subsequent to the cropping and manipulating of the preliminary image.

9. The method of claim 1, wherein: the plurality of training target images are randomized training target images; and the particular training target image is generated using one or more random or pseudorandom values to perform: the selecting of the background image from the set of background images; the manipulating of the depiction of the object; and the overlaying of the manipulated depiction of the object onto the selected background image.

10. The method of claim 1, wherein: the particular training target image is further generated by manipulating an additional depiction of the object represented within the object image data; and overlaying the additional manipulated depiction of the object onto the selected background image together with the manipulated depiction of the object.

11. The method of claim 1, wherein the machine learning model is implemented by a convolutional neural network that includes: a backbone component configured to progressively process the input images using a series of convolutional layers; and an anchor-based model head component configured to designate a plurality of anchor areas within the input images and to search each of the plurality of anchor areas for an instance of the object.

12. The method of claim 1, wherein the machine learning model is implemented by a convolutional neural network that includes: a backbone component configured to progressively process the input images using a series of convolutional layers; and a segmentation-based model head component configured to semantically segment the input images to differentiate instances of the object from other image content depicted in the input images.

13. The method of claim 1, wherein: the object is a switchboard that includes a plurality of switchboard panels; and the machine learning model is trained to recognize the object in two phases including: a first phase in which a prospective recognition of the switchboard is performed based on a subset of keypoint features identified within the input images, and a second phase in which a confirmed recognition of the switchboard is performed based on the prospective recognition of the first phase and based on respective full sets of keypoint features identified within the input images for each of the plurality of switchboard panels.

14. A system comprising: a memory storing instructions; and one or more processors communicatively coupled to the memory and configured to execute the instructions to perform a process comprising: accessing object image data representative of one or more images depicting an object having a plurality of labeled keypoint features; generating, based on the object image data, a training target dataset including a plurality of training target images, wherein a particular training target image of the plurality of training target images is generated by: selecting a background image from a set of background images distinct from the object image data, manipulating a depiction of the object represented within the object image data, and overlaying the manipulated depiction of the object onto the selected background image together with an indication of labeled keypoint features for the manipulated depiction; and training, based on the training target dataset, a machine learning model to recognize and estimate a pose of the object when the object is depicted in input images analyzed using the trained machine learning model.

15. The system of claim 14, wherein the manipulating of the depiction of the object includes performing at least one of: a rotation operation to rotate the depiction of the object with respect to two or three spatial dimensions; or a scaling operation to change a size of the depiction of the object.

16. The system
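The manipulations named in claims 2, 3, and 9 (rotation, scaling, occlusion by cropping, and random or pseudorandom parameter selection) all have to be mirrored in the keypoint labels. The following is a minimal sketch of that bookkeeping in two dimensions only; the function names and the parameter ranges are hypothetical and do not come from the patent.

```python
import math
import random


def transform_keypoints(keypoints, angle_deg, scale, center):
    """Rotate (x, y) keypoints about `center`, then scale about the same point."""
    cx, cy = center
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = []
    for x, y in keypoints:
        dx, dy = x - cx, y - cy
        out.append((cx + scale * (dx * cos_a - dy * sin_a),
                    cy + scale * (dx * sin_a + dy * cos_a)))
    return out


def visible_after_crop(keypoints, crop_box):
    """Keep keypoints inside crop_box = (x0, y0, x1, y1); the rest count as occluded."""
    x0, y0, x1, y1 = crop_box
    return [(x, y) for x, y in keypoints if x0 <= x < x1 and y0 <= y < y1]


def random_manipulation(rng):
    """Draw pseudorandom manipulation parameters (ranges are illustrative assumptions)."""
    return {"angle_deg": rng.uniform(-30.0, 30.0), "scale": rng.uniform(0.5, 1.5)}


rng = random.Random(0)  # seeded so a generated dataset can be reproduced
params = random_manipulation(rng)
labels = transform_keypoints([(10, 0), (0, 10)], params["angle_deg"],
                             params["scale"], center=(0, 0))
labels = visible_after_crop(labels, (-20, -20, 20, 20))
```

Seeding the generator is a design choice for this sketch: it makes the randomized dataset repeatable, which the claims neither require nor exclude.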
Rotation of whole images or parts thereof · CPC title
Image cropping · CPC title
Training; Learning · CPC title
Mixed reality (object pose determination, tracking or camera calibration for mixed reality G06T7/00) · CPC title
Artificial neural networks [ANN] · CPC title