Training and inferencing using a neural network to predict orientations of objects in images

US2025384647A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2025384647-A1
Application number  US-202519094621-A
Country             US
Kind code           A1
Filing date         Mar 28, 2025
Priority date       Nov 20, 2019
Publication date    Dec 18, 2025
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Apparatuses, systems, and techniques to identify orientations of objects within images. In at least one embodiment, one or more neural networks are trained to identify orientations of one or more objects based, at least in part, on one or more characteristics of the object other than the object's orientation.

First claim

Opening claim text (preview).

1. (canceled)

2. One or more processors, comprising: circuitry to identify, by one or more neural networks, a viewpoint of a first object in an image, wherein one or more parameters of the one or more neural networks are updated based, at least in part, on one or more labels corresponding to one or more objects of one or more training images, the one or more labels indicating one or more characteristics of the one or more objects other than an orientation of the one or more objects.

3. The one or more processors of claim 2, wherein the one or more neural networks further identify the viewpoint based, at least in part, on a collection of images of a same category as the image.

4. The one or more processors of claim 3, wherein ground truth annotations are not included in at least a portion of the collection of images.

5. The one or more processors of claim 2, wherein the one or more characteristics of the one or more objects include symmetric consistency between the image of the first object and a flipped image of the first object.

6. The one or more processors of claim 2, wherein the circuitry is further configured to use the one or more neural networks to generate a second image depicting the first object at a second orientation based, at least in part, on the viewpoint identified for the first object.

7. The one or more processors of claim 2, wherein the viewpoint of the first object is encoded on a set of parameters comprising an azimuth parameter, an elevation parameter, and a tilt parameter.

8. A system, comprising: one or more processors to identify, by one or more neural networks, a viewpoint of a first object in an image, the one or more neural networks trained at least by: generating a loss value based, at least in part, on a feature similarity between a characteristic of the first object and a corresponding characteristic of a second object; and updating the one or more neural networks according to the loss value.

9. The system of claim 8, wherein the one or more processors are further configured to train the one or more neural networks using an unlabeled training dataset comprising a plurality of images of objects of a same category.

10. The system of claim 8, wherein the loss value is further generated based, at least in part, on an image consistency loss computed based at least on a difference between a viewpoint of the first object and a viewpoint generated by a generative model.

11. The system of claim 8, wherein ground truth annotations are not included in at least a portion of the training dataset used to train the one or more neural networks.

12. The system of claim 8, wherein the one or more processors are further configured to evaluate symmetric consistency of the first object by comparing an image of the first object with a transformed version of the same image.

13. The system of claim 9, wherein the one or more neural networks are further trained to infer synthetic viewpoints of the first object and the second object using a generative adversarial network (GAN).

14. The system of claim 13, wherein the training comprises generating synthetic images of the first object in different orientations and comparing a predicted viewpoint of the synthetic images with the predicted viewpoint of the original image.

15. A method, comprising: identifying, by one or more neural networks, a viewpoint of a first object in an image, the one or more neural networks trained at least by: generating a loss value based, at least in part, on a feature similarity between a characteristic of the first object and a corresponding characteristic of a second object; and updating the one or more neural networks according to the loss value.

16. The method of claim 15, wherein the one or more neural networks are further trained using an unlabeled training dataset comprising a collection of images of objects of the same category as the first object.

17. The method of claim 15, wherein ground truth annotations are unavailable in at least a portion of the training dataset.

18. The method of claim 15, wherein the characteristic of the first object is evaluated using symmetric consistency between the image of the first object and a transformed version of the image.

19. The method of claim 15, wherein the training includes using a generator to create synthetic images of objects using a plurality of viewpoints, and wherein the synthetic images are evaluated to compute a viewpoint consistency loss.

20. The method of claim 15, wherein the training includes constructing a graph of feature similarities across the training dataset and computing nearest neighbor and farthest neighbor losses based on object viewpoints.

21. The method of claim 15, wherein the one or more neural networks are further trained to infer synthetic viewpoints of the first object and the second object using a generative adversarial network (GAN).
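
Illustrative sketch (not part of the patent text). Claims 5 and 7 mention symmetric consistency between an image and its flipped copy, and a viewpoint encoded as azimuth, elevation, and tilt. The Python below shows one way such a consistency penalty could be computed. Everything specific in it is an assumption made for illustration: the mirror relation (a horizontal flip negates azimuth and tilt and leaves elevation unchanged), the squared-error form of the penalty, and all names (flip_horizontal, angle_diff, symmetric_consistency_loss, toy_predictor, biased_predictor). The toy predictors stand in for a neural network and are not taken from the disclosure.

```python
import math
from typing import Callable, List, Tuple

Image = List[List[float]]                # grayscale image as rows of pixel values
Viewpoint = Tuple[float, float, float]   # (azimuth, elevation, tilt) in radians


def flip_horizontal(image: Image) -> Image:
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def angle_diff(a: float, b: float) -> float:
    """Smallest signed difference between two angles, wrapped to [-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))


def symmetric_consistency_loss(
    predict: Callable[[Image], Viewpoint], image: Image
) -> float:
    """Penalize disagreement between the flipped-image prediction and the
    mirrored original prediction (assumed relation: azimuth and tilt negate,
    elevation is unchanged)."""
    az, el, tilt = predict(image)
    az_f, el_f, tilt_f = predict(flip_horizontal(image))
    residuals = (
        angle_diff(az_f, -az),
        angle_diff(el_f, el),
        angle_diff(tilt_f, -tilt),
    )
    return sum(r * r for r in residuals)


def _horizontal_offset(image: Image) -> float:
    """Intensity-weighted column centroid, scaled to [-1, 1] around the center."""
    width = len(image[0])
    total = sum(sum(row) for row in image)
    if total == 0 or width < 2:
        return 0.0
    weighted = sum(col * value for row in image for col, value in enumerate(row))
    center = (width - 1) / 2
    return (weighted / total - center) / center


def toy_predictor(image: Image) -> Viewpoint:
    """Hypothetical stand-in for a network: azimuth follows the centroid offset."""
    return (_horizontal_offset(image) * math.pi / 2, 0.1, 0.0)


def biased_predictor(image: Image) -> Viewpoint:
    """Same as toy_predictor but with a constant azimuth offset, which breaks
    the assumed mirror relation."""
    az, el, tilt = toy_predictor(image)
    return (az + 0.2, el, tilt)


example_image: Image = [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]]

if __name__ == "__main__":
    print(symmetric_consistency_loss(toy_predictor, example_image))
    print(symmetric_consistency_loss(biased_predictor, example_image))
```

In this sketch, a predictor that respects the assumed mirror relation gets a penalty close to zero, while the predictor with a constant azimuth offset is penalized. A term of this kind needs no orientation labels, which is consistent with the claims' emphasis on training from characteristics other than orientation.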

Assignees

Nvidia Corp

Inventors

Classifications

  • exterior to a vehicle by using sensors mounted on the vehicle · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • using neural networks · CPC title

  • based on feedback from supervisors · CPC title

  • using classification, e.g. of video objects · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025384647A1 cover?
Apparatuses, systems, and techniques to identify orientations of objects within images. In at least one embodiment, one or more neural networks are trained to identify orientations of one or more objects based, at least in part, on one or more characteristics of the object other than the object's orientation.
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06V10/242. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 18, 2025 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).