Training and inferencing using a neural network to predict orientations of objects in images

US12266144B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12266144-B2
Application number: US-201916690015-A
Country: US
Kind code: B2
Filing date: Nov 20, 2019
Priority date: Nov 20, 2019
Publication date: Apr 1, 2025
Grant date: Apr 1, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Apparatuses, systems, and techniques to identify orientations of objects within images. In at least one embodiment, one or more neural networks are trained to identify an orientation of one or more objects based, at least in part, on one or more characteristics of the object other than the object's orientation.

First claim

Opening claim text (preview).

What is claimed is:

1. A processor, comprising: one or more circuits to help train one or more neural networks to identify an orientation of an object within an image based, at least in part, on one or more labels indicating one or more characteristics of the object other than the object's orientation.
2. The processor of claim 1, wherein the one or more circuits are to help train the one or more neural networks on a collection of images of a same category as the image.
3. The processor of claim 2, wherein ground truth annotations are unavailable in at least a portion of the collection of images.
4. The processor of claim 1, wherein the one or more characteristics of the object include symmetric consistency between the image of the object and a flipped image of the object.
5. The processor of claim 1, wherein the one or more circuits are to help train the one or more neural networks to generate a second image of the object having a second orientation.
6. The processor of claim 1, wherein the object's orientation is encoded on a set of parameters comprising an azimuth parameter, an elevation parameter, and a tilt parameter.
7. A system, comprising: one or more processors to calculate parameters to help train one or more neural networks to identify an orientation of an object within an image based, at least in part, on one or more labels indicating one or more characteristics of the object other than the object's orientation; and one or more memories to store the parameters.
8. The system of claim 7, wherein the one or more processors to calculate the parameters to help train the one or more neural networks are to help train the one or more neural networks on a collection of images of different objects of a same category as the object.
9. The system of claim 8, wherein the one or more processors are to train the one or more neural networks by at least: obtaining an input image; using a discriminator to determine at least a predicted viewpoint and a predicted set of appearance parameters; using a generator to create a synthetic image based, at least in part, on the predicted viewpoint and the predicted set of appearance parameters; and computing a viewpoint consistency loss based, at least in part, on the input image and the synthetic image.
10. The system of claim 9, wherein the input image is a real image.
11. The system of claim 8, wherein the one or more processors are to train the one or more neural networks by at least: obtaining a first viewpoint and a first set of appearance parameters; using a generator to create a synthetic image based, at least in part, on the first viewpoint and the first set of appearance parameters; using a discriminator to predict, based, at least in part, on the synthetic image, a second viewpoint and a second set of appearance parameters; computing a viewpoint consistency loss based, at least in part, on the first viewpoint and the second viewpoint; and computing a reconstruction loss based, at least in part, on the image and the synthetic image.
12. The system of claim 8, wherein the one or more processors are to train the one or more neural networks by at least: using a generator to create a first synthetic image based, at least in part, on a first viewpoint and a set of appearance parameters; performing a transform on the first viewpoint to obtain a second viewpoint; using the generator to create a second synthetic image based, at least in part, on the second viewpoint and the set of appearance parameters; and computing a symmetry loss based, at least in part, on the first synthetic image and the second synthetic image.
13. The system of claim 12, wherein the transform flips the first viewpoint horizontally to obtain the second viewpoint.
14. A method, comprising: training one or more neural networks to identify an orientation of an object within an image based, at least in part, on one or more labels indicating one or more characteristics of the object other than the object's orientation.
15. The method of claim 14, wherein training the one or more neural networks comprises training the one or more neural networks in a self-supervised manner on a collection of images of different objects of a same category as the object within the image.
16. The method of claim 15, wherein training the one or more neural networks in the self-supervised manner comprises using a set of loss functions to evaluate the one or more characteristics of the object other than the object's orientation.
17. The method of claim 15, wherein the object is of a first category and the method further comprises training the one or more neural networks to identify a second orientation of a second object using a second collection of images, wherein: the second object is of a second category different from the first category; and the second collection of images is of objects of the second category different from the second object.
18. The method of claim 15, wherein training the one or more neural networks in the self-supervised manner comprises training the one or more neural network to at least: obtain an input image; use a discriminator to predict, from the input image, a viewpoint and a set of parameters; use a generator to create a synthetic image based, at least in part, on the viewpoint and the set of parameters; and compute one or more gradients and update parameters of the discriminator based, at least in part, on the synthetic image.
19. The method of claim 18, wherein the generator is a deep generative model.
20. The method of claim 19, wherein the deep generative model is a renderer, variational autoencoder, or generative adversarial network (GAN).
21. The method of claim 14, wherein the object is a vehicle.
22. A processor, comprising: one or more circuits to identify one or more orientations of an object within an image based, at least in part, on one or more labels indicating one or more characteristics of the object other than the object's one or more orientations.
23. The processor of claim 22, wherein the one or more circuits are to train one or more neural networks to identify the one or more orientations of the object within the image.
24. The processor of claim 23, wherein the one or more neural networks are trained on a collection of images of different objects of a same category as the object.
25. The processor of claim 24, wherein ground truth annotations are unavailable in the collection of images.
26. The processor of claim 22, wherein the one or more characteristics of the object includes symmetric consistency between the image of the object and a flipped image of the object.
27. The processor of claim 22, wherein the object's one or more orientations are encoded on a set of parameters comprising an azimuth parameter, an elevation parameter, and a tilt parameter.
28. A system, comprising: one or more memories; and one or more processors to identify one or more orientations of an object within an image based, at least in part, on one or more labels indicating one or more characteristics of the object other than the object's one or more orientations.
29. The system of claim 28, wherein the one or more processors are to train one or more neural networks to identify the one or more orientations of the object within the image based, at least in part, on the one or more
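
Illustrative sketch (not part of the patent text). The claims name an azimuth/elevation/tilt encoding of orientation, a horizontal flip of the viewpoint, a viewpoint consistency loss, and a symmetry loss, but the preview does not give formulas for any of them. The Python below is one possible reading under stated assumptions: the flip negates azimuth and tilt and keeps elevation, the consistency loss is the mean of 1 - cos(angle difference), and the symmetry loss is a mean absolute pixel difference against a mirrored image. All names (Viewpoint, flip_viewpoint, viewpoint_consistency_loss, flip_image, symmetry_loss) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewpoint:
    """Orientation encoded as three angles in radians."""
    azimuth: float
    elevation: float
    tilt: float


def wrap_angle(angle: float) -> float:
    """Map an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def flip_viewpoint(v: Viewpoint) -> Viewpoint:
    """Assumed horizontal-flip transform: negate azimuth and tilt, keep elevation."""
    return Viewpoint(wrap_angle(-v.azimuth), v.elevation, wrap_angle(-v.tilt))


def viewpoint_consistency_loss(a: Viewpoint, b: Viewpoint) -> float:
    """Assumed loss: mean of 1 - cos(difference) over the three angles.

    It is zero for identical viewpoints and insensitive to 2*pi wraparound.
    """
    diffs = (a.azimuth - b.azimuth, a.elevation - b.elevation, a.tilt - b.tilt)
    return sum(1.0 - math.cos(d) for d in diffs) / 3.0


def flip_image(image):
    """Mirror a row-major image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def symmetry_loss(first, second):
    """Assumed loss: mean absolute difference between the mirrored
    first image and the second image (both the same size)."""
    mirrored = flip_image(first)
    total = sum(abs(p - q)
                for row_m, row_s in zip(mirrored, second)
                for p, q in zip(row_m, row_s))
    count = sum(len(row) for row in second)
    return total / count


# Usage: a viewpoint compared with itself, and an image with its exact mirror.
v = Viewpoint(azimuth=0.5, elevation=0.2, tilt=-0.1)
print(viewpoint_consistency_loss(v, v))                  # 0.0
print(symmetry_loss([[1, 2], [3, 4]], [[2, 1], [4, 3]]))  # 0.0
```

In a training loop of the kind recited in claims 9 to 13, the images passed to symmetry_loss would come from a generator called at a viewpoint and at its flipped counterpart, and the viewpoints passed to viewpoint_consistency_loss would be a sampled viewpoint and a discriminator's prediction. Both networks are omitted here because the preview does not describe their architectures.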

Assignees

Nvidia Corp

Inventors

Classifications

  • Adversarial learning · CPC title

  • Generative networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title

  • exterior to a vehicle by using sensors mounted on the vehicle · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12266144B2 cover?
Apparatuses, systems, and techniques to identify orientations of objects within images. In at least one embodiment, one or more neural networks are trained to identify an orientation of one or more objects based, at least in part, on one or more characteristics of the object other than the object's orientation.
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06V10/242. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 1, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).