Classifying digital images in few-shot tasks based on neural networks trained using manifold mixup regularization and self-supervision
US-2021124993-A1 · Apr 29, 2021 · US
US12266144B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12266144-B2 |
| Application number | US-201916690015-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 20, 2019 |
| Priority date | Nov 20, 2019 |
| Publication date | Apr 1, 2025 |
| Grant date | Apr 1, 2025 |
Official abstract text for this publication.
Apparatuses, systems, and techniques to identify orientations of objects within images. In at least one embodiment, one or more neural networks are trained to identify an orientation of one or more objects based, at least in part, on one or more characteristics of the object other than the object's orientation.
Opening claim text (preview).
What is claimed is:

1. A processor, comprising: one or more circuits to help train one or more neural networks to identify an orientation of an object within an image based, at least in part, on one or more labels indicating one or more characteristics of the object other than the object's orientation.
2. The processor of claim 1, wherein the one or more circuits are to help train the one or more neural networks on a collection of images of a same category as the image.
3. The processor of claim 2, wherein ground truth annotations are unavailable in at least a portion of the collection of images.
4. The processor of claim 1, wherein the one or more characteristics of the object include symmetric consistency between the image of the object and a flipped image of the object.
5. The processor of claim 1, wherein the one or more circuits are to help train the one or more neural networks to generate a second image of the object having a second orientation.
6. The processor of claim 1, wherein the object's orientation is encoded on a set of parameters comprising an azimuth parameter, an elevation parameter, and a tilt parameter.
7. A system, comprising: one or more processors to calculate parameters to help train one or more neural networks to identify an orientation of an object within an image based, at least in part, on one or more labels indicating one or more characteristics of the object other than the object's orientation; and one or more memories to store the parameters.
8. The system of claim 7, wherein the one or more processors to calculate the parameters to help train the one or more neural networks are to help train the one or more neural networks on a collection of images of different objects of a same category as the object.
9. The system of claim 8, wherein the one or more processors are to train the one or more neural networks by at least: obtaining an input image; using a discriminator to determine at least a predicted viewpoint and a predicted set of appearance parameters; using a generator to create a synthetic image based, at least in part, on the predicted viewpoint and the predicted set of appearance parameters; and computing a viewpoint consistency loss based, at least in part, on the input image and the synthetic image.
10. The system of claim 9, wherein the input image is a real image.
11. The system of claim 8, wherein the one or more processors are to train the one or more neural networks by at least: obtaining a first viewpoint and a first set of appearance parameters; using a generator to create a synthetic image based, at least in part, on the first viewpoint and the first set of appearance parameters; using a discriminator to predict, based, at least in part, on the synthetic image, a second viewpoint and a second set of appearance parameters; computing a viewpoint consistency loss based, at least in part, on the first viewpoint and the second viewpoint; and computing a reconstruction loss based, at least in part, on the image and the synthetic image.
12. The system of claim 8, wherein the one or more processors are to train the one or more neural networks by at least: using a generator to create a first synthetic image based, at least in part, on a first viewpoint and a set of appearance parameters; performing a transform on the first viewpoint to obtain a second viewpoint; using the generator to create a second synthetic image based, at least in part, on the second viewpoint and the set of appearance parameters; and computing a symmetry loss based, at least in part, on the first synthetic image and the second synthetic image.
13. The system of claim 12, wherein the transform flips the first viewpoint horizontally to obtain the second viewpoint.
14. A method, comprising: training one or more neural networks to identify an orientation of an object within an image based, at least in part, on one or more labels indicating one or more characteristics of the object other than the object's orientation.
15. The method of claim 14, wherein training the one or more neural networks comprises training the one or more neural networks in a self-supervised manner on a collection of images of different objects of a same category as the object within the image.
16. The method of claim 15, wherein training the one or more neural networks in the self-supervised manner comprises using a set of loss functions to evaluate the one or more characteristics of the object other than the object's orientation.
17. The method of claim 15, wherein the object is of a first category and the method further comprises training the one or more neural networks to identify a second orientation of a second object using a second collection of images, wherein: the second object is of a second category different from the first category; and the second collection of images is of objects of the second category different from the second object.
18. The method of claim 15, wherein training the one or more neural networks in the self-supervised manner comprises training the one or more neural network to at least: obtain an input image; use a discriminator to predict, from the input image, a viewpoint and a set of parameters; use a generator to create a synthetic image based, at least in part, on the viewpoint and the set of parameters; and compute one or more gradients and update parameters of the discriminator based, at least in part, on the synthetic image.
19. The method of claim 18, wherein the generator is a deep generative model.
20. The method of claim 19, wherein the deep generative model is a renderer, variational autoencoder, or generative adversarial network (GAN).
21. The method of claim 14, wherein the object is a vehicle.
22. A processor, comprising: one or more circuits to identify one or more orientations of an object within an image based, at least in part, on one or more labels indicating one or more characteristics of the object other than the object's one or more orientations.
23. The processor of claim 22, wherein the one or more circuits are to train one or more neural networks to identify the one or more orientations of the object within the image.
24. The processor of claim 23, wherein the one or more neural networks are trained on a collection of images of different objects of a same category as the object.
25. The processor of claim 24, wherein ground truth annotations are unavailable in the collection of images.
26. The processor of claim 22, wherein the one or more characteristics of the object includes symmetric consistency between the image of the object and a flipped image of the object.
27. The processor of claim 22, wherein the object's one or more orientations are encoded on a set of parameters comprising an azimuth parameter, an elevation parameter, and a tilt parameter.
28. A system, comprising: one or more memories; and one or more processors to identify one or more orientations of an object within an image based, at least in part, on one or more labels indicating one or more characteristics of the object other than the object's one or more orientations.
29. The system of claim 28, wherein the one or more processors are to train one or more neural networks to identify the one or more orientations of the object within the image based, at least in part, on the one or more
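
The loss terms named in claims 11 to 13 can be illustrated with a small, hedged sketch. It is not the patented implementation: the claims do not give formulas, so the squared wrapped-angle distance, the mean absolute pixel difference, the flip convention (negate azimuth and tilt, keep elevation), and the function `toy_generator` are all assumptions introduced here for illustration. The viewpoint is the (azimuth, elevation, tilt) triple of claim 6, in radians.

```python
import math

def wrap_angle(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi

def flip_viewpoint(view):
    """Horizontal flip of a viewpoint (claim 13).
    Assumed convention: azimuth and tilt change sign, elevation is unchanged."""
    azimuth, elevation, tilt = view
    return (wrap_angle(-azimuth), elevation, wrap_angle(-tilt))

def viewpoint_consistency_loss(first_view, second_view):
    """Assumed form of the consistency loss of claim 11: mean squared
    wrapped angular difference between the sampled and predicted viewpoints."""
    return sum(wrap_angle(a - b) ** 2 for a, b in zip(first_view, second_view)) / 3

def flip_image(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]

def symmetry_loss(first_image, second_image):
    """Assumed form of the symmetry loss of claim 12: mean absolute difference
    between the mirrored first image and the second image."""
    mirrored = flip_image(first_image)
    diffs = [abs(p - q)
             for row_a, row_b in zip(mirrored, second_image)
             for p, q in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

def toy_generator(view, appearance):
    """Hypothetical stand-in for the generator: a one-row image whose bright
    spot moves sideways with azimuth. A real generator would be a renderer,
    VAE, or GAN (claim 20)."""
    azimuth, _elevation, _tilt = view
    center = 2 * math.sin(azimuth)
    return [[appearance * math.exp(-(x - center) ** 2) for x in range(-2, 3)]]

if __name__ == "__main__":
    first_view = (math.pi / 4, 0.2, 0.1)
    second_view = flip_viewpoint(first_view)
    first_image = toy_generator(first_view, appearance=1.0)
    second_image = toy_generator(second_view, appearance=1.0)
    print("symmetry loss:", symmetry_loss(first_image, second_image))
    print("consistency loss:", viewpoint_consistency_loss(first_view, first_view))
```

Because the toy generator is mirror-symmetric by construction, the symmetry loss is numerically zero here; with a learned generator the loss would instead act as a training signal that penalizes asymmetric outputs.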
Adversarial learning · CPC title
Generative networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
exterior to a vehicle by using sensors mounted on the vehicle · CPC title