What technology area does this patent fall under?

Primary CPC classification G06V10/82. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 30, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Object recognition neural network training using multiple data sources

Patent metadata
Field	Value
Publication number	US-12430903-B2
Application number	US-202118007288-A
Country	US
Kind code	B2
Filing date	Jul 28, 2021
Priority date	Jul 29, 2020
Publication date	Sep 30, 2025
Grant date	Sep 30, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an object recognition neural network using multiple data sources. One of the methods includes receiving training data that includes a plurality of training images from a first source and images from a second source. A set of training images are obtained from the training data. For each training image in the set of training images, contrast equalization is applied to the training image to generate a modified image. The modified image is processed using the neural network to generate an object recognition output for the modified image. A loss is determined based on errors between, for each training image in the set, the object recognition output for the modified image generated from the training image and ground-truth annotation for the training image. Parameters of the neural network are updated based on the determined loss.
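The abstract does not say which contrast equalization technique is applied before an image reaches the network. As an illustration only, the sketch below assumes plain global histogram equalization on an 8-bit grayscale image stored as rows of integers. The function name `equalize_contrast` and the image layout are hypothetical and are not taken from the patent.

```python
from typing import List


def equalize_contrast(image: List[List[int]], levels: int = 256) -> List[List[int]]:
    """Histogram-equalize a grayscale image given as rows of ints in [0, levels).

    Assumption: global histogram equalization stands in for the patent's
    unspecified "contrast equalization" step.
    """
    # Count how often each intensity occurs.
    hist = [0] * levels
    for row in image:
        for px in row:
            hist[px] += 1

    # Build the cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next((c for c in cdf if c > 0), 0)
    if total == cdf_min:
        # Empty or constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Map intensities so the output distribution is roughly uniform.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[px] for px in row] for row in image]


# A low-contrast 2x2 image is stretched over the full 0..255 range.
print(equalize_contrast([[10, 20], [30, 40]]))  # [[0, 85], [170, 255]]
```

Applying the same normalization to every training image, whichever source it comes from, is one plausible way to reduce appearance differences between the two sources before the images are passed to the network.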

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method for neural network training, comprising: receiving training data that comprises a plurality of training images and, for each image, a respective ground-truth annotation, the plurality of training images comprising images from a first source and images from a second source; obtaining a set of training images from the training data; for each training image in the set of training images: applying contrast equalization to the training image to generate a modified image; and processing the modified image using a neural network to generate an object recognition output for the modified image; determining, as a determined loss, a loss based on errors between, for each training image in the set of training images, the object recognition output for the modified image generated from the training image and the respective ground-truth annotation for the training image; and updating parameters of the neural network based on the determined loss.
2. The computer-implemented method of claim 1, wherein obtaining a set of training images from the training data, comprises: sampling an initial set of images from the training data; and generating the set of training images by discarding one or more images from the initial set of images.
3. The computer-implemented method of claim 2, wherein generating the set of training images comprises: determining that the one or more images in the initial set of images have motion blur; and in response, discarding the one or more images that have motion blur.
4. The computer-implemented method of claim 2, wherein generating the set of training images comprises: determining, from respective ground-truth annotations for the training images in the initial set of images, that one or more of the images in the initial set of images depict objects that do not belong to a relevant object category; and in response, discarding the one or more images that depict objects that do not belong to a relevant object category.
5. The computer-implemented method of claim 2, wherein generating the set of training images comprises: determining that one or more of the images in the initial set of images depict an object that is truncated or occluded; and in response, discarding the one or more images that depict an object that is truncated or occluded.
6. The computer-implemented method of claim 5, wherein determining that one or more of the images in the set of training images depict an object that is truncated or occluded comprises: obtaining, from respective ground-truth annotations for the training images in the initial set of images, truncation scores or occlusion scores previously computed based on the respective ground-truth annotations, and wherein computing the truncation scores or occlusion scores comprising: obtaining, from the respective ground-truth annotations, a three-dimensional (3-D) bounding box and a two-dimensional (2-D) bounding box for an object in a training image from the initial set of images; generating a projected 2-D bounding box by projecting the 3-D bounding box to the training image; and computing a truncation score or an occlusion score using an overlap between the projected 2-D bounding box and the 2-D bounding box from the respective ground-truth annotations; and determining, based on the truncation scores or occlusion scores, that one or more of the images in the initial set of images depict an object that is truncated or occluded.
7. The computer-implemented method of claim 1, wherein determining the loss comprises: for each training image in the set of training images: determining a count of images from the set of training images that have a same ground-truth annotation as the training image; determining, based on the count of images, a weight for the training image; and generating, from an error between the object recognition output for the modified image generated from the training image and the respective ground-truth annotation for the training image, a weighted error based on the weight for the training image.
8. The computer-implemented method of claim 7, comprising: determining the loss based on weighted errors for training images in the set of training images.
9. The computer-implemented method of claim 7, wherein the respective ground-truth annotation for the training image depicts an object that belongs to a k-th object category among K object categories, and wherein the weight $w_k$ for the training image is $w_k = 1 + 2\left(1 - \frac{c_k}{c_{\max}}\right)$, where $c_k$ is the count of images from the set of training images that has the same ground-truth annotation as the training image, and $c_{\max}$ is a maximum value of all values among counts of images $c_i$, $i = 1, \ldots, K$.
10. The computer-implemented method of claim 1, wherein the first source is a set of real-world images and the second source is a set of synthetic images.
11. The computer-implemented method of claim 1, wherein the object recognition output comprises: a bounding box, and a localization score that is a prediction of an intersection-over-union overlap between the bounding box and a ground-truth bounding box.
12. The computer-implemented method of claim 1, wherein the object recognition output comprises: an instance mask, and a mask score that is a prediction of an intersection-over-union overlap between the instance mask and a ground-truth instance mask.
13. A non-transitory, computer-readable medium storing one or more instructions executable by a computer system to perform operations comprising: receiving training data that comprises a plurality of training images and, for each image, a respective ground-truth annotation, the plurality of training images comprising images from a first source and images from a second source; obtaining a set of training images from the training data; for each training image in the set of training images: applying contrast equalization to the training image to generate a modified image; and processing the modified image using a neural network to generate an object recognition output for the modified image; determining, as a determined loss, a loss based on errors between, for each training image in the set of training images, the object recognition output for the modified image generated from the training image and the respective ground-truth annotation for the training image; and updating parameters of the neural network based on the determined loss.
14. The non-transitory, computer-readable medium of claim 13, wherein obtaining a set of training images from the training data, comprises: sampling an initial set of images from the training data; and generating the set of training images by discarding one or more training images from t
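Two of the dependent claims above describe small computations that are easy to sketch: the per-category weight of claim 9 (1 + 2 * (1 - c_k / c_max)) and the box overlap used in claim 6. The code below is a rough illustration under stated assumptions, not the patented implementation. The function names, the `(x_min, y_min, x_max, y_max)` box layout, the use of a mean to combine weighted errors, and the choice of intersection-over-union as the overlap measure are all hypothetical; the claims only say that the loss is "based on" weighted errors and that the score uses "an overlap".

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def category_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight per category following claim 9: 1 + 2 * (1 - c_k / c_max)."""
    counts = Counter(labels)
    if not counts:
        return {}
    c_max = max(counts.values())
    # The most frequent category gets weight 1; rarer ones approach 3.
    return {k: 1.0 + 2.0 * (1.0 - c / c_max) for k, c in counts.items()}


def weighted_loss(errors: Sequence[float], labels: Sequence[Hashable]) -> float:
    """Mean of per-image errors scaled by category weight (the mean is an assumption)."""
    if len(errors) != len(labels):
        raise ValueError("errors and labels must have the same length")
    if not errors:
        return 0.0
    weights = category_weights(labels)
    return sum(weights[lab] * err for err, lab in zip(errors, labels)) / len(errors)


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes, one possible overlap measure."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


labels = ["chair", "chair", "chair", "chair", "lamp"]
print(category_weights(labels))  # {'chair': 1.0, 'lamp': 2.5}
```

With four chair images and one lamp image, the lamp image counts 2.5 times as much as each chair image, so under-represented categories contribute more to the loss. How a truncation or occlusion score is derived from the overlap, and what threshold leads to discarding an image, is not stated in the claim text shown here.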

Assignees

Magic Leap Inc

Inventors

Classifications

G06V10/764
using classification, e.g. of video objects · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/09
Supervised learning · CPC title
G06N3/045
Combinations of networks · CPC title
G06V10/774
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

Patent family

Related publications grouped by family.

View patent family 80036168

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12430903B2 cover?: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an object recognition neural network using multiple data sources. One of the methods includes receiving training data that includes a plurality of training images from a first source and images from a second source. A set of training images are obtained from the training data. For eac…
Who is the assignee on this patent?: Magic Leap Inc