Scalable three-dimensional object recognition in a cross reality system

US11257300B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11257300-B2
Application number	US-202016899878-A
Country	US
Kind code	B2
Filing date	Jun 12, 2020
Priority date	Jun 14, 2019
Publication date	Feb 22, 2022
Grant date	Feb 22, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for scalable three-dimensional (3-D) object recognition in a cross reality system. One of the methods includes maintaining object data specifying objects that have been recognized in a scene. A stream of input images of the scene is received, including a stream of color images and a stream of depth images. A color image is provided as input to an object recognition system. A recognition output that identifies a respective object mask for each object in the color image is received. A synchronization system determines a corresponding depth image for the color image. A 3-D bounding box generation system determines a respective 3-D bounding box for each object that has been recognized in the color image. Data specifying one or more 3-D bounding boxes is received as output from the 3-D bounding box generation system.
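As a rough illustration of the last step in the abstract, the sketch below shows one simple way a 3-D bounding box could be derived from a 2-D object mask and its corresponding depth image. It is not the patented method: it assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, produces an axis-aligned box in the camera frame, works on a single frame, and ignores the maintained object data that the claimed system also uses. The function name `mask_to_aabb` is invented for this example.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float, float]
Box = Tuple[Point, Point]  # (min corner, max corner)


def mask_to_aabb(
    mask: Sequence[Sequence[bool]],
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Optional[Box]:
    """Back-project masked pixels to 3-D and return their axis-aligned box.

    Assumptions (not from the patent): pinhole camera model, depth in the
    same units as the output, and depth <= 0 meaning "no measurement".
    """
    points = []
    for v, (mask_row, depth_row) in enumerate(zip(mask, depth)):
        for u, (inside, z) in enumerate(zip(mask_row, depth_row)):
            if inside and z > 0:
                # Standard pinhole back-projection of pixel (u, v) at depth z.
                x = (u - cx) * z / fx
                y = (v - cy) * z / fy
                points.append((x, y, z))
    if not points:
        return None  # no valid depth under the mask
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# Two masked pixels in the middle row of a 3x3 image, all at depth 2.0.
example_mask = [
    [False, False, False],
    [False, True, True],
    [False, False, False],
]
example_depth = [[2.0] * 3 for _ in range(3)]
example_box = mask_to_aabb(example_mask, example_depth, 1.0, 1.0, 1.0, 1.0)
# example_box == ((0.0, 0.0, 2.0), (2.0, 0.0, 2.0))
```

A production system would likely fuse several views and refine the result, as the dependent claims describe; this single-frame box is only a starting point for intuition.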

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, the method comprising: maintaining object data specifying objects that have been recognized in a scene in an environment; receiving a stream of input images of the scene, wherein the stream of input images comprises a stream of color images and a stream of depth images; for each of a plurality of color images in the stream of color images: providing the color image as input to an object recognition system; receiving, as output from the object recognition system, a recognition output that identifies a respective object mask in the color image for each of one or more objects that have been recognized in the color image; providing the color image and a plurality of depth images in the stream of depth images as input to a synchronization system that determines a corresponding depth image for the color image based on a timestamp of the corresponding depth image and a timestamp of the color image; providing the object data, the recognition output identifying the object masks, and the corresponding depth image as input to a three-dimensional (3-D) bounding box generation system that determines, from the object data, the object masks, and the corresponding depth image, a respective 3-D bounding box for each of one or more of the objects that have been recognized in the color image; and receiving, as output from the 3-D bounding box generation system, data specifying one or more 3-D bounding boxes for one or more of the objects recognized in the color image; and providing, as output, data specifying the one or more 3-D bounding boxes.

2. The method of claim 1, wherein the 3-D bounding box generation system comprises: a multi-view fusion system that generates an initial set of 3-D object masks.

3. The method of claim 2, wherein the object recognition system, the synchronization system, the multi-view fusion system operate in a stateless manner and independently from one another.

4. The method of claim 2, wherein the multi-view fusion system comprises: an association system that identifies, from the maintained object data, matched object data specifying a corresponding object with the respective object mask of each recognized object in the color image; and a fusion system that generates, for each recognized object in the color image, an initial 3-D object mask by combining the object mask in the color image with the matched object data.

5. The method of claim 2, wherein the 3-D bounding box generation system further comprises an object refinement system that refines the initial set of 3-D object masks to generate an initial set of 3-D bounding boxes.

6. The method of claim 2, wherein the 3-D bounding box generation system further comprises a bounding box refinement system that refines the initial set of 3-D bounding boxes to generate the one or more 3-D bounding boxes.

7. The method of claim 1, wherein the object recognition system comprises a trained deep neural network (DNN) model that takes the color image as input and generates a respective two-dimensional (2-D) object mask for each of the one or more objects that have been recognized in the color image.

8. The method of claim 1, wherein determining, by the synchronization system, a corresponding depth image for the color image based on timestamps of the corresponding depth images and timestamp of the color image comprises: identifies a candidate depth image which has a closest timestamp to the timestamp of the color image; determining that a time difference between the candidate depth image and the color image is less than a threshold; and in response, determining the candidate depth image as the corresponding depth image for the color image.

9. The method of claim 1, wherein the 3-D bounding box generation system determines, from the object masks and the corresponding depth image, a respective 3-D object mask for each of the one or more of the objects that have been recognized in the color image, and wherein the method further comprises: receiving, as output from the 3-D bounding box generation system, data specifying one or more 3-D object masks for the one or more of the objects recognized in the color image; and providing, as output, data specifying the one or more 3-D object masks.

10. A system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations comprising: maintaining object data specifying objects that have been recognized in a scene in an environment; receiving a stream of input images of the scene, wherein the stream of input images comprises a stream of color images and a stream of depth images; for each of a plurality of color images in the stream of color images: providing the color image as input to an object recognition system; receiving, as output from the object recognition system, a recognition output that identifies a respective object mask in the color image for each of one or more objects that have been recognized in the color image; providing the color image and a plurality of depth images in the stream of depth images as input to a synchronization system that determines a corresponding depth image for the color image based on a timestamp of the corresponding depth image and a timestamp of the color image; providing the object data, the recognition output identifying the object masks, and the corresponding depth image as input to a three-dimensional (3-D) bounding box generation system that determines, from the object data, the object masks, and the corresponding depth image, a respective 3-D bounding box for each of one or more of the objects that have been recognized in the color image; and receiving, as output from the 3-D bounding box generation system, data specifying one or more 3-D bounding boxes for one or more of the objects recognized in the color image; and providing, as output, data specifying the one or more 3-D bounding boxes.

11. The system of claim 10, wherein the 3-D bounding box generation system comprises a multi-view fusion system that generates an initial set of 3-D object masks, wherein the object recognition system, the synchronization system, the multi-view fusion system operate in a stateless manner and independently from one another.

12. The system of claim 11, wherein the multi-view fusion system comprises: an association system that identifies, from the maintained object data, matched object data specifying a corresponding object with the respective object mask of each recognized object in the color image; and a fusion system that generates, for each recognized object in the color image, an initial 3-D object mask by combining the object mask in the color image with the matched object data.

13. The system of claim 11, wherein the 3-D bounding box generation system further comprises an object refinement system that refines the initial set of 3-D object masks to generate an initial set of 3-D bounding boxes.

14. The system of claim 11, wherein the 3-D bounding box generation system further comprises a bounding box refinement system that refines the initial set of 3-D bounding boxes to generate the one or more 3-D bounding boxes.

15. The system of claim 10, wherein the object recognition system comprises a trained deep neural network (DNN) model that takes the color image as input and generates a respective two-dimensional (2-D) object mask for each of the one or more objects that have been recognized in the color image.

16. The system of claim 10, wherein determining, by t
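The timestamp matching recited in claim 8 can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the patent's implementation: timestamps are taken to be integer milliseconds, the function name `find_corresponding_depth` and the parameter `max_diff_ms` are hypothetical, and returning `None` when no depth image is close enough is a choice made for this sketch (the claim only covers the case where the difference is below the threshold).

```python
from typing import Optional, Sequence


def find_corresponding_depth(
    color_ts_ms: int,
    depth_ts_ms: Sequence[int],
    max_diff_ms: int,
) -> Optional[int]:
    """Return the index of the depth image matched to a color image.

    Follows the three steps of claim 8: pick the candidate with the closest
    timestamp, check that the time difference is less than a threshold, and
    if so accept the candidate. Returns None otherwise (an assumption).
    """
    if not depth_ts_ms:
        return None
    # Step 1: candidate depth image with the closest timestamp.
    candidate = min(
        range(len(depth_ts_ms)),
        key=lambda i: abs(depth_ts_ms[i] - color_ts_ms),
    )
    # Step 2: time difference must be less than the threshold.
    if abs(depth_ts_ms[candidate] - color_ts_ms) < max_diff_ms:
        # Step 3: accept the candidate as the corresponding depth image.
        return candidate
    return None


depth_stream = [900, 1020, 1200]  # hypothetical depth timestamps in ms
matched = find_corresponding_depth(1000, depth_stream, max_diff_ms=50)
unmatched = find_corresponding_depth(1100, depth_stream, max_diff_ms=50)
# matched == 1 (20 ms apart); unmatched is None (closest is 80 ms apart)
```

A linear scan keeps the sketch short; with a long, time-ordered depth stream a binary search (for example with Python's `bisect` module) would find the closest timestamp more efficiently.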

Assignees

Magic Leap Inc

Inventors

Classifications

G06T7/11 · Primary
Region-based segmentation · CPC title
G06V10/764
using classification, e.g. of video objects · CPC title
G06T19/20 · Primary
Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts · CPC title
G06V20/00
Scenes; Scene-specific elements (control of digital cameras H04N23/60) · CPC title
G06T2207/10024
Color image · CPC title

Patent family

Related publications grouped by family.

View patent family 73745180

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11257300B2 cover?: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for scalable three-dimensional (3-D) object recognition in a cross reality system. One of the methods includes maintaining object data specifying objects that have been recognized in a scene. A stream of input images of the scene is received, including a stream of color images and a stream of dept…
Who is the assignee on this patent?: Magic Leap Inc
What technology area does this patent fall under?: Primary CPC classification G06T7/11. Mapped technology areas include Physics.
When was this patent published?: Publication date Feb 22, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

3-dimensional scene analysis for augmented reality operations

Recognition-based object segmentation of a 3-dimensional image

Context-based priors for object detection in images

Augmenting layer-based object detection with deep convolutional neural networks

3d object segmentation

Vision-based multi-camera factory monitoring with dynamic integrity scoring

Estimation of object properties in 3D world
