Who is the assignee on this patent?

Corral Soto Eduardo R, Nezhadarya Ehsan, Liu Bingbing, and 1 more

What technology area does this patent fall under?

Primary CPC classification G06V20/64. Mapped technology areas include Physics.

When was this patent published?

Publication date: Feb 9, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and system for converting point cloud data for use with 2D convolutional neural networks

US10915793B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10915793-B2
Application number	US-201816184570-A
Country	US
Kind code	B2
Filing date	Nov 8, 2018
Priority date	Nov 8, 2018
Publication date	Feb 9, 2021
Grant date	Feb 9, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for encoding 3D data for use with 2D convolutional neural networks (CNNs) are described. A set of 3D data is encoded into a set of one or more arrays. A 2D index of the arrays is calculated by projecting 3D coordinates of the 3D point onto a 2D image plane that is defined by a set of defined virtual camera parameters. The virtual camera parameters include a camera projection matrix defining the 2D image plane. Each 3D coordinate of the point is stored in the arrays at the calculated 2D index. The set of encoded arrays is provided for input to a 2D CNN, for training or inference.
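
The encoding described in the abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation. The function names (`project_point`, `encode_point_cloud`), the rounding of projected coordinates to the nearest pixel, the rule of keeping the nearest point when several land on the same index, and the `fill` value for empty cells are all assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]   # 3x4 camera projection matrix
Point = Tuple[float, float, float]   # (x, y, z) in the camera frame


def project_point(P: Matrix, point: Point) -> Tuple[float, float, float]:
    """Multiply a 3x4 projection matrix by the point in homogeneous form."""
    h = (point[0], point[1], point[2], 1.0)
    u, v, w = (sum(row[k] * h[k] for k in range(4)) for row in P)
    return u, v, w


def encode_point_cloud(
    points: Sequence[Point],
    P: Matrix,
    height: int,
    width: int,
    fill: float = 0.0,
) -> List[List[List[float]]]:
    """Return three height x width arrays holding x, y and z respectively."""
    arrays = [[[fill] * width for _ in range(height)] for _ in range(3)]
    # Assumption: track the depth already written at each pixel so that
    # the nearest point wins when several points share a 2D index.
    depth: List[List[Optional[float]]] = [[None] * width for _ in range(height)]

    for point in points:
        u, v, w = project_point(P, point)
        if w <= 0:
            continue  # point is behind the virtual camera
        col, row = int(round(u / w)), int(round(v / w))
        if not (0 <= row < height and 0 <= col < width):
            continue  # projects outside the 2D image plane
        seen = depth[row][col]
        if seen is not None and seen <= w:
            continue  # a nearer point already occupies this index
        depth[row][col] = w
        for channel in range(3):
            # Store each 3D coordinate in its own array at the 2D index.
            arrays[channel][row][col] = point[channel]
    return arrays
```

With a hypothetical projection matrix `P = [[1, 0, 2, 0], [0, 1, 2, 0], [0, 0, 1, 0]]` (focal length 1, principal point at (2, 2)), the point (0, 0, 5) projects to row 2, column 2, so `arrays[2][2][2]` holds its z value. The three arrays can then be stacked like the channels of an image before being passed to a 2D CNN.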

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: receiving a set of 3D data in the form of a point cloud; encoding the set of 3D data into a set of arrays, the encoding including projecting data points in the set of 3D data onto a common 2D image plane by, for each data point of the 3D data: projecting 3D coordinates of the data point onto the 2D image plane to determine a projected 2D index, the 2D image plane being defined by a set of defined virtual camera parameters, the set of virtual camera parameters including a camera projection matrix defining the 2D image plane; and storing each 3D coordinate of the data point in a respective array of the set of arrays at the projected 2D index; and outputting the set of arrays for input to a 2D convolutional neural network (CNN) configured to perform object detection on the set of arrays during training or inference of the 2D CNN.

2. The method of claim 1, further comprising: adjusting the 3D data according to one or more predefined parameters to generate a set of adjusted 3D data; wherein the adjusting is at least one of: scaling, shifting, normalizing or quantizing the 3D coordinates; and wherein the set of arrays is generated from the adjusted 3D data.

3. The method of claim 1, wherein the set of arrays contains values adjusted to be recognizable as image data.

4. The method of claim 1, wherein the virtual camera parameters include a definition of a region of interest (ROI), the method further comprising: defining a subset of the 3D data corresponding to the ROI; and encoding the set of arrays from the subset of the 3D data.

5. The method of claim 1, further comprising filling in any holes in the set of arrays using dilation.

6. The method of claim 1, further comprising: encoding the set of 3D data to a second set of arrays, the encoding including projecting data points in the set of 3D data onto a common second 2D image plane, using a second set of virtual camera parameters defining the second 2D image plane.

7. The method of claim 1, wherein the set of virtual camera parameters correspond to parameters of an optical camera.

8. The method of claim 7, further comprising: combining the set of arrays with a set of 2D image data obtained by the optical camera, to generate a set of combined data; inputting the set of combined 2D data to the 2D CNN configured to perform object detection, the 2D CNN detecting objects in the set of combined 2D and outputting a feature array associated with each detected object; and performing 2D bounding box generation on the feature array associated with each detected object to generate a 2D object bounding for each detected object and performing 2D object segmentation on the feature array associated with each detected object to generate a 2D object mask for each detected object.

9. The method of claim 8, wherein combining comprises: performing spatial registration between the set of arrays and the set of 2D image data; and concatenating the set of arrays with the set of 2D image data.

10. A method comprising: receiving a set of 3D data in the form of a point cloud; encoding the set of 3D data into a set of one or more arrays by, for each data point of the 3D data: calculating a 2D index of the one or more arrays by projecting 3D coordinates of the data point onto a 2D image plane defined by a set of defined virtual camera parameters, the set of virtual camera parameters including a camera projection matrix defining the 2D image plane; and storing each 3D coordinate of the data point in the one or more arrays at the calculated 2D index; performing object detection using the set of one or more arrays as input to a 2D convolutional neural network (CNN) configured to output a set of detected objects; performing object classification, regression, and segmentation using the set of detected objects as input to a 2D segmentation and regression unit configured to label the set of detected objects with classes of interest, output a set of 2D object bounding boxes and output a set of 2D object masks for the set of detected objects.

11. The method of claim 10, wherein a mapping index is stored to associate each calculated 2D index to the respective point in the 3D data, the method further comprising: performing a 3D semantic segmentation by associating each 2D object mask with a respective cluster of points in the 3D data, using the mapping index; and outputting a set of 3D object masks.

12. The method of claim 10, further comprising: performing a 3D regression, using a 3D regression network, that is trained to regress parameters of a 3D bounding box corresponding to subset of the data points in the set of one or more arrays, to generate output a set of 3D object bounding boxes.

13. A processing unit comprising: a processor; one or more memories coupled to the processor, the one or more memories storing instructions which when executed by the processor cause the processing unit to: receive a set of 3D data in the form of a point cloud; encode the 3D data to a set of arrays, the encoding including projecting data points in the set of 3D data onto a common 2D image plane by, for each point of the 3D data: projecting 3D coordinates of the point onto the 2D image plane to determine a projected 2D index, the 2D image plane being defined by a set of defined virtual camera parameters, the set of virtual camera parameters including a camera projection matrix defining the 2D image plane; and storing each 3D coordinate of the point in a respective array of the set of arrays at the projected 2D index; and output the set of arrays for input to a 2D convolutional neural network (CNN) configured to perform object detection on the set of arrays during training or inference.

14. The processing unit claim 13, wherein the one or more memories store further instructions which when executed by the processor cause the processing unit to: adjust the 3D data according to one or more predefined parameters to generate a set of adjusted 3D data; wherein the 3D data is adjusted by performing at least one of: scaling, shifting, normalizing or quantizing the 3D coordinates; and wherein the set of arrays is generated from the adjusted 3D data.

15. The processing unit of claim 13, wherein the adjusted 3D data contains values adjusted to be recognizable as image data.

16. The processing unit of claim 13, wherein the virtual camera parameters include a definition of a region of interest (ROI), and wherein the processing unit is further configured to implement the data analysis system to: define a subset of the 3D data corresponding to the ROI; and encode the set of arrays from the subset of the 3D data.

17. The processing unit of claim 13, wherein the one or more memories store further instructions which when executed by the processor cause the processing unit to fill in any holes in the set of arrays using dilation.

18. The processing unit of claim 13, wherein the one or more memories store further instructions which when executed by the processor cause the processing unit to: encode the set of 3D data to a second set of arrays, the encoding including projecting data points in the set of 3D data onto a common second 2D image plane, using a second set of virtual camera parameters defining a second 2D image plane.

19. The processing unit of claim 18, wherein a mapping index is stored to map each calculated projected 2D index to the respective point in the 3D data, wherein the one or more memories store further instructions whic
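
Two of the dependent steps in the claims, filling holes using dilation (claims 5 and 17) and using a stored mapping index to carry a 2D object mask back to 3D points (claim 11), can be illustrated with a small sketch. Everything here is an assumption made for the example rather than the method the patent specifies: the helper names, the use of `None` to mark an empty cell, the 3x3 neighbourhood, and the choice to dilate only into empty cells so that projected values are left unchanged.

```python
from typing import List, Optional, Sequence

Grid = List[List[Optional[float]]]          # one encoded array; None = hole
IndexGrid = Sequence[Sequence[Optional[int]]]  # pixel -> index of source point


def fill_holes_by_dilation(channel: Grid) -> Grid:
    """Fill each empty cell with the max of its filled 3x3 neighbours.

    This is a grey-scale dilation (max filter) applied only where the
    array has no projected value; cells with no filled neighbour stay None.
    """
    height, width = len(channel), len(channel[0])
    out = [row[:] for row in channel]
    for r in range(height):
        for c in range(width):
            if channel[r][c] is not None:
                continue  # keep values that came from real points
            neighbours = [
                channel[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
                if channel[rr][cc] is not None
            ]
            if neighbours:
                out[r][c] = max(neighbours)
    return out


def mask_to_point_indices(
    mask: Sequence[Sequence[bool]], mapping_index: IndexGrid
) -> List[int]:
    """Collect the indices of 3D points whose pixel lies inside a 2D mask."""
    return sorted(
        mapping_index[r][c]
        for r, row in enumerate(mask)
        for c, inside in enumerate(row)
        if inside and mapping_index[r][c] is not None
    )
```

In this sketch `mapping_index[r][c]` would be written at encoding time with the position of the point in the original point cloud, so the list returned by `mask_to_point_indices` selects the cluster of 3D points that belongs to one detected object.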

Classifications

G06V20/58: Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
G06V10/803: of input or preprocessed data
G06V10/255: Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
G06V20/64 (Primary): Three-dimensional [3D] objects
G06V10/82: using neural networks

Patent family

Related publications grouped by family.

View patent family 70550266
