Three-dimensional bounding box from two-dimensional image and point cloud data

US10438371B2 · US · B2

Patent metadata
Publication number: US-10438371-B2
Application number: US-201715797573-A
Country: US
Kind code: B2
Filing date: Oct 30, 2017
Priority date: Sep 22, 2017
Publication date: Oct 8, 2019
Grant date: Oct 8, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A three-dimensional bounding box is determined from a two-dimensional image and a point cloud. A feature vector associated with the image and a feature vector associated with the point cloud may be passed through a neural network to determine parameters of the three-dimensional bounding box. Feature vectors associated with each of the points in the point cloud may also be determined and considered to produce estimates of the three-dimensional bounding box on a per-point basis.
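
The fusion step summarized in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: the vector sizes, the two-layer shape, the random placeholder weights, and the helper names `dense`, `init_layer`, and `estimate_box` are all assumptions made for the example. The eight-corner output follows claim 6, and the fully connected layers follow claim 4.

```python
import random


def dense(x, weights, bias, relu=True):
    """One fully connected layer: y = W x + b, optionally followed by ReLU."""
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, o) for o in out] if relu else out


def init_layer(n_in, n_out, rng):
    """Random placeholder weights (assumption); a real network would be trained."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def estimate_box(appearance_vec, geometric_vec, layers):
    """Concatenate both feature vectors and regress 8 corners of (x, y, z)."""
    x = list(appearance_vec) + list(geometric_vec)
    for i, (w, b) in enumerate(layers):
        # ReLU on hidden layers only; the last layer outputs raw coordinates.
        x = dense(x, w, b, relu=(i < len(layers) - 1))
    return [tuple(x[3 * k:3 * k + 3]) for k in range(8)]


rng = random.Random(0)
appearance_vec = [rng.random() for _ in range(16)]  # hypothetical image feature
geometric_vec = [rng.random() for _ in range(12)]   # hypothetical point cloud feature
layers = [init_layer(28, 32, rng), init_layer(32, 24, rng)]  # 24 = 8 corners x 3
corners = estimate_box(appearance_vec, geometric_vec, layers)
```

With untrained weights the corner values are meaningless; the sketch only shows how the two feature vectors are joined and mapped to 24 output coordinates.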

First claim

Opening claim text (preview).

What we claim is:

1. A system for estimating a three-dimensional bounding box, the system including a non-transitory computer readable medium containing instructions that, when executed by one or more processors, cause the system to: receive an image captured from an image capture device; detect an object in the image; crop the image to form a cropped image including the object; receive point cloud data associated with the object; determine, using a first processing algorithm, a first feature vector associated with the point cloud data, the first feature vector comprising a geometric feature vector associated with one or more locations of one or more points in the point cloud data; determine, using a second processing algorithm, a second feature vector associated with the cropped image, the second feature vector comprising an appearance feature vector; pass the first feature vector and the second feature vector into a neural network; and receive, from the neural network, coordinates descriptive of a three-dimensional bounding box associated with the object.

2. The system of claim 1, wherein the instructions further cause the system to: determine a plurality of third feature vectors, a first of the plurality of third feature vectors corresponding to a first point in the point cloud data and a second of the plurality of third feature vectors corresponding to a second point in the point cloud data; pass the plurality of third feature vectors into the neural network with the first plurality of feature vectors and the second plurality of feature vectors; determine for the first point a first set of offsets and a first confidence score, the first set of offsets corresponding to first estimated positions of corners of the three-dimensional bounding box relative to the first point; and determine for the second point a second set of offsets and a second confidence score, the second set of offsets corresponding to second estimated positions of the corners of the three-dimensional bounding box relative to the second point, wherein the system receives coordinates corresponding to the first estimated positions when the first confidence score is higher than the second confidence score and the system receives coordinates corresponding to the second estimated positions when the second confidence score is higher than the first confidence score.

3. The system of claim 1, wherein the instructions cause the system to extract the first feature vector from a processing layer of a point cloud neural network configured to process raw point cloud data, and wherein the instructions cause the system to extract the second feature vector from a residual learning neural network.

4. The system of claim 1, wherein the neural network comprises one or more of fully connected layers.

5. The system of claim 1, wherein the instructions further cause the system to: normalize the point cloud data by transforming the point cloud data to the origin.

6. The system of claim 1, wherein the coordinates comprise eight points, each of the eight points associated with a respective corner of the three-dimensional bounding box.

7. The system of claim 2, wherein the first neural network is trained in a supervised manner using a dataset identifying whether points are within a three-dimensional bounding box or outside the three-dimensional bounding box.

8. The system of claim 2, wherein the first neural network is trained using a bounding box loss function comprising a regression loss for the bounding box.

9. The system of claim 2, wherein the instructions further cause the system to: determine a first portion of the cropped image associated with the first point; and determine a second portion of the cropped image associated with the second point, wherein at least one of the first portion or the second portion is determined, at least in part, using bilinear interpolation.

10. A computer-implemented method for estimating a three-dimensional bounding box of an object in an environment, the computer-implemented method comprising: receiving an image of the environment from an image capture device; receiving point cloud data associated with the environment, the point cloud data comprising a plurality of points; detecting an object in the image; cropping the image to form a cropped image comprising an image of the object; inputting the cropped image into a first neural network; inputting the point cloud into a second neural network; extracting from the first neural network an appearance feature vector associated with the cropped image; extracting from the second neural network a global geometric feature vector associated with the point cloud data; extracting from the second neural network a plurality of per-point geometric feature vectors, individual of the per-point geometric feature vectors being associated with individual of the plurality of points; inputting the appearance feature vector, the global geometric feature vector, and the plurality of per-point geometric feature vectors into a third neural network; and receiving from the third neural network information associated with a three-dimensional bounding box of the object.

11. The computer-implemented method of claim 10, wherein the receiving the information associated with the three-dimensional bounding box comprises receiving a plurality of displacements relative to a point in the point cloud, the displacements corresponding to corners of the three-dimensional bounding box.

12. The computer-implemented method of claim 10, wherein the third neural network determines, for each point in the point cloud, a plurality of offsets and a confidence score, wherein the offsets comprise displacements from estimated corners of the three-dimensional bounding box relative to the respective point, and wherein the receiving the three-dimensional bounding box comprises receiving parameters associated with the point having the highest confidence score.

13. The computer-implemented method of claim 10, wherein the third neural network is trained using a bounding box loss function comprising a regression loss for the bounding box.

14. The computer-implemented method of claim 10, wherein the third neural network is trained in a supervised manner using an indication of whether a point is inside a three-dimensional bounding box or outside the three-dimensional bounding box.

15. The computer-implemented method of claim 10, wherein the inputting the image appearance feature vector, the global geometric feature vector, and the plurality of per-point geometric feature vectors into a third neural network comprises concatenating each individual of the per-point geometric feature vectors with the global geometric feature vector.

16. A system for estimating a three-dimensional bounding box, the system comprising: an autonomous vehicle; an image capture device associated with the autonomous vehicle and configured to capture images in an environment of the autonomous vehicle; a sensor associated with the autonomous vehicle and configured to output point cloud data corresponding to the environment; one or more processors; and non-transitory computer readable medium containing instructions that, when executed by the one or more processors, cause the system to: receive an image captured by the image capture device; detect an object in the image; crop the image to form a cropped image including the object; receive the point cloud data; determine, using a first processing algorithm, a first feature vector associated with the point cloud data, the first feature vector being a
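
Claims 2, 11, and 12 describe a per-point variant: each point in the cloud gets a set of corner offsets and a confidence score, and the box reported is the one predicted from the highest-scoring point. A minimal sketch of that selection step might look like the following. The function name `select_box`, the data layout (lists of tuples), and the sample numbers are assumptions for illustration; the offsets and scores would normally come from the network rather than being written by hand.

```python
def select_box(points, offsets, scores):
    """Return the 8 box corners predicted from the highest-confidence point.

    points  : list of (x, y, z) point positions
    offsets : per point, a list of 8 (dx, dy, dz) corner displacements
    scores  : per point, a confidence score
    """
    if not points or not (len(points) == len(offsets) == len(scores)):
        raise ValueError("points, offsets, and scores must be non-empty and aligned")
    best = max(range(len(scores)), key=scores.__getitem__)
    px, py, pz = points[best]
    # Corner position = point position + that point's predicted displacement.
    return [(px + dx, py + dy, pz + dz) for dx, dy, dz in offsets[best]]


# Hypothetical data: a unit cube seen from two points.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
offsets = [
    cube,                                              # exact offsets from the first point
    [(x - 0.8, y - 1.0, z - 1.0) for x, y, z in cube],  # noisy offsets from the second point
]
scores = [0.9, 0.4]
box = select_box(points, offsets, scores)
```

Here the first point has the higher score, so `box` holds the corners computed from its offsets and the noisier estimate from the second point is ignored.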

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10438371B2 cover?
A three-dimensional bounding box is determined from a two-dimensional image and a point cloud. A feature vector associated with the image and a feature vector associated with the point cloud may be passed through a neural network to determine parameters of the three-dimensional bounding box. Feature vectors associated with each of the points in the point cloud may also be determined and conside…
Who is the assignee on this patent?
Zoox Inc
What technology area does this patent fall under?
Primary CPC classification G01S7/417. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 8, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).