System for and method of generating user-selectable novel views on a viewing device
US-2018367788-A1 · Dec 20, 2018 · US
US10438371B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10438371-B2 |
| Application number | US-201715797573-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 30, 2017 |
| Priority date | Sep 22, 2017 |
| Publication date | Oct 8, 2019 |
| Grant date | Oct 8, 2019 |
Official abstract text for this publication.
A three-dimensional bounding box is determined from a two-dimensional image and a point cloud. A feature vector associated with the image and a feature vector associated with the point cloud may be passed through a neural network to determine parameters of the three-dimensional bounding box. Feature vectors associated with each of the points in the point cloud may also be determined and considered to produce estimates of the three-dimensional bounding box on a per-point basis.
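The per-point estimation described in the abstract can be illustrated with a small sketch. It assumes, as the claims suggest, that each point in the cloud yields eight corner offsets relative to that point plus a confidence score, and that the box from the highest-scoring point is returned. The function name `select_box` and the plain-list data layout are hypothetical, chosen only for illustration; the neural network that would produce the offsets and scores is not modeled here.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def select_box(
    points: Sequence[Vec3],
    offsets: Sequence[Sequence[Vec3]],
    scores: Sequence[float],
) -> List[Vec3]:
    """Return the eight box corners predicted by the most confident point.

    points  -- N point-cloud locations
    offsets -- for each point, eight (dx, dy, dz) corner offsets relative to it
    scores  -- for each point, a confidence score (higher is better)
    """
    if not points or not (len(points) == len(offsets) == len(scores)):
        raise ValueError("points, offsets and scores must be non-empty and equal in length")

    # Index of the point whose prediction has the highest confidence.
    best = max(range(len(points)), key=lambda i: scores[i])
    if len(offsets[best]) != 8:
        raise ValueError("expected eight corner offsets per point")

    # Corner position = point location + predicted offset for that corner.
    px, py, pz = points[best]
    return [(px + dx, py + dy, pz + dz) for dx, dy, dz in offsets[best]]
```

Selecting a single winning point is only one way to combine per-point estimates; it is used here because it matches the highest-confidence selection recited in the claims.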
Opening claim text (preview).
What we claim is:
1. A system for estimating a three-dimensional bounding box, the system including a non-transitory computer readable medium containing instructions that, when executed by one or more processors, cause the system to: receive an image captured from an image capture device; detect an object in the image; crop the image to form a cropped image including the object; receive point cloud data associated with the object; determine, using a first processing algorithm, a first feature vector associated with the point cloud data, the first feature vector comprising a geometric feature vector associated with one or more locations of one or more points in the point cloud data; determine, using a second processing algorithm, a second feature vector associated with the cropped image, the second feature vector comprising an appearance feature vector; pass the first feature vector and the second feature vector into a neural network; and receive, from the neural network, coordinates descriptive of a three-dimensional bounding box associated with the object.
2. The system of claim 1, wherein the instructions further cause the system to: determine a plurality of third feature vectors, a first of the plurality of third feature vectors corresponding to a first point in the point cloud data and a second of the plurality of third feature vectors corresponding to a second point in the point cloud data; pass the plurality of third feature vectors into the neural network with the first plurality of feature vectors and the second plurality of feature vectors; determine for the first point a first set of offsets and a first confidence score, the first set of offsets corresponding to first estimated positions of corners of the three-dimensional bounding box relative to the first point; and determine for the second point a second set of offsets and a second confidence score, the second set of offsets corresponding to second estimated positions of the corners of the three-dimensional bounding box relative to the second point, wherein the system receives coordinates corresponding to the first estimated positions when the first confidence score is higher than the second confidence score and the system receives coordinates corresponding to the second estimated positions when the second confidence score is higher than the first confidence score.
3. The system of claim 1, wherein the instructions cause the system to extract the first feature vector from a processing layer of a point cloud neural network configured to process raw point cloud data, and wherein the instructions cause the system to extract the second feature vector from a residual learning neural network.
4. The system of claim 1, wherein the neural network comprises one or more of fully connected layers.
5. The system of claim 1, wherein the instructions further cause the system to: normalize the point cloud data by transforming the point cloud data to the origin.
6. The system of claim 1, wherein the coordinates comprise eight points, each of the eight points associated with a respective corner of the three-dimensional bounding box.
7. The system of claim 2, wherein the first neural network is trained in a supervised manner using a dataset identifying whether points are within a three-dimensional bounding box or outside the three-dimensional bounding box.
8. The system of claim 2, wherein the first neural network is trained using a bounding box loss function comprising a regression loss for the bounding box.
9. The system of claim 2, wherein the instructions further cause the system to: determine a first portion of the cropped image associated with the first point; and determine a second portion of the cropped image associated with the second point, wherein at least one of the first portion or the second portion is determined, at least in part, using bilinear interpolation.
10. A computer-implemented method for estimating a three-dimensional bounding box of an object in an environment, the computer-implemented method comprising: receiving an image of the environment from an image capture device; receiving point cloud data associated with the environment, the point cloud data comprising a plurality of points; detecting an object in the image; cropping the image to form a cropped image comprising an image of the object; inputting the cropped image into a first neural network; inputting the point cloud into a second neural network; extracting from the first neural network an appearance feature vector associated with the cropped image; extracting from the second neural network a global geometric feature vector associated with the point cloud data; extracting from the second neural network a plurality of per-point geometric feature vectors, individual of the per-point geometric feature vectors being associated with individual of the plurality of points; inputting the appearance feature vector, the global geometric feature vector, and the plurality of per-point geometric feature vectors into a third neural network; and receiving from the third neural network information associated with a three-dimensional bounding box of the object.
11. The computer-implemented method of claim 10, wherein the receiving the information associated with the three-dimensional bounding box comprises receiving a plurality of displacements relative to a point in the point cloud, the displacements corresponding to corners of the three-dimensional bounding box.
12. The computer-implemented method of claim 10, wherein the third neural network determines, for each point in the point cloud, a plurality of offsets and a confidence score, wherein the offsets comprise displacements from estimated corners of the three-dimensional bounding box relative to the respective point, and wherein the receiving the three-dimensional bounding box comprises receiving parameters associated with the point having the highest confidence score.
13. The computer-implemented method of claim 10, wherein the third neural network is trained using a bounding box loss function comprising a regression loss for the bounding box.
14. The computer-implemented method of claim 10, wherein the third neural network is trained in a supervised manner using an indication of whether a point is inside a three-dimensional bounding box or outside the three-dimensional bounding box.
15. The computer-implemented method of claim 10, wherein the inputting the image appearance feature vector, the global geometric feature vector, and the plurality of per-point geometric feature vectors into a third neural network comprises concatenating each individual of the per-point geometric feature vectors with the global geometric feature vector.
16. A system for estimating a three-dimensional bounding box, the system comprising: an autonomous vehicle; an image capture device associated with the autonomous vehicle and configured to capture images in an environment of the autonomous vehicle; a sensor associated with the autonomous vehicle and configured to output point cloud data corresponding to the environment; one or more processors; and non-transitory computer readable medium containing instructions that, when executed by the one or more processors, cause the system to: receive an image captured by the image capture device; detect an object in the image; crop the image to form a cropped image including the object; receive the point cloud data; determine, using a first processing algorithm, a first feature vector associated with the point cloud data, the first feature vector being a
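Two of the data-preparation steps recited in the claims, normalizing the point cloud to the origin (claim 5) and concatenating each per-point geometric feature vector with the global geometric feature vector (claim 15), can be sketched as follows. This is a minimal illustration under stated assumptions: the claim text does not say how the transform to the origin is computed, so subtracting the centroid is an assumption, and appending the appearance vector after the global vector is likewise an assumed layout. The names `center_points` and `fuse_features` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def center_points(points: Sequence[Vec3]) -> List[Vec3]:
    """Shift a point cloud so its centroid sits at the origin.

    Assumption: "transforming the point cloud data to the origin" is
    modeled here as centroid subtraction.
    """
    n = len(points)
    if n == 0:
        raise ValueError("point cloud is empty")
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in points]


def fuse_features(
    per_point: Sequence[Sequence[float]],
    global_geometric: Sequence[float],
    appearance: Sequence[float],
) -> List[List[float]]:
    """Build one fused input row per point.

    Each row is: per-point geometric features, then the shared global
    geometric features, then the shared appearance features (assumed order).
    """
    shared = list(global_geometric) + list(appearance)
    return [list(features) + shared for features in per_point]
```

Each fused row would then be a candidate input to the network that predicts per-point corner offsets and confidence scores; that network is outside the scope of this sketch.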
involving 3D image data · CPC title
Traffic on road, railway or crossing · CPC title
involving the use of neural networks · CPC title
of extracted features · CPC title
using neural networks · CPC title