Voxel-based feature learning network
US-10970518-B1 · Apr 6, 2021 · US
US11520347B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11520347-B2 |
| Application number | US-201916255789-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 23, 2019 |
| Priority date | Jan 23, 2019 |
| Publication date | Dec 6, 2022 |
| Grant date | Dec 6, 2022 |
Official abstract text for this publication.
According to various embodiments, systems and methods described in the disclosure combine map features with point cloud features to improve object detection precision of an autonomous driving vehicle (ADV). The map features and the point cloud features can be extracted from a perception area of the ADV within a particular angle of view at each driving cycle based on a position of the ADV. The map features and the point cloud features can be concatenated and provided to a neural network for object detection.
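The two steps named in the abstract (cropping to a perception area within an angle of view, then concatenating the two feature sets) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `in_view`, `crop_to_view`, and `concatenate_features`, the sector-shaped perception area, and the parameters `half_angle` and `max_range` are all hypothetical and do not come from the patent text.

```python
import math


def in_view(x, y, heading, half_angle, max_range):
    """Return True if (x, y) lies inside a sector centered on `heading`.

    Assumption: the perception area is modeled as a circular sector with
    the vehicle at the origin. The patent does not specify this geometry.
    """
    if math.hypot(x, y) > max_range:
        return False
    bearing = math.atan2(y, x)
    # Wrap the angular difference into [-pi, pi) before comparing.
    diff = (bearing - heading + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) <= half_angle


def crop_to_view(points, heading, half_angle, max_range):
    """Keep only the points whose (x, y) position falls inside the sector."""
    return [p for p in points
            if in_view(p[0], p[1], heading, half_angle, max_range)]


def concatenate_features(point_cloud_features, map_features):
    """Join the two feature sequences into a single feature list."""
    return list(point_cloud_features) + list(map_features)


if __name__ == "__main__":
    cloud = [(10.0, 0.0, 1.0), (-10.0, 0.0, 1.0), (0.0, 10.0, 0.5)]
    visible = crop_to_view(cloud, heading=0.0,
                           half_angle=math.pi / 4, max_range=50.0)
    print(visible)
    print(concatenate_features([0.1, 0.2], [0.9]))
```

In a real pipeline the same crop would be applied to the map as well as to the point cloud, so that both feature sets describe the same region before they are concatenated.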
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method of operating an autonomous driving vehicle (ADV), the method comprising: extracting a plurality of map features from a map associated with a road in which the ADV is driving, wherein the plurality of map features are extracted from a portion of the map, the portion of the map corresponding to a perception area of the ADV within a particular angle of view at each driving cycle, wherein the particular angle of view corresponds to a heading of the ADV; extracting a plurality of point cloud features from a portion of a point cloud of LIDAR data, the portion of the point cloud corresponding to the perception area of the ADV within the particular angle of view; concatenating the plurality of point cloud features and the plurality of map features into a feature list; providing the feature list as input to one or more neural networks, which detect one or more objects in a driving environment based on the input; and generating a trajectory during each driving cycle of the ADV in view of the detected objects to drive the ADV through the detected objects.
2. The method of claim 1, wherein the plurality of map features are extracted using a convolution neural network, and include one or more lanes, one or more lane boundaries, one or more traffic signs, and one or more road curbs.
3. The method of claim 1, wherein the extracting the plurality of map features from the map includes: forming a plurality of layers, each layer corresponding to one of the plurality of map features extracted from the map; converting the plurality of layers into a red, green, and blue (RGB) representation; and extracting the plurality of map features from the RGB representation.
4. The method of claim 1, wherein the extracted map features are pre-calculated and cached to speed up inference of the one or more neural networks.
5. The method of claim 1, wherein the plurality of point cloud features are extracted using a fully connected network, which is to partition a space within the perception area into a plurality of equally spaced voxels, to encode each non-empty voxel with a plurality of point-wise features, and to combine the point-wise features with a locally aggregated feature.
6. The method of claim 5, wherein the plurality of point-wise features for each non-empty voxel represent statistical quantities derived from all LiDAR points within that voxel, and include a distance from the center of the voxel to an origin of the point cloud, a maximum height of LiDAR points within the voxel, and a mean height of LiDAR points within the voxel.
7. The method of claim 1, wherein the one or more neural networks include a convolution neural network and a region proposal network, wherein the convolution neural network generates a feature map based on the plurality of map features and the plurality of point cloud features, and wherein the region proposal network maps the feature map to one or more desired learning targets to generate object detections.
8. A non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, causing the processor to perform operations of operating an autonomous driving vehicle (ADV), the operations comprising: extracting a plurality of map features from a map associated with a road in which the ADV is driving, wherein the plurality of map features are extracted from a portion of the map, the portion of the map corresponding to a perception area of the ADV within a particular angle of view at each driving cycle, wherein the particular angle of view corresponds to a heading of the ADV; extracting a plurality of point cloud features from a portion of a point cloud of LIDAR data, the portion of the point cloud corresponding to the perception area of the ADV within the particular angle of view; concatenating the plurality of point cloud features and the plurality of map features into a feature list; providing the feature list as input to one or more neural networks, which detect one or more objects in a driving environment based on the input; and generating a trajectory during each driving cycle of the ADV in view of the detected objects to drive the ADV through the detected objects.
9. The machine-readable medium of claim 8, wherein the plurality of map features are extracted using a convolution neural network, and include one or more lanes, one or more lane boundaries, one or more traffic signs, and one or more road curbs.
10. The machine-readable medium of claim 8, wherein the extracting the plurality of map features from the map comprises: forming a plurality of layers, each layer corresponding to one of the plurality of map features extracted from the map; converting the plurality of layers into a red, green, and blue (RGB) representation; and extracting the plurality of map features from the RGB representation using one or more convolution layers of a convolution neural network.
11. The machine-readable medium of claim 8, wherein the extracted map features are pre-calculated and cached to speed up inference of the one or more neural networks.
12. The machine-readable medium of claim 8, wherein the plurality of point cloud features are extracted using a fully connected network, which is to partition a space within the perception area into a plurality of equally spaced voxels, to encode each non-empty voxel with a plurality of point-wise features, and to combine the point-wise features with a locally aggregated feature.
13. The machine-readable medium of claim 12, wherein the plurality of point-wise features for each non-empty voxel represent statistical quantities derived from all LiDAR points within that voxel, and include a distance from the center of the voxel to an origin of the point cloud, a maximum height of LiDAR points within the voxel, and a mean height of LiDAR points within the voxel.
14. The machine-readable medium of claim 8, wherein the one or more neural networks include a convolution neural network and a region proposal network, wherein the convolution neural network generates a feature map based on the plurality of map features and the plurality of point cloud features, and wherein the region proposal network maps the feature map to one or more desired learning targets to generate object detections.
15. A data processing system, comprising: a processor; and a memory coupled to the processor to store instructions, which when executed by a processor, causing the processor to perform operations of operating an autonomous driving vehicle (ADV), the operations comprising: extracting a plurality of map features from a map associated with a road in which the ADV is driving, wherein the plurality of map features are extracted from a portion of the map, the portion of the map corresponding to a perception area of the ADV within a particular angle of view at each driving cycle, wherein the particular angle of view corresponds to a heading of the ADV, extracting a plurality of point cloud features from a portion of a point cloud of LIDAR data, the portion of the point cloud corresponding to the perception area of the ADV within the particular angle of view, concatenating the plurality of point cloud features and the plurality of map features into a feature list, providing the feature list as input to one or more neural networks, which detect one or more objects in a driving environment based on the input, and generating a trajectory during each driving cycle of the ADV in view of the detected objects to drive the ADV through the detected objects.
16. The system of claim 15
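The voxel statistics recited in claims 5 and 6 (equally spaced voxels, with each non-empty voxel encoded by a distance from its center to the point cloud origin, a maximum height, and a mean height) can be illustrated with a short sketch. It rests on several assumptions that are not in the claims: the function name `voxel_features`, the dictionary layout, the choice of the z coordinate as height, and an origin at (0, 0, 0) are all hypothetical. The fully connected network and the locally aggregated feature mentioned in claim 5 are not modeled here.

```python
import math
from collections import defaultdict


def voxel_features(points, voxel_size):
    """Group (x, y, z) points into cubic voxels and compute per-voxel stats.

    Assumptions: the point cloud origin is (0, 0, 0) and z is height.
    Only non-empty voxels appear in the result.
    """
    voxels = defaultdict(list)
    for x, y, z in points:
        # Integer voxel index along each axis (equally spaced grid).
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        voxels[key].append((x, y, z))

    features = {}
    for key, pts in voxels.items():
        # Geometric center of the voxel cell, not the mean of its points.
        center = [(idx + 0.5) * voxel_size for idx in key]
        heights = [p[2] for p in pts]
        features[key] = {
            "center_distance": math.sqrt(sum(c * c for c in center)),
            "max_height": max(heights),
            "mean_height": sum(heights) / len(heights),
        }
    return features


if __name__ == "__main__":
    cloud = [(0.2, 0.2, 0.1), (0.4, 0.6, 0.3), (2.5, 0.5, 0.9)]
    for key, stats in sorted(voxel_features(cloud, voxel_size=1.0).items()):
        print(key, stats)
```

Storing voxels in a dictionary keyed by grid index means empty voxels cost nothing, which matches the claims' focus on non-empty voxels.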
Combinations of networks · CPC title
of land vehicles · CPC title
for mapping or imaging · CPC title
Map- or contour-matching · CPC title
Combination of radar systems with lidar systems · CPC title