Object detection and classification using lidar range images for autonomous machine applications
US-2021063578-A1 · Mar 4, 2021 · US
US11623661B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11623661-B2 |
| Application number | US-202017068425-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 12, 2020 |
| Priority date | Oct 12, 2020 |
| Publication date | Apr 11, 2023 |
| Grant date | Apr 11, 2023 |
Official abstract text for this publication.
Techniques for controlling a vehicle based on height data and/or classification data being determined utilizing multi-channel image data are discussed herein. The vehicle can capture lidar data as it traverses an environment. The lidar data can be associated with a voxel space as three-dimensional data. Semantic information can be determined and associated with the lidar data and/or the three-dimensional voxel space. A multi-channel input image can be determined based on the three-dimensional voxel space and input into a machine learned (ML) model. The ML model can output data to determine height data and/or classification data associated with a ground surface of the environment. The height data and/or classification data can be utilized to determine a mesh associated with the ground surface. The mesh can be used to control the vehicle and/or determine additional objects proximate the vehicle.
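The step in the abstract that turns the three-dimensional voxel space into a multi-channel input image can be illustrated with a minimal sketch. This is not the patent's implementation: the function name `build_multichannel_image`, the point layout `(x, y, z, p_ground)`, the grid size, and the choice of channels (minimum, maximum, and mean height plus mean ground probability per column) are all assumptions made for illustration, loosely following the feature types listed in claim 2.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical point layout: (x, y, z, p_ground), where p_ground stands in for
# a per-point semantic probability of belonging to a drivable surface.
Point = Tuple[float, float, float, float]
Channels = Dict[str, List[List[float]]]


def build_multichannel_image(
    points: List[Point], cell_size: float = 0.5, grid_dim: int = 4
) -> Channels:
    """Collapse each vertical column of points into per-cell feature channels."""
    columns = defaultdict(list)
    for x, y, z, p_ground in points:
        row, col = int(y // cell_size), int(x // cell_size)
        # Points outside the grid are ignored in this sketch.
        if 0 <= row < grid_dim and 0 <= col < grid_dim:
            columns[(row, col)].append((z, p_ground))

    names = ("min_z", "max_z", "mean_z", "p_ground")
    # Empty cells keep the placeholder value 0.0 (an assumption of this sketch).
    image = {n: [[0.0] * grid_dim for _ in range(grid_dim)] for n in names}
    for (row, col), samples in columns.items():
        zs = [z for z, _ in samples]
        ps = [p for _, p in samples]
        image["min_z"][row][col] = min(zs)
        image["max_z"][row][col] = max(zs)
        image["mean_z"][row][col] = sum(zs) / len(zs)
        image["p_ground"][row][col] = sum(ps) / len(ps)
    return image


if __name__ == "__main__":
    pts = [(0.1, 0.1, 0.0, 0.9), (0.2, 0.3, 0.2, 0.7), (1.2, 0.1, 1.5, 0.1)]
    img = build_multichannel_image(pts)
    print(img["min_z"][0][0], img["max_z"][0][2])
```

Each channel is one two-dimensional layer, so the result has the "first layer / second layer" shape that claim 3 describes. A real system would presumably feed such layers to the ML model as a tensor; that step is omitted here.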
Opening claim text (preview).
What is claimed is:
1. A system comprising: one or more processors; and one or more computer-readable media storing instructions executable by the one or more processors, wherein the instructions, when executed, cause the system to perform operations comprising: receiving lidar data of an environment captured by a lidar sensor of an autonomous vehicle; determining semantic information associated with the lidar data; associating the lidar data with a three-dimensional voxel space; associating the semantic information with the three-dimensional voxel space; determining, based at least in part on a portion of the three-dimensional voxel space, first multi-channel image data comprising feature data; inputting the first multi-channel image data to a machine learned (ML) model; receiving, from the ML model, second multi-channel image data comprising height data and classification data; determining, based at least in part on the height data and the classification data, a mesh associated with a ground surface of the environment; and controlling the autonomous vehicle based at least in part on the mesh.
2. The system of claim 1, the operations further comprising: determining, as the feature data, at least one of: minimum height data; maximum height data; average height data; covariance data; or surface normal data.
3. The system of claim 1, wherein the first multi-channel image data comprises a first two-dimensional layer associated with first feature data and a second two-dimensional layer associated with second feature data.
4. The system of claim 1, wherein the semantic information comprises a classification probability that a lidar point is associated with a drivable surface.
5. The system of claim 1, wherein determining the mesh associated with the ground surface comprises: determining a mesh vertex based at least in part on the height data; and determining a polygon based at least in part on the mesh vertex.
6. A method comprising: receiving lidar data of an environment captured by a lidar sensor of an autonomous vehicle; determining, based at least in part on the lidar data, multi-channel image data; inputting the multi-channel image data to a machine learned (ML) model; determining, based on output data received from the ML model, ground surface height data and ground surface classification data; and controlling the autonomous vehicle based on the ground surface height data and the ground surface classification data.
7. The method of claim 6, further comprising: associating the lidar data with a three-dimensional voxel space; determining, based at least in part on the lidar data, semantic information; associating the semantic information with the three-dimensional voxel space; and determining the multi-channel image data based at least in part on the three-dimensional voxel space.
8. The method of claim 6, wherein the output data comprises a first layer associated with the ground surface height data and a second layer associated with the ground surface classification data.
9. The method of claim 6, further comprising: associating the lidar data with a three-dimensional voxel space; and determining a portion of the multi-channel image data based at least in part on a column of the three-dimensional voxel space.
10. The method of claim 6, further comprising: determining, based on the ground surface height data and the ground surface classification data, second multi-channel image data; determining, for a point of the lidar data, a distance from a portion of the ground surface height data; and determining, based at least in part on the distance, that the point is associated with an object.
11. The method of claim 6, further comprising: determining a probability that a lidar point of the lidar data is associated with a ground surface of the environment, wherein the multi-channel image data comprises the probability.
12. The method of claim 6, further comprising: determining, based at least in part on the output data, a mesh associated with a ground surface of the environment.
13. One or more non-transitory computer-readable media storing instructions executable by a processor, wherein the instructions, when executed, cause the processor to perform operations comprising: receiving lidar data of an environment captured by a lidar sensor of an autonomous vehicle; determining, based at least in part on the lidar data, multi-channel image data; inputting the multi-channel image data to a machine learned (ML) model; determining, based on output data received from the ML model, ground surface height data and ground surface classification data; and controlling the autonomous vehicle based on the ground surface height data and the ground surface classification data.
14. The one or more non-transitory computer-readable media of claim 13, wherein the multi-channel image data comprises a first channel associated with first feature data and a second channel associated with second feature data.
15. The one or more non-transitory computer-readable media of claim 13, further comprising: associating the lidar data with a three-dimensional voxel space; determining, based at least in part on the lidar data, semantic information; associating the semantic information with the three-dimensional voxel space; and determining the multi-channel image data based at least in part on the three-dimensional voxel space.
16. The one or more non-transitory computer-readable media of claim 13, wherein the output data comprises a first layer associated with the ground surface height data and a second layer associated with the ground surface classification data.
17. The one or more non-transitory computer-readable media of claim 13, further comprising: associating the lidar data with a three-dimensional voxel space; and determining a portion of the multi-channel image data based at least in part on a column of the three-dimensional voxel space.
18. The one or more non-transitory computer-readable media of claim 13, the operations further comprising: determining, based on the ground surface height data and the ground surface classification data, second multi-channel image data; determining, for a point of the lidar data, a distance from a portion of the ground surface height data; and determining, based at least in part on the distance, that the point is associated with an object.
19. The one or more non-transitory computer-readable media of claim 13, the operations further comprising: determining a probability that a lidar point of the lidar data is associated with a ground surface of the environment, wherein the multi-channel image data comprises the probability.
20. The one or more non-transitory computer-readable media of claim 13, wherein the ML model is a first algorithm and is trained based at least in part on ground truth data, the ground truth data determined, based at least in part on at least one of map data, hand-annotated data, or sensor data input by the autonomous vehicle as the autonomous vehicle moves through the environment.
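Two of the claimed steps lend themselves to a short illustration: building a ground mesh from mesh vertices and polygons (claims 5 and 12), and flagging a lidar point as part of an object based on its distance from the ground height (claims 10 and 18). The sketch below is only one possible reading under stated assumptions. The function names, the regular grid of per-cell heights, the triangle layout, the optional `is_ground` mask, and the 0.3 threshold are all hypothetical and do not come from the patent text.

```python
from typing import List, Optional, Tuple

Vertex = Tuple[float, float, float]
Triangle = Tuple[int, int, int]
Grid = List[List[float]]


def height_grid_to_mesh(
    heights: Grid,
    cell_size: float = 1.0,
    is_ground: Optional[List[List[bool]]] = None,
) -> Tuple[List[Vertex], List[Triangle]]:
    """Place one vertex at each cell center and join neighbors into triangles."""
    rows, cols = len(heights), len(heights[0])
    vertices = [
        ((c + 0.5) * cell_size, (r + 0.5) * cell_size, heights[r][c])
        for r in range(rows)
        for c in range(cols)
    ]
    triangles: List[Triangle] = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            corners = [(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)]
            # Assumed use of the classification layer: skip quads that touch
            # any cell not classified as ground.
            if is_ground is not None and not all(is_ground[a][b] for a, b in corners):
                continue
            i = r * cols + c
            triangles.append((i, i + 1, i + cols))
            triangles.append((i + 1, i + cols + 1, i + cols))
    return vertices, triangles


def points_above_ground(
    points: List[Vertex], heights: Grid, cell_size: float = 1.0, min_distance: float = 0.3
) -> List[Vertex]:
    """Return points whose height exceeds the ground height of their cell."""
    objects = []
    for x, y, z in points:
        r, c = int(y // cell_size), int(x // cell_size)
        if 0 <= r < len(heights) and 0 <= c < len(heights[0]):
            if z - heights[r][c] > min_distance:
                objects.append((x, y, z))
    return objects


if __name__ == "__main__":
    ground = [[0.0, 0.1], [0.0, 0.2]]
    verts, tris = height_grid_to_mesh(ground)
    print(len(verts), tris)
    print(points_above_ground([(0.5, 0.5, 0.05), (1.5, 1.5, 1.2)], ground))
```

The distance test here uses the height of the containing cell only. Measuring distance to the interpolated mesh surface would be a natural refinement, but the claims do not specify either choice.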
for active traffic, e.g. moving vehicles, pedestrians, bikes · CPC title
Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes · CPC title
Characteristics · CPC title
Finite element generation, e.g. wire-frame surface description, {tesselation} · CPC title
Pedestrians · CPC title