Classification methods and systems
US-2019026588-A1 · Jan 24, 2019 · US
US12050285B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12050285-B2 |
| Application number | US-202217976581-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 28, 2022 |
| Priority date | Nov 21, 2019 |
| Publication date | Jul 30, 2024 |
| Grant date | Jul 30, 2024 |
Official abstract text for this publication.
In various examples, a deep neural network(s) (e.g., a convolutional neural network) may be trained to detect moving and stationary obstacles from RADAR data of a three dimensional (3D) space. In some embodiments, ground truth training data for the neural network(s) may be generated from LIDAR data. More specifically, a scene may be observed with RADAR and LIDAR sensors to collect RADAR data and LIDAR data for a particular time slice. The RADAR data may be used for input training data, and the LIDAR data associated with the same or closest time slice as the RADAR data may be annotated with ground truth labels identifying objects to be detected. The LIDAR labels may be propagated to the RADAR data, and LIDAR labels containing less than some threshold number of RADAR detections may be omitted. The (remaining) LIDAR labels may be used to generate ground truth data.
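The label-filtering step in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation. It assumes labels are axis-aligned boxes in a top-down plane and RADAR detections are 2D points. The names `BoxLabel`, `filter_labels`, and `min_detections` are hypothetical, and the threshold value is arbitrary because the abstract does not state one.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (x, y) position of one RADAR detection in the top-down plane (assumed layout).
Point = Tuple[float, float]


@dataclass(frozen=True)
class BoxLabel:
    """Hypothetical axis-aligned label propagated from the LIDAR annotation."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    cls: str

    def contains(self, p: Point) -> bool:
        # A detection counts if it lies inside or on the edge of the box.
        return self.x_min <= p[0] <= self.x_max and self.y_min <= p[1] <= self.y_max


def filter_labels(
    labels: List[BoxLabel], detections: List[Point], min_detections: int
) -> List[BoxLabel]:
    """Keep only labels that contain at least `min_detections` RADAR detections."""
    kept = []
    for label in labels:
        count = sum(1 for p in detections if label.contains(p))
        if count >= min_detections:
            kept.append(label)
    return kept


labels = [
    BoxLabel(0.0, 0.0, 2.0, 4.0, "car"),      # contains three detections below
    BoxLabel(10.0, 10.0, 12.0, 14.0, "car"),  # contains one detection below
]
detections = [(0.5, 1.0), (1.5, 3.0), (1.0, 2.0), (11.0, 12.0)]

# With an assumed threshold of 2, only the first label survives.
kept = filter_labels(labels, detections, min_detections=2)
```

Omitting sparsely observed labels in this way avoids asking the network to predict objects that the RADAR input gives it almost no evidence for.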
Opening claim text (preview).
What is claimed is:
1. A processor comprising: one or more processing units to perform one or more operations based at least on one or more outputs of a neural network, the neural network being trained, at least, by: receiving (i) RADAR data collected from a three-dimensional (3D) environment using a RADAR sensor and (ii) 3D data collected from the 3D environment using a different type of sensor than the RADAR sensor; projecting a first instance of the 3D data associated with a time slice to generate a first projection image representative of one or more 3D detections from the different type of sensor within the 3D environment; receiving one or more labels representative of the one or more 3D detections from the different type of sensor, the one or more labels identifying one or more locations in the first projection image corresponding to one or more objects represented by the one or more 3D detections; projecting a second instance of the RADAR data associated with the time slice to generate a RADAR projection image representative of one or more RADAR detections within the 3D environment; and updating one or more parameters of the neural network based at least on the RADAR projection image and ground truth data generated based at least on the one or more labels.
2. The processor of claim 1, the neural network being trained, at least, by: encoding the RADAR projection image and a set of features representative of the one or more RADAR detections and corresponding reflection characteristics into a multi-channel RADAR data tensor; and updating the one or more parameters of the neural network using the multi-channel RADAR data tensor and the ground truth data as training data.
3. The processor of claim 1, wherein the second instance of the RADAR data comprises accumulated, ego-motion-compensated RADAR detections.
4. The processor of claim 1, wherein the ground truth data comprises a class confidence tensor and an instance regression tensor.
5. The processor of claim 1, the ground truth data generated based at least on: propagating the one or more labels to the RADAR projection image to generate one or more propagated labels; determining a number of the one or more RADAR detections corresponding to at least one individual propagated label of the one or more propagated labels; and removing a set of the one or more propagated labels that contain less than a threshold number of the one or more RADAR detections.
6. The processor of claim 1, the ground truth data generated based at least on: removing, from the one or more labels, a set of one or more propagated labels that contain less than a threshold number of the one or more RADAR detections, leaving a remaining set of one or more labels; and generating the ground truth data using the remaining set of one or more labels to generate one or more of location, size, or orientation data for the one or more objects represented by the one or more 3D detections, and encoding the one or more of location, size, or orientation data into one or more corresponding channels of an instance regression tensor.
7. The processor of claim 1, wherein the one or more labels further identify one or more classes of the one or more objects represented by the one or more 3D detections in the first projection image, the ground truth data generated based at least on encoding classification data representative of the one or more classes of the one or more objects into one or more corresponding channels of a class confidence tensor.
8. The processor of claim 1, wherein the one or more labels comprise one or more bounding boxes drawn around one or more stationary vehicles in the 3D environment, the neural network being trained, at least, using the one or more bounding boxes drawn around the one or more stationary vehicles to detect other stationary vehicles from input RADAR data.
9. The processor of claim 1, wherein the first projection image is a projection of a 3D point cloud, wherein the one or more labels comprise a set of closed polylines drawn around each vehicle of one or more vehicles in the first projection image.
10. The processor of claim 1, wherein the different type of sensor comprises a LIDAR sensor, an ultrasonic sensor, or a camera, wherein the 3D data collected from the 3D environment comprises LIDAR data, ultrasound data, 3D stereo camera data, or structure from motion depth estimation data.
11. The processor of claim 1, wherein the processor is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing deep learning operations; a system for performing remote operations; a system for performing streaming; a system for generating or presenting one or more of augmented reality content or virtual reality content; or a system implemented using a robot.
12. A system comprising: one or more processing units to perform one or more operations based at least on one or more outputs of a neural network, the neural network being trained, at least, by: receiving (i) RADAR data collected using a RADAR sensor and (ii) three-dimensional (3D) data collected using a different type of sensor than the RADAR sensor; for at least one frame of one or more frames of the 3D data: generating, using the at least one frame of 3D data, a first projection image representative of one or more 3D detections from the different type of sensor; receiving one or more labels representative of the one or more 3D detections from the different type of sensor, the one or more labels identifying one or more locations in the first projection image corresponding to one or more objects represented by the one or more 3D detections; generating, using a corresponding frame of the RADAR data associated with the at least one frame of 3D data, a RADAR projection image representative of one or more RADAR detections; and updating one or more parameters of the neural network based at least on the RADAR projection image and ground truth data generated based at least on the one or more labels.
13. The system of claim 12, the ground truth data generated based at least on: propagating the one or more labels to the RADAR projection image to generate one or more propagated labels; determining a number of the one or more RADAR detections corresponding to at least one individual propagated label of the one or more propagated labels; removing a set of the one or more propagated labels that contain less than a threshold number of the one or more RADAR detections, leaving a remaining set of one or more labels; and generating the ground truth data using the remaining set of one or more labels.
14. The system of claim 12, the neural network being trained, at least, by: encoding the RADAR projection image and a set of features representative of the one or more RADAR detections and corresponding reflection characteristics into a multi-channel RADAR data tensor; and updating the one or more parameters of the neural network using the multi-channel RADAR data tensor and the ground truth data as training data.
15. The system of claim 12, wherein the ground truth data comprises a class confidence tensor and an instance regression tensor.
16. The system of claim 12, the ground truth data generated using a set of the one or more labels to generate one or more of location, size, or orientation data for the one or more objects represented by the one or more 3D detections, and encoding the one or more of location,
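
Claims 4, 6, and 7 describe ground truth made of a class confidence tensor and an instance regression tensor. The sketch below shows one possible encoding on a small top-down grid. It is an illustration under stated assumptions, not the claimed method. The class list `CLASSES`, the six-channel layout (center offset, width, length, cosine and sine of yaw), and the names `Label` and `encode_ground_truth` are all hypothetical. For simplicity the label footprint is treated as axis-aligned, so yaw only affects the regression values.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

CLASSES = ["car", "truck"]  # hypothetical class list, one confidence channel each
REG_CHANNELS = 6            # assumed layout: dx, dy, width, length, cos(yaw), sin(yaw)

Tensor = List[List[List[float]]]  # indexed as [channel][row][col]


@dataclass(frozen=True)
class Label:
    """Hypothetical label in grid-cell units on the top-down projection."""
    cx: float
    cy: float
    width: float
    length: float
    yaw: float
    cls: str


def zeros(channels: int, height: int, width: int) -> Tensor:
    return [[[0.0] * width for _ in range(height)] for _ in range(channels)]


def encode_ground_truth(
    labels: List[Label], height: int, width: int
) -> Tuple[Tensor, Tensor]:
    """Encode labels into (class confidence, instance regression) tensors."""
    class_conf = zeros(len(CLASSES), height, width)
    inst_reg = zeros(REG_CHANNELS, height, width)
    for label in labels:
        c = CLASSES.index(label.cls)
        for row in range(height):
            for col in range(width):
                x, y = col + 0.5, row + 0.5  # center of this grid cell
                inside = (
                    abs(x - label.cx) <= label.width / 2
                    and abs(y - label.cy) <= label.length / 2
                )
                if not inside:
                    continue
                class_conf[c][row][col] = 1.0
                values = (
                    label.cx - x,          # offset from cell to object center
                    label.cy - y,
                    label.width,
                    label.length,
                    math.cos(label.yaw),
                    math.sin(label.yaw),
                )
                for ch, value in enumerate(values):
                    inst_reg[ch][row][col] = value
    return class_conf, inst_reg


# One 2x2-cell car centered in a 4x4 grid covers the four middle cells.
class_conf, inst_reg = encode_ground_truth(
    [Label(cx=2.0, cy=2.0, width=2.0, length=2.0, yaw=0.0, cls="car")],
    height=4,
    width=4,
)
```

Encoding yaw as a cosine and sine pair is a common way to avoid the discontinuity at the angle wrap-around; the claims only require that orientation data occupy one or more channels.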
for mapping or imaging · CPC title
Learning methods · CPC title
Architecture, e.g. interconnection topology · CPC title
Combination of radar systems with lidar systems · CPC title
Backpropagation, e.g. using gradient descent · CPC title