Deep neural network for detecting obstacle instances using radar sensors in autonomous machine applications

US12050285B2 · US · B2

Patent metadata
Publication number: US-12050285-B2
Application number: US-202217976581-A
Country: US
Kind code: B2
Filing date: Oct 28, 2022
Priority date: Nov 21, 2019
Publication date: Jul 30, 2024
Grant date: Jul 30, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In various examples, a deep neural network(s) (e.g., a convolutional neural network) may be trained to detect moving and stationary obstacles from RADAR data of a three dimensional (3D) space. In some embodiments, ground truth training data for the neural network(s) may be generated from LIDAR data. More specifically, a scene may be observed with RADAR and LIDAR sensors to collect RADAR data and LIDAR data for a particular time slice. The RADAR data may be used for input training data, and the LIDAR data associated with the same or closest time slice as the RADAR data may be annotated with ground truth labels identifying objects to be detected. The LIDAR labels may be propagated to the RADAR data, and LIDAR labels containing less than some threshold number of RADAR detections may be omitted. The (remaining) LIDAR labels may be used to generate ground truth data.
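The label filtering step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: it assumes the LIDAR labels have already been propagated into the same 2D frame as the RADAR detections, simplifies each label to an axis-aligned box, and uses a hypothetical function name (`filter_labels_by_radar_support`) and a hypothetical default threshold (`min_detections=2`).

```python
import numpy as np

def filter_labels_by_radar_support(boxes, radar_xy, min_detections=2):
    """Drop propagated labels that contain too few RADAR detections.

    boxes:    (N, 4) axis-aligned boxes [x_min, y_min, x_max, y_max]
              (assumption: labels simplified to axis-aligned rectangles).
    radar_xy: (M, 2) RADAR detection positions in the same frame.
    Returns the kept boxes and the per-label detection counts.
    """
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    radar_xy = np.asarray(radar_xy, dtype=float).reshape(-1, 2)

    # Broadcast (N, 1) box edges against (1, M) point coordinates.
    x = radar_xy[:, 0][None, :]
    y = radar_xy[:, 1][None, :]
    inside = (
        (x >= boxes[:, 0:1]) & (x <= boxes[:, 2:3])
        & (y >= boxes[:, 1:2]) & (y <= boxes[:, 3:4])
    )

    counts = inside.sum(axis=1)       # RADAR detections per label
    keep = counts >= min_detections   # labels with enough RADAR support
    return boxes[keep], counts

# Hypothetical data: the first label holds two detections, the second only one.
labels = [[0, 0, 2, 2], [10, 10, 12, 12]]
radar = [[1.0, 1.0], [1.5, 0.5], [11.0, 11.0]]
kept, counts = filter_labels_by_radar_support(labels, radar, min_detections=2)
```

With these inputs only the first label survives, since the second contains a single RADAR detection. Labels with little or no RADAR support would otherwise ask the network to detect objects that are barely visible in its input.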

First claim

Opening claim text (preview).

What is claimed is:

1. A processor comprising: one or more processing units to perform one or more operations based at least on one or more outputs of a neural network, the neural network being trained, at least, by: receiving (i) RADAR data collected from a three-dimensional (3D) environment using a RADAR sensor and (ii) 3D data collected from the 3D environment using a different type of sensor than the RADAR sensor; projecting a first instance of the 3D data associated with a time slice to generate a first projection image representative of one or more 3D detections from the different type of sensor within the 3D environment; receiving one or more labels representative of the one or more 3D detections from the different type of sensor, the one or more labels identifying one or more locations in the first projection image corresponding to one or more objects represented by the one or more 3D detections; projecting a second instance of the RADAR data associated with the time slice to generate a RADAR projection image representative of one or more RADAR detections within the 3D environment; and updating one or more parameters of the neural network based at least on the RADAR projection image and ground truth data generated based at least on the one or more labels.

2. The processor of claim 1, the neural network being trained, at least, by: encoding the RADAR projection image and a set of features representative of the one or more RADAR detections and corresponding reflection characteristics into a multi-channel RADAR data tensor; and updating the one or more parameters of the neural network using the multi-channel RADAR data tensor and the ground truth data as training data.

3. The processor of claim 1, wherein the second instance of the RADAR data comprises accumulated, ego-motion-compensated RADAR detections.

4. The processor of claim 1, wherein the ground truth data comprises a class confidence tensor and an instance regression tensor.

5. The processor of claim 1, the ground truth data generated based at least on: propagating the one or more labels to the RADAR projection image to generate one or more propagated labels; determining a number of the one or more RADAR detections corresponding to at least one individual propagated label of the one or more propagated labels; and removing a set of the one or more propagated labels that contain less than a threshold number of the one or more RADAR detections.

6. The processor of claim 1, the ground truth data generated based at least on: removing, from the one or more labels, a set of one or more propagated labels that contain less than a threshold number of the one or more RADAR detections, leaving a remaining set of one or more labels; and generating the ground truth data using the remaining set of one of more labels to generate one or more of location, size, or orientation data for the one or more objects represented by the one or more 3D detections, and encoding the one or more of location, size, or orientation data into one or more corresponding channels of an instance regression tensor.

7. The processor of claim 1, wherein the one or more labels further identify one or more classes of the one or more objects represented by the one or more 3D detections in the first projection image, the ground truth data generated based at least on encoding classification data representative of the one or more classes of the one or more objects into one or more corresponding channels of a class confidence tensor.

8. The processor of claim 1, wherein the one or more labels comprise one or more bounding boxes drawn around one or more stationary vehicles in the 3D environment, the neural network being trained, at least, using the one or more bounding boxes drawn around the one or more stationary vehicles to detect other stationary vehicles from input RADAR data.

9. The processor of claim 1, wherein the first projection image is a projection of a 3D point cloud, wherein the one or more labels comprise a set of closed polylines drawn around each vehicle of one or more vehicles in the first projection image.

10. The processor of claim 1, wherein the different type of sensor comprises a LIDAR sensor, an ultrasonic sensor, or a camera, wherein the 3D data collected from the 3D environment comprises LIDAR data, ultrasound data, 3D stereo camera data, or structure from motion depth estimation data.

11. The processor of claim 1, wherein the processor is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing deep learning operations; a system for performing remote operations; a system for performing streaming; a system for generating or presenting one or more of augmented reality content or virtual reality content; or a system implemented using a robot.

12. A system comprising: one or more processing units to perform one or more operations based at least on one or more outputs of a neural network, the neural network being trained, at least, by: receiving (i) RADAR data collected using a RADAR sensor and (ii) three-dimensional (3D) data collected using a different type of sensor than the RADAR sensor; for at least one frame of one or more frames of the 3D data: generating, using the at least one frame of 3D data, a first projection image representative of one or more 3D detections from the different type of sensor; receiving one or more labels representative of the one or more 3D detections from the different type of sensor, the one or more labels identifying one or more locations in the first projection image corresponding to one or more objects represented by the one or more 3D detections; generating, using a corresponding frame of the RADAR data associated with the at least one frame of 3D data, a RADAR projection image representative of one or more RADAR detections; and updating one or more parameters of the neural network based at least on the RADAR projection image and ground truth data generated based at least on the one or more labels.

13. The system of claim 12, the ground truth data generated based at least on: propagating the one or more labels to the RADAR projection image to generate one or more propagated labels; determining a number of the one or more RADAR detections corresponding to at least one individual propagated label of the one or more propagated labels; removing a set of the one or more propagated labels that contain less than a threshold number of the one or more RADAR detections, leaving a remaining set of one or more labels; and generating the ground truth data using the remaining set of one or more labels.

14. The system of claim 12, the neural network being trained, at least, by: encoding the RADAR projection image and a set of features representative of the one or more RADAR detections and corresponding reflection characteristics into a multi-channel RADAR data tensor; and updating the one or more parameters of the neural network using the multi-channel RADAR data tensor and the ground truth data as training data.

15. The system of claim 12, wherein the ground truth data comprises a class confidence tensor and an instance regression tensor.

16. The system of claim 12, the ground truth data generated using a set of the one or more labels to generate one or more of location, size, or orientation data for the one or more objects represented by the one or more 3D detections, and encoding the one or more of location,
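To make the tensors named in the claims more concrete, here is a rough sketch of how a multi-channel RADAR data tensor, a class confidence tensor, and an instance regression tensor might be laid out. The claims do not fix any of these details, so everything specific is an assumption for illustration: the function names, the grid size and cell size, the choice of Doppler velocity and radar cross section (RCS) as the per-detection features, the axis-aligned boxes, and the four regression channels. Orientation is omitted.

```python
import numpy as np

def encode_radar_tensor(detections, grid_size=(8, 8), cell_m=1.0):
    """Rasterize RADAR detections into a (3, H, W) tensor.

    detections: (M, 4) rows of [x, y, doppler, rcs] (hypothetical feature set).
    Channels: 0 = occupancy, 1 = doppler, 2 = rcs.
    If several detections fall into one cell, the last one wins.
    """
    h, w = grid_size
    tensor = np.zeros((3, h, w), dtype=np.float32)
    det = np.asarray(detections, dtype=np.float32).reshape(-1, 4)
    cols = np.floor(det[:, 0] / cell_m).astype(int)
    rows = np.floor(det[:, 1] / cell_m).astype(int)
    valid = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
    for r, c, d in zip(rows[valid], cols[valid], det[valid]):
        tensor[0, r, c] = 1.0
        tensor[1, r, c] = d[2]
        tensor[2, r, c] = d[3]
    return tensor

def encode_ground_truth(boxes, class_ids, num_classes, grid_size=(8, 8), cell_m=1.0):
    """Build a class confidence tensor and an instance regression tensor.

    boxes:     iterable of [x_min, y_min, x_max, y_max] (axis-aligned assumption).
    class_ids: integer class index per box.
    Returns class_conf (num_classes, H, W) and regression (4, H, W), where the
    regression channels are [dx to box center, dy to box center, width, height].
    """
    h, w = grid_size
    class_conf = np.zeros((num_classes, h, w), dtype=np.float32)
    regression = np.zeros((4, h, w), dtype=np.float32)
    # Cell-center coordinates: ys varies along rows, xs along columns.
    ys, xs = np.meshgrid(
        (np.arange(h) + 0.5) * cell_m,
        (np.arange(w) + 0.5) * cell_m,
        indexing="ij",
    )
    for (x0, y0, x1, y1), cls in zip(boxes, class_ids):
        mask = (xs >= x0) & (xs <= x1) & (ys >= y0) & (ys <= y1)
        class_conf[cls][mask] = 1.0
        regression[0][mask] = (x0 + x1) / 2 - xs[mask]
        regression[1][mask] = (y0 + y1) / 2 - ys[mask]
        regression[2][mask] = x1 - x0
        regression[3][mask] = y1 - y0
    return class_conf, regression

# Hypothetical usage: one detection and one label on a 4 x 4 grid.
radar_tensor = encode_radar_tensor([[1.2, 2.7, 3.0, 10.0]], grid_size=(4, 4))
class_conf, regression = encode_ground_truth(
    [[1.0, 1.0, 3.0, 3.0]], [0], num_classes=2, grid_size=(4, 4)
)
```

In a training loop, `radar_tensor` would be the network input and the two ground truth tensors would be the targets for a classification head and a regression head. That pairing is one plausible reading of claims 2, 4, 6, and 7, not a detail confirmed by this page.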

Assignees

Inventors

Classifications

  • for mapping or imaging

  • Learning methods

  • Architecture, e.g. interconnection topology

  • Combination of radar systems with lidar systems

  • Backpropagation, e.g. using gradient descent

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12050285B2 cover?
In various examples, a deep neural network(s) (e.g., a convolutional neural network) may be trained to detect moving and stationary obstacles from RADAR data of a three dimensional (3D) space. In some embodiments, ground truth training data for the neural network(s) may be generated from LIDAR data. More specifically, a scene may be observed with RADAR and LIDAR sensors to collect RADAR data an…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G01S7/417. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 30, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).