Sensor fusion for autonomous machine applications using machine learning

US11688181B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11688181-B2
Application number  US-202117353231-A
Country             US
Kind code           B2
Filing date         Jun 21, 2021
Priority date       Jun 25, 2020
Publication date    Jun 27, 2023
Grant date          Jun 27, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In various examples, a multi-sensor fusion machine learning model—such as a deep neural network (DNN)—may be deployed to fuse data from a plurality of individual machine learning models. As such, the multi-sensor fusion network may use outputs from a plurality of machine learning models as input to generate a fused output that represents data from fields of view or sensory fields of each of the sensors supplying the machine learning models, while accounting for learned associations between boundary or overlap regions of the various fields of view of the source sensors. In this way, the fused output may be less likely to include duplicate, inaccurate, or noisy data with respect to objects or features in the environment, as the fusion network may be trained to account for multiple instances of a same object appearing in different input representations.
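The data flow the abstract describes can be sketched in a few lines. This is a minimal illustration, not the patented method: the grid values, the names `stack_channels`, `placeholder_fusion`, and `count_objects`, and the two hypothetical cameras are all assumptions, and the per-cell maximum only stands in for the trained fusion DNN. It shows how per-sensor outputs are stacked into one input, and why adding up per-sensor detections double-counts an object in an overlap region.

```python
from typing import List

Grid = List[List[float]]  # top-down grid of per-cell object confidences


def stack_channels(grids: List[Grid]) -> List[List[List[float]]]:
    """Stack per-sensor grids into a (rows, cols, sensors) fusion input."""
    rows, cols = len(grids[0]), len(grids[0][0])
    for g in grids:
        if len(g) != rows or any(len(r) != cols for r in g):
            raise ValueError("all grids must share one shape")
    return [[[g[r][c] for g in grids] for c in range(cols)] for r in range(rows)]


def placeholder_fusion(stacked: List[List[List[float]]]) -> Grid:
    """Stand-in for the trained fusion DNN: per-cell maximum across sensors.

    ASSUMPTION: a real fusion network learns this mapping from data;
    the maximum is used here only to keep the sketch runnable.
    """
    return [[max(cell) for cell in row] for row in stacked]


def count_objects(grid: Grid, threshold: float = 0.5) -> int:
    """Count cells whose confidence reaches the threshold."""
    return sum(1 for row in grid for v in row if v >= threshold)


# Hypothetical outputs of two per-sensor networks on a shared 2x4 grid.
# Column 2 lies in the overlap of both fields of view, so the same object
# appears in both grids.
left_camera = [[0.9, 0.0, 0.8, 0.0],
               [0.0, 0.0, 0.0, 0.0]]
right_camera = [[0.0, 0.0, 0.7, 0.0],
                [0.0, 0.0, 0.0, 0.6]]

# Adding per-sensor counts sees the overlap object twice.
naive_count = count_objects(left_camera) + count_objects(right_camera)

# Fusing first yields one detection per occupied cell.
fused = placeholder_fusion(stack_channels([left_camera, right_camera]))
fused_count = count_objects(fused)
```

With these made-up grids, the per-sensor counts add up to four objects, while the fused grid contains three, because the object in the shared column is represented once.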

First claim

Opening claim text (preview).

What is claimed is:

1. A processor comprising: one or more circuits to: receive first data representative of a plurality of outputs of a plurality of deep neural networks (DNNs), at least one output of the plurality of outputs corresponding to a respective sensor having a respective field of view different from fields of view corresponding to one or more others sensors of a plurality of sensors of an autonomous machine; compute, using a fusion DNN and based at least in part on the first data, second data representative of a fusion of the plurality of outputs; and perform one or more operations using the autonomous machine based at least in part on the second data.

2. The processor of claim 1, wherein the computing the second data is further based at least in part on third data representative of at least one probability distribution function corresponding to at least one point of at least one of the plurality of outputs, the at least one point corresponding to a detected object and the at least one probability distribution function corresponding to one or more potential locations of the detected object.

3. The processor of claim 1, wherein the computing the second data is further based at least in part on third data representative of one or more velocity representations including encoded values corresponding to at least one of a velocity in an x-direction or a velocity in a y-direction.

4. The processor of claim 1, wherein the computing the second data is further based at least in part on third data representative of one or more representations corresponding to at least one of object instances or object appearances determined using the plurality of outputs.

5. The processor of claim 1, wherein each output of the plurality of outputs includes a rasterized image representing one or more objects, and the the plurality of outputs includes a fused rasterized image.

6. The processor of claim 5, wherein the one or more objects include at least one of a vehicle, a pedestrian, a bicyclist, a motorist, a lane marker, a road boundary marker, a freespace boundary, or a wait line.

7. The processor of claim 1, wherein: a first output of the plurality of outputs corresponds to a first field of view; a second output of the plurality of outputs corresponds to a second field of view different from the first field of view; and the fusion of the plurality of outputs corresponds to both the first field of view and the second field of view.

8. The processor of claim 7, wherein the first field of view and the second field of view are at least partially overlapping.

9. The processor of claim 1, wherein the first data is further representative of one or more additional outputs generated using a LiDAR sensor, a RADAR sensor, or an ultrasonic sensor, and the one or more additional outputs are generated using another DNN or without using another DNN.

10. The processor of claim 1, wherein: a first output of the plurality of outputs includes a first representation of an object; a second output of the plurality of outputs includes a second representation of the object; and the fusion of the plurality of outputs includes a fused representation of the object.

11. The processor of claim 1, wherein the processor is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing deep learning operations; a system implemented using an edge device; a system implemented using a robot; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.

12. A system comprising: one or more processing units; and one or more memory units storing instructions that, when executed by the one or more processing units, cause the one or more processing units to execute operations comprising: receiving first data representative of at least a first rasterized image generated using a first deep neural network (DNN) and based at least in part on first sensor data generated using a first sensor, the first rasterized image including at least a first object; receiving second data representative of at least a second rasterized image generated using a second deep neural network (DNN) and based at least in part on second sensor data generated using a second sensor, the second rasterized image including at a least a second object; computing, using a fusion DNN and based at least in part on the first data and the second data, third data representative of a fused rasterized image including both the first object and the second object; and performing one or more operations using an autonomous machine based at least in part on the third data.

13. The system of claim 12, wherein the first sensor and the second sensor include one of an image sensor, a LiDAR sensor, a RADAR sensor, or an ultrasonic sensor.

14. The system of claim 12, wherein the first sensor and the second sensor include at least partially overlapping fields of view, the first rasterized image includes a first representation of a third object, the second rasterized image includes a second representation of the third object, and the fused rasterized image includes a fused representation of the third object.

15. The system of claim 12, wherein the operations further comprise: receiving fourth data representative of at least one probability distribution function corresponding to at least one pixel of at least one of the first rasterized image or the second rasterized image, the at least one pixel corresponding to at least one of the first object or the second object, and the at least one probability distribution function corresponding to one or more potential locations of an detected object, wherein the computing the third data is further based at least in part on the fourth data.

16. The system of claim 12, wherein the operations further comprise: receiving fourth data representative of one or more velocity representations including encoded values corresponding to at least one of a velocity in an x-direction or a velocity in a y-direction, wherein the computing the third data is further based at least in part on the fourth data.

17. The system of claim 12, wherein the system is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing deep learning operations; a system implemented using an edge device; a system implemented using a robot; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.

18. A method comprising: receiving first data representative of at least a first rasterized image generated based at least in part on first sensor data generated using a first sensor of a first type, the first rasterized image including at least a first object; receiving second data representative of at least a second rasterized image generated based at least in part on second sensor data generated using a second sensor of a second type different from the first type, the second rasterized image including at a least a second object; computing, using a fusion deep neural network (DNN) and based at least in part
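
The auxiliary inputs named in claims 2, 3, 15, and 16 (a probability distribution over an object's potential locations, and velocity representations in x and y) could be laid out as extra grid channels alongside the rasterized images. The sketch below is one possible encoding under stated assumptions: the claims do not specify the distribution or how velocities are encoded, so the Gaussian form, the `sigma` parameter, the raw per-cell velocity values, and the function names `gaussian_location_pdf` and `velocity_channels` are all hypothetical.

```python
import math
from typing import List, Tuple

Grid = List[List[float]]


def gaussian_location_pdf(rows: int, cols: int,
                          center: Tuple[int, int],
                          sigma: float = 1.0) -> Grid:
    """Discrete 2D Gaussian over grid cells, normalized to sum to 1.

    ASSUMPTION: a Gaussian is only one plausible choice of distribution
    for an object's potential locations.
    """
    cr, cc = center
    weights = [[math.exp(-((r - cr) ** 2 + (c - cc) ** 2) / (2 * sigma ** 2))
                for c in range(cols)] for r in range(rows)]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def velocity_channels(rows: int, cols: int,
                      detections: List[Tuple[Tuple[int, int], Tuple[float, float]]]
                      ) -> Tuple[Grid, Grid]:
    """Write each detection's (vx, vy) into its cell of two separate grids.

    ASSUMPTION: velocities are stored as raw values at the object's cell;
    cells without a detection stay at 0.0.
    """
    vx = [[0.0] * cols for _ in range(rows)]
    vy = [[0.0] * cols for _ in range(rows)]
    for (r, c), (dx, dy) in detections:
        vx[r][c] = dx
        vy[r][c] = dy
    return vx, vy


# Hypothetical detection at row 1, column 2 moving at (3.5, -1.0).
pdf = gaussian_location_pdf(3, 4, center=(1, 2), sigma=0.8)
vx, vy = velocity_channels(3, 4, [((1, 2), (3.5, -1.0))])

# The three grids share one shape, so they can be stacked with the
# rasterized images as additional input channels for a fusion network.
extra_channels = [pdf, vx, vy]
```

Keeping every auxiliary input on the same grid as the rasterized images means the fusion input stays a single multi-channel array, whatever mix of sensors produced it.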

Assignees

Nvidia Corp

Inventors

Classifications

  • Supervised learning

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning

  • Convolutional networks [CNN, ConvNet]

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

  • Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11688181B2 cover?
In various examples, a multi-sensor fusion machine learning model—such as a deep neural network (DNN)—may be deployed to fuse data from a plurality of individual machine learning models. As such, the multi-sensor fusion network may use outputs from a plurality of machine learning models as input to generate a fused output that represents data from fields of view or sensory fields of each of the…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06V20/588. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 27, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).