Camera-radar data fusion for efficient object detection

US2023260266A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2023260266-A1
Application number: US-202318108749-A
Country: US
Kind code: A1
Filing date: Feb 13, 2023
Priority date: Feb 15, 2022
Publication date: Aug 17, 2023
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method includes obtaining, by a processing device, input data derived from a set of sensors associated with an autonomous vehicle (AV), extracting, by the processing device from the input data, a plurality of sets of features, generating, by the processing device using the plurality of sets of features, a fused bird's-eye view (BEV) grid. The fused BEV grid is generated based on a first BEV grid having a first scale and a second BEV grid having a second scale different from the first scale. The method further includes providing, by the processing device, the fused BEV grid for object detection.
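The core idea in the abstract, fusing two BEV grids that have different scales, can be illustrated with a minimal sketch. The abstract does not say how the grids are resampled or combined, so the nearest-neighbor upsampling and element-wise averaging below are assumptions chosen for simplicity, and the names `upsample_nearest` and `fuse_mean`, the grid sizes, and the cell values are all hypothetical.

```python
from typing import List

Grid = List[List[float]]


def upsample_nearest(grid: Grid, factor: int) -> Grid:
    """Resample a coarse grid to a finer one by repeating each cell
    `factor` times along both axes (nearest-neighbor; an assumption)."""
    out: Grid = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_mean(a: Grid, b: Grid) -> Grid:
    """Fuse two same-shape grids by element-wise averaging (an assumption;
    the abstract does not name a fusion operator)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must have the same shape before fusion")
    return [[(x + y) / 2.0 for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical data: both grids cover the same 4 m x 4 m area.
# The fine grid uses 1 m cells (4 x 4); the coarse grid uses 2 m cells (2 x 2).
fine = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
coarse = [
    [1.0, 0.0],
    [0.0, 0.0],
]

# Bring the coarse grid to the fine resolution, then fuse.
fused = fuse_mean(fine, upsample_nearest(coarse, 2))
```

In this toy case the cell where both grids respond ends up with the highest fused value, while cells covered only by the coarse response receive a weaker one. A real system would more likely fuse learned feature maps than scalar cells.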

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: obtaining, by a processing device, input data derived from a set of sensors associated with an autonomous vehicle (AV); extracting, by the processing device from the input data, a plurality of sets of features; generating, by the processing device using the plurality of sets of features, a fused bird's-eye view (BEV) grid, wherein the fused BEV grid is generated based on a first BEV grid having a first scale and a second BEV grid having a second scale different from the first scale; and providing, by the processing device, the fused BEV grid for object detection.

2. The method of claim 1, wherein: the set of sensors comprises at least one camera and at least one radar; the input data comprises a set of camera data obtained from the at least one camera and a set of radar data obtained from the at least one radar; and the plurality of sets of features comprises a set of camera data features generated from the set of camera data and a set of radar data features generated from the set of radar data.

3. The method of claim 1, wherein generating the fused BEV grid further comprises: associating each set of features of the plurality of sets of features with a respective set of points; generating, using each set of points, a set of BEV grids, the set of BEV grids comprising the first BEV grid and the second BEV grid; extracting, for each BEV grid of the set of BEV grids, a respective set of BEV grid features; generating, for each BEV grid of the set of BEV grids using the respective set of BEV grid features, a resampled BEV grid, wherein the first BEV grid is associated with a first resampled BEV grid and wherein the second BEV grid is associated with a second resampled BEV grid; and fusing each resampled BEV grid to generate the fused BEV grid.

4. The method of claim 3, wherein associating each set of features of the plurality of sets of features with a respective set of points further comprises: transforming a set of camera features of the plurality of sets of features into a set of pixel points; and transforming a set of radar features of the plurality of sets of features into a set of radar points, including transforming from a polar coordinate representation to a Cartesian coordinate representation.

5. The method of claim 1, further comprising performing, by the processing device using the fused BEV grid, the object detection to identify at least one object using a set of neural networks.

6. The method of claim 5, wherein performing object detection further comprises: obtaining a set of predictions generated using the fused BEV grid, wherein the set of predictions comprises a heatmap prediction and an attribute prediction; generating, from the set of predictions, a set of candidate bounding boxes, each candidate bounding box of the set of candidate bounding boxes corresponding to the at least one object; and selecting, from the set of candidate bounding boxes, at least one bounding box corresponding to the at least one object.

7. The method of claim 5, further comprising causing, by the processing device, a driving path of the AV to be modified in view of the at least one object.

8. A system comprising: a memory; and a processing device communicative coupled to the memory, the processing device configured to: obtain input data derived from a set of sensors associated with an autonomous vehicle (AV); extract, from the input data, a plurality of sets of features; generate, using the plurality of sets of features, a fused bird's-eye view (BEV) grid, wherein the fused BEV grid is generated based on a first BEV grid having a first scale and a second BEV grid having a second scale different from the first scale; and provide the fused BEV grid for object detection.

9. The system of claim 8, wherein: the set of sensors comprises at least one camera and at least one radar; the input data comprises a set of camera data obtained from the at least one camera and a set of radar data obtained from the at least one radar; and the plurality of sets of features comprises a set of camera data features generated from the set of camera data and a set of radar data features generated from the set of radar data.

10. The system of claim 8, wherein, to generate the fused BEV grid, the processing device is further configured to: associate each set of features of the plurality of sets of features with a respective set of points; generate, using each set of points, a set of BEV grids, the set of BEV grids comprising the first BEV grid and the second BEV grid; extract, for each BEV grid of the set of BEV grids, a respective set of BEV grid features; generate, for each BEV grid of the set of BEV grids using the respective set of BEV grid features, a resampled BEV grid, wherein the first BEV grid is associated with a first resampled BEV grid and wherein the second BEV grid is associated with a second resampled BEV grid; and fuse each resampled BEV grid to generate the fused BEV grid.

11. The system of claim 10, wherein, to associate each set of features of the plurality of sets of features with a respective set of points, the processing device is further configured to: transform a set of camera features of the plurality of sets of features into a set of pixel points; and transform a set of radar features of the plurality of sets of features into a set of radar points by transforming from a polar coordinate representation to a Cartesian coordinate representation.

12. The system of claim 8, wherein the processing device is further configured to perform, using the fused BEV grid, the object detection to identify at least one object using a set of neural networks.

13. The system of claim 12, wherein, to perform object detection, the processing device is further configured to: obtain a set of predictions generated using the fused BEV grid, wherein the set of predictions comprises a heatmap prediction and an attribute prediction; generate, from the set of predictions, a set of candidate bounding boxes, each candidate bounding box of the set of candidate bounding boxes corresponding to the at least one object; and select, from the set of candidate bounding boxes, at least one bounding box corresponding to the at least one object.

14. The system of claim 12, wherein the processing device is further configured to cause a driving path of the AV to be modified in view of the at least one object.

15. A non-transitory computer-readable storage medium having instructions stored thereon that, when executed by a processing device, cause the processing device to perform operations comprising: obtaining input data derived from a set of sensors associated with an autonomous vehicle (AV), wherein the set of sensors comprises at least one camera and at least one radar, and wherein the input data comprises a set of camera data obtained from the at least one camera and a set of radar data obtained from the at least one radar; extracting, from the input data, a plurality of sets of features, wherein the plurality of sets of features comprises a set of camera data features generated from the set of camera data and a set of radar data features generated from the set of radar data; generating, using the plurality of sets of features, a fused bird's-eye view (BEV) grid, wherein the fused BEV grid is generated based on a first BEV grid having a first scale and a second BEV grid having a second scale different from the first scale; and providing the fused BEV grid for object detection.

16. T
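Claims 4 and 11 recite transforming radar features from a polar to a Cartesian coordinate representation, and claims 3 and 10 recite generating BEV grids from sets of points. The sketch below illustrates those two steps under stated assumptions: azimuth is measured in radians from the vehicle's forward x-axis, the grid is a square centred on the vehicle, and each cell simply counts the radar returns that fall inside it. None of these conventions, nor the names `polar_to_cartesian`, `cell_index`, and `rasterize`, come from the patent text.

```python
import math
from typing import List, Optional, Sequence, Tuple


def polar_to_cartesian(range_m: float, azimuth_rad: float) -> Tuple[float, float]:
    """Convert a (range, azimuth) radar return to (x, y) in metres.
    Assumed convention: azimuth 0 points along +x (vehicle forward)."""
    return range_m * math.cos(azimuth_rad), range_m * math.sin(azimuth_rad)


def cell_index(x: float, y: float, half_extent_m: float,
               cell_m: float) -> Optional[Tuple[int, int]]:
    """Map (x, y) to a (row, col) cell of a square grid spanning
    [-half_extent_m, half_extent_m) on both axes; None if outside."""
    n = int(round(2 * half_extent_m / cell_m))
    col = math.floor((x + half_extent_m) / cell_m)
    row = math.floor((y + half_extent_m) / cell_m)
    if 0 <= row < n and 0 <= col < n:
        return row, col
    return None


def rasterize(returns: Sequence[Tuple[float, float]], half_extent_m: float,
              cell_m: float) -> List[List[int]]:
    """Count radar returns per BEV cell (a stand-in for real features)."""
    n = int(round(2 * half_extent_m / cell_m))
    grid = [[0] * n for _ in range(n)]
    for range_m, azimuth_rad in returns:
        x, y = polar_to_cartesian(range_m, azimuth_rad)
        idx = cell_index(x, y, half_extent_m, cell_m)
        if idx is not None:
            grid[idx[0]][idx[1]] += 1
    return grid


# Hypothetical returns: 5 m ahead, 5 m to the side, and one beyond the grid.
radar_returns = [(5.0, 0.0), (5.0, math.pi / 2), (100.0, 0.0)]

# The same points rasterized at two scales over a 16 m x 16 m area.
fine_grid = rasterize(radar_returns, half_extent_m=8.0, cell_m=1.0)    # 16 x 16
coarse_grid = rasterize(radar_returns, half_extent_m=8.0, cell_m=4.0)  # 4 x 4
```

Rasterizing the same points with two cell sizes yields a first and a second grid at different scales, which is the precondition for the resample-and-fuse steps the claims describe.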

Assignees

Waymo LLC

Inventors

Classifications

  • Image mosaicing, e.g. composing plane images from plane sub-images · CPC title

  • adapted for simultaneous range and velocity measurements · CPC title

  • G01S13/867 · Primary

    Combination of radar systems with cameras · CPC title

  • of land vehicles · CPC title

  • involving the use of neural networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023260266A1 cover?
A method includes obtaining, by a processing device, input data derived from a set of sensors associated with an autonomous vehicle (AV), extracting, by the processing device from the input data, a plurality of sets of features, generating, by the processing device using the plurality of sets of features, a fused bird's-eye view (BEV) grid. The fused BEV grid is generated based on a first BEV g…
Who is the assignee on this patent?
Waymo LLC
What technology area does this patent fall under?
Primary CPC classification G01S13/867. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 17, 2023 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).