Sound source detection and localization for autonomous driving vehicle

US11430466B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11430466-B2
Application number  US-202117248196-A
Country             US
Kind code           B2
Filing date         Jan 13, 2021
Priority date       Jan 13, 2021
Publication date    Aug 30, 2022
Grant date          Aug 30, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for sound source detection and localization utilizing an autonomous driving vehicle (ADV) are disclosed. The method includes receiving audio data from a number of audio sensors mounted on the ADV. The audio data comprises sounds captured by the audio sensors and emitted by one or more sound sources. Based on the received audio data, the method further includes determining a number of sound source information. Each sound source information comprises a confidence score associated with an existence of a specific sound. The method further includes generating a data representation to report whether there exists the specific sound within the driving environment of the ADV. The data representation comprises the determined sound source information. The received audio data and the generated data representation are utilized to subsequently train a machine learning algorithm to recognize the specific sound source during autonomous driving of the ADV in real-time.
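
The data representation described in the abstract can be pictured as a small record type. The following is a minimal sketch, not the patent's implementation: the names `SoundSourceInfo`, `DataRepresentation`, and `sound_present`, the optional fields, and the 0.5 decision threshold are all assumptions made for illustration. Only the confidence score, and its 0-1 range from claim 7, come from the publication.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SoundSourceInfo:
    """One sound source information entry (hypothetical field names)."""
    confidence: float                      # score that the specific sound exists, 0..1
    direction_deg: Optional[float] = None  # assumed unit: degrees relative to a sensor
    distance_m: Optional[float] = None     # assumed unit: meters from a sensor
    approaching: Optional[bool] = None     # approaching/departing status

    def __post_init__(self) -> None:
        # Claim 7 limits the confidence score to the range 0-1.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0, 1]")


@dataclass
class DataRepresentation:
    """Report on whether the specific sound exists in the driving environment."""
    specific_sound: str
    sources: List[SoundSourceInfo] = field(default_factory=list)

    def sound_present(self, threshold: float = 0.5) -> bool:
        # The threshold is an illustrative assumption, not taken from the patent.
        return any(s.confidence >= threshold for s in self.sources)


report = DataRepresentation(
    specific_sound="siren",
    sources=[
        SoundSourceInfo(confidence=0.12),
        SoundSourceInfo(confidence=0.91, direction_deg=45.0, approaching=True),
    ],
)
print(report.sound_present())
```

Pairs of recorded audio and reports like `report` are what the abstract describes as later training input for a machine learning algorithm.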

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for sound source detection and localization utilizing an autonomous driving vehicle (ADV) while the ADV is operating within a driving environment, the method comprising: receiving audio data from a plurality of audio sensors mounted on the ADV, the audio data comprising sounds captured by the plurality of audio sensors and emitted by one or more sound sources; based on the received audio data, determining a plurality of sound source information, each sound source information comprising a confidence score associated with an existence of a specific sound; and generating a data representation to report whether there exists the specific sound within the driving environment of the ADV, the data representation comprising the determined plurality of sound source information; wherein the received audio data and the generated data representation are utilized to subsequently train a machine learning algorithm to recognize a specific sound source during autonomous driving of the ADV in real-time.

2. The method of claim 1, wherein determining the plurality of sound source information comprises performing sound source localization with the plurality of audio sensors to determine at least one of: directions of the sound sources relative to their corresponding audio sensors, distances between the sound sources and their corresponding audio sensors, relative positions of the captured sounds, absolute positions of the captured sounds, approaching/departing statuses of the captured sounds, or intensities of the captured sounds associated with current timestamps.

3. The method of claim 2, wherein each sound source information further comprises at least one of: a direction of a sound source relative to a corresponding audio sensor, a distance between the sound source and the corresponding audio sensor, a relative position of a captured sound, an absolute position of a captured sound, an approaching/departing status of a captured sound, or an intensity of a captured sound associated with a current timestamp.

4. The method of claim 3, wherein the data representation is a grid including a plurality of regions that collectively cover the driving environment of the ADV, each region corresponding to an audio sensor from the plurality of audio sensors and reporting a vector of results indicating whether the specific sound exists in the region, the vector of results including a region identifier (ID) and one sound source information.

5. The method of claim 4, wherein each region is configured to partially cover a particular size within the driving environment.

6. The method of claim 1, wherein the sound sources are emergency vehicles, and the specific sound is a siren sound.

7. The method of claim 1, wherein the confidence score is within a range of 0-1 value.

8. The method of claim 4, wherein a center of the grid represents a position of the ADV.

9. A non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations, the operations comprising: receiving audio data from a plurality of audio sensors mounted on an autonomous driving vehicle (ADV), the audio data comprising sounds captured by the plurality of audio sensors and emitted by one or more sound sources; based on the received audio data, determining a plurality of sound source information, each sound source information comprising a confidence score associated with an existence of a specific sound; and generating a data representation to report whether there exists the specific sound within a driving environment of the ADV, the data representation comprising the determined plurality of sound source information; wherein the received audio data and the generated data representation are utilized to subsequently train a machine learning algorithm to recognize a specific sound source during autonomous driving of the ADV in real-time.

10. The non-transitory machine-readable medium of claim 9, wherein determining the plurality of sound source information comprises performing sound source localization with the plurality of audio sensors to determine at least one of: directions of the sound sources relative to their corresponding audio sensors, distances between the sound sources and their corresponding audio sensors, relative positions of the captured sounds, absolute positions of the captured sounds, approaching/departing statuses of the captured sounds, or intensities of the captured sounds associated with current timestamps.

11. The non-transitory machine-readable medium of claim 10, wherein each sound source information further comprises at least one of: a direction of a sound source relative to a corresponding audio sensor, a distance between the sound source and the corresponding audio sensor, a relative position of a captured sound, an absolute position of a captured sound, an approaching/departing status of a captured sound, or an intensity of a captured sound associated with a current timestamp.

12. The non-transitory machine-readable medium of claim 11, wherein the data representation is a grid including a plurality of regions that collectively cover the driving environment of the ADV, each region corresponding to an audio sensor from the plurality of audio sensors and reporting a vector of results indicating whether the specific sound exists in the region, the vector of results including a region identifier (ID) and one sound source information.

13. The non-transitory machine-readable medium of claim 12, wherein each region is configured to partially cover a particular size within the driving environment.

14. The non-transitory machine-readable medium of claim 9, wherein the sound sources are emergency vehicles, and the specific sound is a siren sound.

15. The non-transitory machine-readable medium of claim 9, wherein the confidence score is within a range of 0-1 value.

16. The non-transitory machine-readable medium of claim 12, wherein a center of the grid represents a position of the ADV.

17. A system for sound source detection and localization, comprising: a processor; and a memory coupled to the processor to store instructions, which when executed by the processor, cause the processor to perform operations, the operations including receiving audio data from a plurality of audio sensors mounted on an autonomous driving vehicle (ADV), the audio data comprising sounds captured by the plurality of audio sensors and emitted by one or more sound sources; based on the received audio data, determining a plurality of sound source information, each sound source information comprising a confidence score associated with an existence of a specific sound; and generating a data representation to report whether there exists the specific sound within a driving environment of the ADV, the data representation comprising the determined plurality of sound source information; wherein the received audio data and the generated data representation are utilized to subsequently train a machine learning algorithm to recognize a specific sound source during autonomous driving of the ADV in real-time.

18. The system of claim 17, wherein determining the plurality of sound source information comprises performing sound source localization with the plurality of audio sensors to determine at least one of: directions of the sound sources relative to their corresponding audio sensors, distances between the sound sources and their corresponding audio sensors, relative positions of the captured sound
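
Claims 4 and 8 describe a grid of regions whose center represents the ADV's position, with each region reporting a region ID together with one sound source information. The sketch below shows one way such a lookup could work. The square layout, the 3 x 3 size, the 20 m cell edge, the row-major ID numbering, and the function name `region_id` are all assumptions chosen for illustration; the claims do not specify them.

```python
from typing import Optional, Tuple


def region_id(x_m: float, y_m: float,
              cells_per_side: int = 3,
              cell_size_m: float = 20.0) -> Optional[int]:
    """Map a position relative to the ADV (meters) to a region ID.

    The ADV sits at (0, 0), the center of the grid. IDs are assigned in
    row-major order starting at 0. Returns None outside the grid.
    """
    half = cells_per_side * cell_size_m / 2.0
    if not (-half <= x_m < half and -half <= y_m < half):
        return None
    col = int((x_m + half) // cell_size_m)
    row = int((y_m + half) // cell_size_m)
    return row * cells_per_side + col


def result_vector(x_m: float, y_m: float,
                  confidence: float) -> Optional[Tuple[int, float]]:
    """Build a (region ID, confidence) pair, a simplified vector of results."""
    rid = region_id(x_m, y_m)
    return None if rid is None else (rid, confidence)


print(region_id(0.0, 0.0))               # the ADV's own cell, the grid center
print(result_vector(25.0, -25.0, 0.91))  # a source ahead-right in this layout
```

With an odd number of cells per side, the ADV falls in the middle cell, which matches the claim that the grid center represents the ADV's position.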

Assignees

Baidu USA LLC

Inventors

Classifications

  • Position of source determined by co-ordinating a plurality of position lines defined by path-difference measurements (G01S5/28 takes precedence)

  • Position of source determined by a plurality of spaced direction-finders

  • G01H11/06 (Primary): by electric means

  • Machine learning

  • using ultrasonic, sonic or infrasonic waves
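
The first classification above refers to locating a source from path-difference measurements. This page does not describe which localization algorithm the patent uses, so the following is only a textbook illustration of that idea, assuming two microphones, a distant (far-field) source, and a speed of sound of 343 m/s. The function name `bearing_from_tdoa` and all parameter names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees Celsius


def bearing_from_tdoa(delay_s: float, mic_spacing_m: float,
                      speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Estimate the arrival angle in degrees from a time difference of arrival.

    Far-field model: path difference = speed * delay = spacing * sin(angle),
    where the angle is measured from the broadside of the microphone pair.
    """
    ratio = speed_m_s * delay_s / mic_spacing_m
    # Clamp to the valid asin domain to absorb measurement and rounding error.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


# A delay equal to half the maximum possible delay gives a 30 degree bearing.
spacing = 0.5
delay = 0.5 * spacing / SPEED_OF_SOUND_M_S
print(round(bearing_from_tdoa(delay, spacing), 3))
```

A single microphone pair yields only one angle with a front-back ambiguity; combining several pairs is what produces the multiple position lines that the classification title mentions.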

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11430466B2 cover?
Systems and methods for sound source detection and localization utilizing an autonomous driving vehicle (ADV) are disclosed. The method includes receiving audio data from a number of audio sensors mounted on the ADV. The audio data comprises sounds captured by the audio sensors and emitted by one or more sound sources. Based on the received audio data, the method further includes determining a …
Who is the assignee on this patent?
Baidu USA LLC
What technology area does this patent fall under?
Primary CPC classification G01H11/06. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 30, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).