Detection and classification of siren signals and localization of siren signal sources

US11804239B2 · US · B2

Patent metadata
Publication number: US-11804239-B2
Application number: US-202217713008-A
Country: US
Kind code: B2
Filing date: Apr 4, 2022
Priority date: Jan 24, 2020
Publication date: Oct 31, 2023
Grant date: Oct 31, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In an embodiment, a method comprises: capturing, by one or more microphone arrays of a vehicle, sound signals in an environment; extracting frequency spectrum features from the sound signals; predicting, using an acoustic scene classifier and the frequency spectrum features, one or more siren signal classifications; converting the one or more siren signal classifications into one or more siren signal event detections; computing time delay of arrival estimates for the one or more detected siren signals; estimating one or more bearing angles to one or more sources of the one or more detected siren signals using the time delay of arrival estimates and a known geometry of the microphone array; and tracking, using a Bayesian filter, the one or more bearing angles. If a siren is detected, actions are performed by the vehicle depending on the location of the emergency vehicle and whether the emergency vehicle is active or inactive.
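The delay-to-bearing step in the abstract can be illustrated with a short sketch. It is not the patented implementation: it uses plain time-domain cross-correlation (the simplest member of the generalized cross-correlation family) on a single microphone pair, and a far-field plane-wave model for the bearing. The sample rate, microphone spacing, speed of sound, the synthetic rising tone, and the function names `estimate_delay_samples` and `bearing_from_tdoa` are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C


def estimate_delay_samples(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means `other` is a delayed copy of `ref`.
    """
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(other):
                score += r * other[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_from_tdoa(tdoa_s, mic_spacing_m, c=SPEED_OF_SOUND_M_S):
    """Far-field bearing in degrees from broadside for one microphone pair.

    Positive angles point toward the reference microphone's side.
    """
    # Clamp so that noise cannot push the argument outside asin's domain.
    ratio = max(-1.0, min(1.0, c * tdoa_s / mic_spacing_m))
    return math.degrees(math.asin(ratio))


fs = 16_000        # assumed sample rate in Hz
spacing = 0.5      # assumed microphone spacing in metres
true_delay = 10    # samples by which the second microphone lags


def chirp(n, count=2048, f0=500.0, f1=1800.0):
    """Sample n of a rising tone standing in for a siren; zero outside the burst."""
    if not 0 <= n < count:
        return 0.0
    t = n / fs
    sweep = (f1 - f0) / (count / fs)
    return math.sin(2 * math.pi * (f0 * t + 0.5 * sweep * t * t))


mic_a = [chirp(n) for n in range(2048)]
mic_b = [chirp(n - true_delay) for n in range(2048)]

# The array geometry bounds the physically possible delay.
max_lag = math.ceil(spacing / SPEED_OF_SOUND_M_S * fs)
lag = estimate_delay_samples(mic_a, mic_b, max_lag)
angle = bearing_from_tdoa(lag / fs, spacing)
print(lag, round(angle, 1))
```

A single pair only resolves the angle relative to its own axis; combining several pairs of known geometry, as the abstract describes, is what removes that ambiguity.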

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: capturing, by one or more microphone arrays of an autonomous vehicle (AV), sound signals in an environment; extracting, using one or more processors, frequency spectrum features from the sound signals; predicting, using an acoustic scene classifier and the frequency spectrum features, one or more siren signal classifications, wherein the acoustic scene classifier predicts labels that indicate a presence of one or more different siren signal types from a plurality of different types of siren signals used by emergency vehicles; converting, using the one or more processors, the one or more siren signal classifications into one or more siren signal event detections; computing time delay of arrival estimates for the one or more siren signal event detections; estimating, using the one or more processors, a plurality of bearing angles to one or more sources of the one or more siren signal event detections using the time delay of arrival estimates and a known geometry of the one or more microphone arrays; and tracking, using a Bayesian filter, the plurality of bearing angles; wherein the plurality of bearing angles are used to triangulate the location of the one or more sources of the one or more siren signal event detections; the method further comprising: fusing, using the one or more processors, the location of the one or more sources of the one or more siren signal event detections with sensor data from a perception system of the AV; and controlling, using the one or more processors, the AV based on the location of the one or more sources of the one or more siren signals and the sensor data from the perception system.

2. The method of claim 1, wherein the time delay of arrival estimates are computed using a maximum likelihood criterion obtained by implementing a generalized cross correlation method.

3. The method of claim 1, further comprising estimating one or more ranges of the one or more siren signal sources by applying triangularization to the one or more bearing angles.

4. The method of claim 1, wherein transforming sound signals into frequency spectrum features includes generating one of a spectrogram, mel-spectrogram or mel-frequency cepstral coefficients (MFCC).

5. The method of claim 1, wherein the acoustic scene classifier is implemented at least in part using a convolutional neural network (CNN).

6. The method of claim 1, wherein the Bayesian filter is one of a Kalman filter, extended Kalman filter (EKF), unscented Kalman filter or particle filter.

7. The method of claim 1, wherein predicting, using an acoustic scene classifier and the frequency spectrum features, one or more siren signal classifications, further comprises continuously predicting labels indicating the presence or absence of the one or more types of siren signals.

8. The method of claim 1, wherein the one or more bearing angles are estimated by using a spatio-temporal difference of the one or more siren signal event detections at each microphone pair in the one or more microphone arrays.

9. The method of claim 1, wherein the different types of siren signals include one or more of wailing, yelp, hi-lo, rumbler, chirp, pulsar, localizer, and mechanical wail siren signals.

10. The method of claim 1, wherein the one or more sources of the one or more siren signal event detections is associated with an emergency vehicle and the method further comprises causing, using the one or more processors and the location of the one or more sources of the one or more siren signal event detections, to operate the AV in accordance with one or more rules associated with emergency vehicles.

11. The method of claim 10, further comprising determining whether the emergency vehicle is active or inactive, and if active, operating the AV in accordance with a first set of rules associated with active emergency vehicles, or if inactive, operating the AV in accordance with a second set of rules associated with an inactive emergency vehicles.

12. The method of claim 11, wherein if the emergency vehicle is active and nearby and the AV has crossed a stop line at an intersection, causing the AV to initiate a comfort stop, or if the emergency vehicle is active and far away and the AV has crossed the stop line, causing the AV to traverse across the intersection and then initiate a comfort stop.

13. The method of claim 11, wherein the emergency vehicle is active and the AV is within a left lane, and a right lane is open and available, causing the AV to merge into the right lane.

14. The method of claim 11, wherein the emergency vehicle is active and the AV is within a rightmost lane, causing the AV to bias to a right-hand direction, if possible but not cross a right-hand lane boundary, and then initiate a comfort stop and remain stopped until the following conditions are met: 1) The emergency vehicle is traveling away from the AV with a range rate that is greater than a specified speed for greater than a specified time; and 2) The emergency vehicle range is greater than a specified distance, or the emergency vehicle is no longer detected for greater than a specified time, and if the conditions are met causing the AV to resume its route towards a goal point.

15. The method of claim 11, wherein the emergency vehicle is inactive, the method further comprising: determining whether the AV is on a same road as the emergency vehicle; determining whether the AV is on a same side of the road or an opposite side of the road as the emergency vehicle; if on the same road as the emergency vehicle, determining whether the AV is in in front or of behind the emergency vehicle; and causing the AV to travel a trajectory to avoid collision with the emergency vehicle based on whether the AV is on the same road or a different road, the same side of road or the opposite side, and if on the same side of the road whether the AV is in front of or behind the emergency vehicle.

16. The method of claim 11, wherein the emergency vehicle is inactive, further comprising: determining that the AV is on a road with single lane, and the emergency vehicle is located fully on a shoulder area and fully or partly within a lane in which the AV is traveling; and causing the AV to initiate a comfort stop; and causing the AV to remain stopped until the emergency vehicle is located fully on the shoulder area, and then to proceed with a maximum speed limit or switch to active and travel away from the emergency vehicle.

17. The method of claim 1, wherein the frequency spectrum features comprises an oscillation frequency of the sound signals, and the acoustic scene classifier predicts labels that indicate the presence of one or more of a plurality of different types of siren signals based, at least in part, on the oscillation frequency.

18. An autonomous vehicle (AV) comprising: one or more microphone arrays; one or more processors; and memory storing instructions that when executed by the one or more processors, cause the one or more processors to perform operations comprising: capturing, by the one or more microphone arrays of the AV, sound signals in an environment; extracting, using the one or more processors, frequency spectrum features from the sound signals; predicting, using an acoustic scene classifier and the frequency spectrum features, one or more siren signal classifications, wherein the acoustic scene classifier predicts labels that indicate the presence of one or more different siren signal types from a plurality of different types of siren signals used by emergen
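Claims 1 and 3 refer to triangulating a source location and range from bearing angles. One common way to do that, shown here only as a hedged sketch and not as the method the patent specifies, is to intersect the bearing lines from two arrays at known positions. The vehicle-frame coordinates, the array positions, the convention that bearings are measured counter-clockwise from the +x axis, and the function name `triangulate` are assumptions.

```python
import math


def triangulate(p1, bearing1_deg, p2, bearing2_deg):
    """Intersect two bearing rays; return (x, y), or None if no usable fix exists.

    p1 and p2 are (x, y) array positions in metres (assumed vehicle frame).
    Bearings are in degrees, counter-clockwise from the +x axis (assumed).
    """
    d1 = (math.cos(math.radians(bearing1_deg)), math.sin(math.radians(bearing1_deg)))
    d2 = (math.cos(math.radians(bearing2_deg)), math.sin(math.radians(bearing2_deg)))

    # Solve p1 + t1*d1 == p2 + t2*d2 for t1 and t2 with Cramer's rule.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        return None  # near-parallel bearings carry no range information
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (dx * d2[1] - dy * d2[0]) / denom
    t2 = (dx * d1[1] - dy * d1[0]) / denom
    if t1 < 0 or t2 < 0:
        return None  # the lines cross behind one of the arrays
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])


# Hypothetical layout: two arrays 2 m apart, source at (10, 20) metres.
array_a, array_b = (-1.0, 0.0), (1.0, 0.0)
source = (10.0, 20.0)
bearing_a = math.degrees(math.atan2(source[1] - array_a[1], source[0] - array_a[0]))
bearing_b = math.degrees(math.atan2(source[1] - array_b[1], source[0] - array_b[0]))

fix = triangulate(array_a, bearing_a, array_b, bearing_b)
rng = math.hypot(fix[0], fix[1])  # range from the vehicle-frame origin
print(fix, rng)
```

With a short baseline, small bearing errors translate into large range errors for distant sources, which is one reason the claims pair triangulation with filtered bearing tracks and fusion with other perception data.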

Assignees

Motional AD LLC

Inventors

Classifications

  • Acoustic transducers and sound field adaptation in vehicles · CPC title

  • specially adapted for specific applications · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

  • microphones · CPC title

  • G10L25/51 (Primary): for comparison or discrimination · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11804239B2 cover?
It covers a method for detecting, classifying, and localizing emergency-vehicle sirens using a vehicle's microphone arrays. Frequency spectrum features extracted from the captured sound feed an acoustic scene classifier; the resulting siren classifications are converted into event detections; time delay of arrival estimates and the known array geometry give bearing angles to each source; and a Bayesian filter tracks those angles. If a siren is detected, the vehicle acts according to the emergency vehicle's location and whether it is active or inactive.
Who is the assignee on this patent?
Motional AD LLC
What technology area does this patent fall under?
Primary CPC classification G10L25/51. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 31, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).