Adaptive sound event classification

US11410677B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11410677-B2
Application number: US-202017102724-A
Country: US
Kind code: B2
Filing date: Nov 24, 2020
Priority date: Nov 24, 2020
Publication date: Aug 9, 2022
Grant date: Aug 9, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A device includes one or more processors configured to provide audio data samples to a sound event classification model. The one or more processors are also configured to determine, based on an output of the sound event classification model responsive to the audio data samples, whether a sound class of the audio data samples was recognized by the sound event classification model. The one or more processors are further configured to, based on a determination that the sound class was not recognized, determine whether the sound event classification model corresponds to an audio scene associated with the audio data samples. The one or more processors are also configured to, based on a determination that the sound event classification model corresponds to the audio scene associated with the audio data samples, store model update data based on the audio data samples.
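The decision flow described in the abstract can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the patented implementation: the `classify` callable, the `RECOGNITION_THRESHOLD` value, the `scene_matches` flag, and the `UpdateStore` class are all hypothetical names introduced here. The abstract does not say how recognition or scene correspondence is decided, so the sketch assumes a confidence score and a precomputed boolean.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple

# Assumed cutoff; the abstract does not specify how "recognized" is decided.
RECOGNITION_THRESHOLD = 0.5


@dataclass
class UpdateStore:
    """Hypothetical container for samples kept as model update data."""
    samples: List[Sequence[float]] = field(default_factory=list)


def process_audio(
    samples: Sequence[float],
    classify: Callable[[Sequence[float]], Tuple[str, float]],
    scene_matches: bool,
    store: UpdateStore,
) -> Optional[str]:
    """Return the recognized sound class, or None if nothing was recognized."""
    # Step 1: provide the audio data samples to the model.
    label, confidence = classify(samples)

    # Step 2: decide whether a sound class was recognized.
    if confidence >= RECOGNITION_THRESHOLD:
        return label

    # Step 3: not recognized, so check whether the model corresponds to
    # the audio scene. Step 4: if it does, keep the samples as update data.
    if scene_matches:
        store.samples.append(samples)
    return None
```

In this sketch, unrecognized samples are stored only when the active model is the right one for the scene, so sounds from an unrelated scene do not end up in that model's update data.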

First claim

Opening claim text (preview).

What is claimed is:

1. A device comprising: one or more processors configured to: provide audio data samples to a sound event classification model; determine, based on an output of the sound event classification model responsive to the audio data samples, whether a sound class of the audio data samples was recognized by the sound event classification model; based on a determination that the sound class was not recognized, determine whether the sound event classification model corresponds to an audio scene associated with the audio data samples; and based on a determination that the sound event classification model corresponds to the audio scene associated with the audio data samples, store model update data based on the audio data samples.

2. The device of claim 1, further comprising a microphone coupled to the one or more processors and configured to capture audio data corresponding to the audio data samples.

3. The device of claim 1, further comprising a memory coupled to the one or more processors and configured to store a plurality of sound event classification models, wherein the one or more processors are configured select the sound event classification model from among the plurality of sound event classification models.

4. The device of claim 3, further comprising one or more sensors configured to generate sensor data associated with the audio data samples, wherein the one or more processors are configured to select the sound event classification model based on the sensor data.

5. The device of claim 4, wherein the one or more sensors include a camera and a position sensor.

6. The device of claim 3, further comprising one or more input devices configured to receive input identifying the audio scene, wherein the one or more processors are configured to select the sound event classification model based on the audio scene.

7. The device of claim 3, wherein the one or more processors are configured to select the sound event classification model based on when the audio data samples are received.

8. The device of claim 3, wherein the memory further stores settings data indicating one or more device settings, and wherein the one or more processors are configured to select the sound event classification model based on the settings data.

9. The device of claim 1, wherein the one or more processors are further configured to generate, based on a determination that the sound class was recognized, output indicating the sound class associated with the audio data samples.

10. The device of claim 1, wherein the one or more processors are further configured to, based on a determination that the sound event classification model does not correspond to the audio scene associated with the audio data samples, store audio data corresponding to the audio data samples as training data for a new sound event classification model.

11. The device of claim 1, wherein the sound event classification model is further configured to generate a confidence metric associated with the output, and wherein the one or more processors are configured to determine whether the sound class was recognized by the sound event classification model based on the confidence metric.

12. The device of claim 1, wherein the one or more processors are further configured update the sound event classification model based on the model update data.

13. The device of claim 1, further comprising one or more input devices configured to receive input identifying the audio scene, wherein the one or more processors are configured to determine whether the sound event classification model corresponds to the audio scene based on the input.

14. The device of claim 1, further comprising one or more sensors configured to generate sensor data associated with the audio data samples, wherein the one or more processors are configured to determine whether the sound event classification model corresponds to the audio scene based on the sensor data.

15. The device of claim 14, wherein the one or more sensors include a camera and a position sensor.

16. The device of claim 14, wherein the one or more processors are further configured to determine whether the sound event classification model corresponds to the audio scene based on a timestamp associated with the audio data samples.

17. The device of claim 1, wherein the sound event classification model is trained to recognize a particular sound class and the model update data includes drift data representing a variation in characteristics of a sound within the particular sound class that the sound event classification model is not trained to recognize as corresponding to the particular sound class.

18. The device of claim 1, wherein the one or more processors are integrated within a mobile computing device, a vehicle, a wearable device, an augmented reality headset, a mixed reality headset, or a virtual reality headset.

19. The device of claim 1, wherein the one or more processors are included in an integrated circuit.

20. A method comprising: providing, by one or more processors, audio data samples as input to a sound event classification model; determining, by the one or more processors based on an output of the sound event classification model responsive to the audio data samples, whether a sound class of the audio data samples was recognized by the sound event classification model; based on a determination that the sound class was not recognized, determining, by the one or more processors, whether the sound event classification model corresponds to an audio scene associated with the audio data samples; and based on a determination that the sound event classification model corresponds to the audio scene associated with the audio data samples, storing, by the one or more processors, model update data based on the audio data samples.

21. The method of claim 20, further comprising selecting the sound event classification model from among a plurality of sound event classification models stored at a memory coupled to the one or more processors.

22. The method of claim 21, wherein the sound event classification model is selected based on user input, settings data, location data, image data, video data, a timestamp associated with the audio data samples, or a combination thereof.

23. The method of claim 20, wherein a determination of whether the sound event classification model corresponds to the audio scene is based on a confidence metric generated by the sound event classification model, user input, settings data, location data, image data, video data, a timestamp associated with the audio data samples, or a combination thereof.

24. The method of claim 20, further comprising, after storing the model update data: determining whether a threshold quantity of model update data has been accumulated, and based on a determination that the threshold quantity of model update data has been accumulated, initiating an automatic update of the sound event classification model using accumulated model update data.

25. The method of claim 24, wherein before the automatic update, the sound event classification model was trained to recognize multiple variants of a particular sound class, and wherein the automatic update modifies the sound event classification model to enable the sound event classification model to recognize an additional variant of the particular sound class as corresponding to the particular sound class.
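The threshold-triggered update in claim 24 could look something like the sketch below. It is an illustration only: the `UpdateAccumulator` class, its `threshold` parameter, and the `update_fn` callback are hypothetical names, and the claims do not say how the "threshold quantity" is measured. Here it is assumed to be a simple count of stored sample sets.

```python
from typing import Callable, List, Sequence


class UpdateAccumulator:
    """Collects model update data and triggers an update at a threshold."""

    def __init__(
        self,
        threshold: int,
        update_fn: Callable[[List[Sequence[float]]], None],
    ) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold    # assumed: a count of stored items
        self.update_fn = update_fn    # hypothetical retraining hook
        self.pending: List[Sequence[float]] = []

    def store(self, samples: Sequence[float]) -> bool:
        """Store update data; return True if an automatic update was started."""
        self.pending.append(samples)
        if len(self.pending) < self.threshold:
            return False
        # Threshold reached: hand the accumulated batch to the update hook
        # and start collecting a fresh batch.
        batch, self.pending = self.pending, []
        self.update_fn(batch)
        return True
```

A caller would pass whatever fine-tuning routine it uses as `update_fn`. Batching this way avoids retraining on every unrecognized sound, which is one plausible reason for the threshold in the claim.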

Assignees

  • Qualcomm Inc

Inventors

Classifications

  • Machine learning · CPC title

  • Mouthpieces; {Microphones;} Attachments therefor · CPC title

  • G10L25/51 (Primary)

    for comparison or discrimination · CPC title

  • Inference or reasoning models · CPC title

  • using neural networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11410677B2 cover?
A device includes one or more processors configured to provide audio data samples to a sound event classification model. The one or more processors are also configured to determine, based on an output of the sound event classification model responsive to the audio data samples, whether a sound class of the audio data samples was recognized by the sound event classification model. The one o…
Who is the assignee on this patent?
Qualcomm Inc
What technology area does this patent fall under?
Primary CPC classification G10L25/51. Mapped technology areas include Physics.
When was this patent published?
Published Aug 9, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).