Multilingual wakeword detection

US11996097B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11996097-B2
Application number  US-202117359937-A
Country             US
Kind code           B2
Filing date         Jun 28, 2021
Priority date       May 6, 2019
Publication date    May 28, 2024
Grant date          May 28, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method performs multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.
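The two-stage flow in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: `FirstStageResult`, `run_cascade`, `toy_first_stage`, and the language codes are hypothetical names, and the toy detector is a placeholder for a trained model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence

Audio = Sequence[float]


@dataclass
class FirstStageResult:
    """Output of the first (lightweight) detector. Hypothetical structure."""
    detected: bool
    language: Optional[str]  # e.g. "en" or "es"; None when nothing was detected


def run_cascade(
    audio: Audio,
    first_stage: Callable[[Audio], FirstStageResult],
    second_stage_by_language: Dict[str, Callable[[Audio], bool]],
    send_for_speech_processing: Callable[[Audio, str], None],
) -> bool:
    """Return True only when both stages agree that the wakeword is present."""
    first = first_stage(audio)
    if not first.detected or first.language is None:
        return False  # stage 1 rejected the audio; stage 2 never runs

    # The language reported by stage 1 selects which stage-2 detector to use.
    confirm = second_stage_by_language.get(first.language)
    if confirm is None or not confirm(audio):
        return False

    # Only confirmed audio is forwarded for further speech processing.
    send_for_speech_processing(audio, first.language)
    return True


def toy_first_stage(audio: Audio) -> FirstStageResult:
    """Placeholder detector (assumption): fires on loud audio and guesses a
    language from the sign of the first sample. Not a real model."""
    peak = max((abs(s) for s in audio), default=0.0)
    if peak < 0.5:
        return FirstStageResult(False, None)
    return FirstStageResult(True, "en" if audio[0] >= 0 else "es")
```

Passing the detectors in as callables keeps the sketch neutral about where each stage runs; the abstract says only that the first component may execute on a digital-signal processor.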

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: receiving audio data representing speech; determining, using a first wakeword-detection component, that the speech includes a wakeword; determining, using the first wakeword-detection component, language data indicating that the speech corresponds to a first spoken language; determining, using a second wakeword-detection component and the language data, that the speech includes the wakeword; and based at least in part on the second wakeword-detection component determining that the speech includes the wakeword, causing speech processing to be performed using the audio data.

2. The computer-implemented method of claim 1, wherein: the first wakeword-detection component and the second wakeword-detection component are located on a first device; and causing speech processing to be performed using the audio data comprises sending the audio data to a speech processing component on the first device.

3. The computer-implemented method of claim 1, wherein: the first wakeword-detection component and the second wakeword-detection component are located on a first device; and causing speech processing to be performed using the audio data comprises sending the audio data to a speech processing component on a second device.

4. The computer-implemented method of claim 1, further comprising: determining audio feature data corresponding to the audio data, the audio feature data representing at least one audio feature of the audio data; and processing, using a classifier, the audio feature data to determine that the speech includes the wakeword.

5. The computer-implemented method of claim 4, wherein the first wakeword-detection component comprises the classifier.

6. The computer-implemented method of claim 1, further comprising: determining, using a voice-activity detection (VAD) component, that the audio data represents the speech; based at least in part on determining that the audio data represents the speech, activating the first wakeword-detection component; and based at least in part on determining, using the first wakeword-detection component, that the speech includes the wakeword, activating the second wakeword-detection component.

7. The computer-implemented method of claim 1, further comprising: based at least in part on the language data, selecting first data to be used by the second wakeword-detection component.

8. The computer-implemented method of claim 1, further comprising: sending an indication of the first spoken language to a speech processing component.

9. The computer-implemented method of claim 1, wherein determining the language data comprises: determining audio feature data corresponding to the audio data, the audio feature data representing at least one audio feature of the audio data; and processing, using a classifier, the audio feature data to determine the language data.

10. The computer-implemented method of claim 1, wherein determining, using the second wakeword-detection component, that the speech includes the wakeword comprises: determining, using an acoustic model and the audio data, acoustic data corresponding to acoustic units representing the first spoken language; processing the acoustic data using a hidden Markov model corresponding to the first spoken language to determine a wakeword hypothesis; and processing the wakeword hypothesis with a classifier corresponding to the first spoken language to determine that the speech includes the wakeword.

11. A system comprising: at least one processor; and at least one memory comprising instructions that, when executed by the at least one processor, cause the system to: receive audio data representing speech; determine, using a first wakeword-detection component, that the speech includes a wakeword; determine, using the first wakeword-detection component, language data indicating that the speech corresponds to a first spoken language; determine, using a second wakeword-detection component and the language data, that the speech includes the wakeword; and based at least in part on the second wakeword-detection component determining that the speech includes the wakeword, cause speech processing to be performed using the audio data.

12. The system of claim 11, wherein: the first wakeword-detection component and the second wakeword-detection component are located on a first device; and the instructions that cause the system to cause speech processing to be performed using the audio data comprise instructions that, when executed by the at least one processor, cause the system to send the audio data to a speech processing component on the first device.

13. The system of claim 11, wherein: the first wakeword-detection component and the second wakeword-detection component are located on a first device; and the instructions that cause the system to cause speech processing to be performed using the audio data comprise instructions that, when executed by the at least one processor, cause the system to send the audio data to a speech processing component on a second device.

14. The system of claim 11, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: determine audio feature data corresponding to the audio data, the audio feature data representing at least one audio feature of the audio data; and process, using a classifier, the audio feature data to determine that the speech includes the wakeword.

15. The system of claim 14, wherein the first wakeword-detection component comprises the classifier.

16. The system of claim 11, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: determine, using a voice-activity detection (VAD) component, that the audio data represents the speech; based at least in part on determining that the audio data represents the speech, activate the first wakeword-detection component; and based at least in part on determining, using the first wakeword-detection component, that the speech includes the wakeword, activate the second wakeword-detection component.

17. The system of claim 11, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: based at least in part on the language data, select first data to be used by the second wakeword-detection component.

18. The system of claim 11, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: send an indication of the first spoken language to a speech processing component.

19. The system of claim 11, wherein the instructions that cause the system to determine the language data comprise instructions that, when executed by the at least one processor, cause the system to: determine audio feature data corresponding to the audio data, the audio feature data representing at least one audio feature of the audio data; and process, using a classifier, the audio feature data to determine the language data.

20. The system of claim 11, wherein the instructions that cause the system to use the second wakeword-detection component to determine that the speech includes the wakeword comprise instructions that, when executed by the at least one processor, cause the system to: determine, using an acoustic model and the aud
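Claim 10 describes the second detector as a three-step pipeline: an acoustic model produces per-unit acoustic data, a language-specific hidden Markov model turns that data into a wakeword hypothesis, and a language-specific classifier makes the final decision. The Python sketch below is one possible reading, not the patent's implementation. It assumes the acoustic data arrives as per-frame posterior probabilities, models the wakeword as a left-to-right HMM scored with Viterbi decoding, and stands in a simple threshold for the classifier. `LanguageModelSet`, `viterbi_score`, `confirm_wakeword`, and all parameter values are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence

# One frame of acoustic data: acoustic unit -> posterior probability (assumed format).
Frame = Dict[str, float]


@dataclass
class LanguageModelSet:
    """Per-language data for the second detector. Hypothetical structure."""
    units: List[str]   # wakeword as a unit sequence; one HMM state per unit
    stay_prob: float   # self-loop probability for every state, strictly in (0, 1)
    threshold: float   # stand-in classifier: accept when score >= threshold


def viterbi_score(frames: Sequence[Frame], model: LanguageModelSet) -> float:
    """Per-frame log score of the best path from the first to the last state."""
    n = len(model.units)
    if len(frames) < n:
        return -math.inf  # too few frames to visit every state
    log_stay = math.log(model.stay_prob)
    log_move = math.log(1.0 - model.stay_prob)

    def emit(frame: Frame, unit: str) -> float:
        # Floor the probability so a missing unit does not produce log(0).
        return math.log(max(frame.get(unit, 0.0), 1e-6))

    best = [-math.inf] * n
    best[0] = emit(frames[0], model.units[0])  # paths must start in state 0
    for frame in frames[1:]:
        prev = best
        best = [-math.inf] * n
        for s, unit in enumerate(model.units):
            stay = prev[s] + log_stay
            move = prev[s - 1] + log_move if s > 0 else -math.inf
            best[s] = max(stay, move) + emit(frame, unit)
    return best[-1] / len(frames)  # paths must end in the last state


def confirm_wakeword(
    frames: Sequence[Frame], language: str, models: Dict[str, LanguageModelSet]
) -> bool:
    """Select the model set by language, then score and threshold the hypothesis."""
    model = models.get(language)
    if model is None:
        return False
    return viterbi_score(frames, model) >= model.threshold
```

Looking up the model set by language corresponds to claim 7, where the language data selects the data used by the second wakeword-detection component. A trained classifier would replace the threshold in practice.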

Assignees

Amazon Tech Inc

Inventors

Classifications

  • G10L15/22 (Primary)

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Language recognition · CPC title

  • G10L15/08 (Primary)

    Speech classification or search · CPC title

  • Hidden Markov Models [HMMs] · CPC title

  • using artificial neural networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11996097B2 cover?
A system and method performs multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then p…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 28, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).