Pre-wakeword speech processing with reverse automatic speech recognition

US12424215B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12424215-B2
Application number  US-202217804544-A
Country             US
Kind code           B2
Filing date         May 27, 2022
Priority date       May 27, 2022
Publication date    Sep 23, 2025
Grant date          Sep 23, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for pre-wakeword speech processing are disclosed. Speech audio, comprising command speech spoken before a wakeword, may be stored in a buffer in oldest to newest order. Upon detection of the wakeword, reverse acoustic models and language models, such as reverse automatic speech recognition (R-ASR) can be applied to the buffered audio, in newest to oldest order, starting from before the wakeword. The speech is converted into a sequence of words. Natural language grammar models, such as natural language understanding (NLU), can be applied to match the sequence of words to a complete command, the complete command being associated with invoking a computer operation.
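
The buffering step in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name `FrameBuffer`, the method `reversed_before`, and the use of absolute frame indices are assumptions made for illustration. Frames are stored oldest to newest in a fixed-size buffer, and once a wakeword start has been estimated, the frames before it are read newest to oldest.

```python
from collections import deque
from typing import Deque, Iterator


class FrameBuffer:
    """Fixed-capacity buffer of recent audio frames, stored oldest to newest.

    Hypothetical helper for illustration; the patent does not name such a class.
    """

    def __init__(self, capacity: int) -> None:
        self._frames: Deque[str] = deque(maxlen=capacity)
        self._next_index = 0  # absolute index of the next frame to arrive

    def push(self, frame: str) -> None:
        # When the buffer is full, the deque silently drops the oldest frame.
        self._frames.append(frame)
        self._next_index += 1

    def reversed_before(self, wakeword_start: int) -> Iterator[str]:
        """Yield frames newest to oldest, starting just before wakeword_start.

        wakeword_start is the absolute index of the wakeword's initial frame.
        """
        oldest = self._next_index - len(self._frames)
        last = min(wakeword_start, self._next_index) - 1
        for absolute in range(last, oldest - 1, -1):
            yield self._frames[absolute - oldest]


# Usage: eight frames arrive, the buffer keeps the last five (f3..f7),
# and the wakeword is assumed to start at frame 6.
buffer = FrameBuffer(capacity=5)
for n in range(8):
    buffer.push(f"f{n}")

pre_wakeword = list(buffer.reversed_before(wakeword_start=6))
print(pre_wakeword)  # ['f5', 'f4', 'f3']
```

A real system would hold spectral frames rather than strings, and the reversed frames would be fed to the R-ASR decoder instead of collected into a list.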

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of recognizing a command spoken before a wakeword, the method comprising: receiving an audio signal comprising speech; storing the audio signal in a sequence of spectral frames in a buffer; detecting a wakeword in the audio signal; estimating an initial frame of the wakeword; converting a plurality of frames into a sequence of words using reverse automatic speech recognition (R-ASR), in newest to oldest order, beginning from a first frame before the initial frame of the wakeword; matching the sequence of words to a complete command; identifying a mid-sentence correction between the command and the wakeword, the mid-sentence correction corresponding to words matching a second complete command; and; invoking a function associated with the second complete command.

2. The computer-implemented method of claim 1, wherein using R-ASR further comprises referencing at least a phonetic dictionary, the phonetic dictionary having reverse pronunciations of words.

3. The computer-implemented method of claim 1, wherein using R-ASR further comprises referencing at least a language model, the language model having reverse orders of word sequences.

4. The computer-implemented method of claim 1, further comprising: transforming the audio signal into a sequence of reversed phonemes through R-ASR.

5. The computer-implemented method of claim 1, further comprising: terminating R-ASR when the sequence of words matches the complete command.

6. The computer-implemented method of claim 1, further comprising: estimating a last frame of the wakeword; converting a second plurality of frames into a second sequence of words using automatic speech recognition (ASR) system, in oldest to newest order, beginning from a first frame after the last frame of the wakeword; combining the sequence of words and the second sequence of words into a combined sequence of words; matching the combined sequence of words to the complete command; and invoking the function associated with the complete command.

7. The computer-implemented method of claim 6, wherein using ASR further comprises referencing at least a second phonetic dictionary, the second phonetic dictionary having forward pronunciations of words.

8. The computer-implemented method of claim 6, wherein using ASR further comprises referencing at least a second language model, the second language model having forward orders of word sequences.

9. The computer-implemented method of claim 6, further comprising: converting the plurality of frames using R-ASR and converting the second plurality of frames using ASR in separate simultaneous threads.

10. The computer-implemented method of claim 6, further comprising: converting the plurality of frames using R-ASR on a high-performance processor; and converting the second plurality of frames using ASR on a low-performance processor.

11. The computer-implemented method of claim 1, further comprising: detecting a pause in the audio signal; and converting the plurality of frames using R-ASR, in newest to oldest order, beginning from the first frame before the initial frame of the wakeword toward the pause.

12. The computer-implemented method of claim 1, wherein the wakeword is a high frequency phrase.

13. A computer-implemented method of recognizing a command, the method comprising: receiving an audio signal comprising speech; detecting a wakeword in the audio signal; estimating a beginning time of the wakeword; converting the audio signal into a sequence of words using reverse automatic speech recognition (R-ASR), in newest to oldest order, from before the beginning time of the wakeword; matching the sequence of words to a complete command; identifying a mid-sentence correction between the command and the wakeword, the mid-sentence correction corresponding to words matching a second complete command; and; invoking a function associated with the second complete command.

14. The computer-implemented method of claim 13, wherein using R-ASR further comprises referencing at least a phonetic dictionary, the phonetic dictionary having reverse pronunciations of words.

15. The computer-implemented method of claim 13, wherein using R-ASR further comprises referencing at least a language model, the language model having reverse orders of word sequences.

16. The computer-implemented method of claim 13, further comprising: transforming the audio signal into a sequence of reversed phonemes through R-ASR.

17. The computer-implemented method of claim 13, further comprising: terminating R-ASR when the sequence of words matches the complete command.

18. The computer-implemented method of claim 13, further comprising: estimating a last frame of the wakeword; converting a second plurality of frames into a second sequence of words using automatic speech recognition (ASR) system, in oldest to newest order, beginning from a first frame after the last frame of the wakeword; combining the sequence of words and the second sequence of words into a combined sequence of words; matching the combined sequence of words to the complete command; and invoking the function associated with the complete command.

19. The computer-implemented method of claim 13, wherein using ASR further comprises referencing at least a second phonetic dictionary, the second phonetic dictionary having forward pronunciations of words.

20. The computer-implemented method of claim 13, wherein using ASR further comprises referencing at least a second language model, the second language model having forward orders of word sequences.
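
Claims 2, 3, and 5 describe a reverse phonetic dictionary, a language model with reversed word order, and early termination once a complete command is matched. The toy Python sketch below shows how those three ideas might fit together. Everything in it is an assumption for illustration: the pronunciations, the command table, the function names, and the greedy longest-match segmentation, which stands in for a real acoustic and language-model decoder.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Phonemes = Tuple[str, ...]

# Hypothetical forward pronunciations (ARPAbet-like symbols), illustration only.
FORWARD_DICT: Dict[str, Phonemes] = {
    "turn": ("T", "ER", "N"),
    "the": ("DH", "AH"),
    "lights": ("L", "AY", "T", "S"),
    "on": ("AA", "N"),
}

# Hypothetical command grammar: words in spoken order -> function name.
COMMANDS: Dict[Tuple[str, ...], str] = {
    ("turn", "the", "lights", "on"): "lights_on",
}


def build_reverse_dictionary(forward: Dict[str, Phonemes]) -> Dict[Phonemes, str]:
    """Index each word by its pronunciation read backwards."""
    return {tuple(reversed(phones)): word for word, phones in forward.items()}


def build_reverse_commands(
    commands: Dict[Tuple[str, ...], str]
) -> Dict[Tuple[str, ...], str]:
    """Index each command by its word sequence read backwards."""
    return {tuple(reversed(words)): name for words, name in commands.items()}


def recognize_reversed(
    phonemes: Sequence[str],
    rdict: Dict[Phonemes, str],
    rcommands: Dict[Tuple[str, ...], str],
) -> Optional[Tuple[str, List[str]]]:
    """Decode a newest-to-oldest phoneme stream, stopping at a complete command.

    Returns (function name, words in spoken order), or None if nothing matches.
    """
    longest = max(len(key) for key in rdict)
    words: List[str] = []  # newest word first
    i = 0
    while i < len(phonemes):
        # Greedy longest match: try the longest reversed pronunciation first.
        for length in range(min(longest, len(phonemes) - i), 0, -1):
            word = rdict.get(tuple(phonemes[i:i + length]))
            if word is not None:
                words.append(word)
                i += length
                break
        else:
            return None  # no dictionary entry fits at this position
        name = rcommands.get(tuple(words))
        if name is not None:
            # Early termination: older audio is never examined.
            return name, list(reversed(words))
    return None


# Phonemes for "... turn the lights on <wakeword>", read backwards from the
# wakeword. The trailing "K", "IH" stand for older, unrelated speech.
reversed_stream = ["N", "AA", "S", "T", "AY", "L", "AH", "DH", "N", "ER", "T", "K", "IH"]
result = recognize_reversed(
    reversed_stream,
    build_reverse_dictionary(FORWARD_DICT),
    build_reverse_commands(COMMANDS),
)
print(result)  # ('lights_on', ['turn', 'the', 'lights', 'on'])
```

The sketch returns as soon as the reversed words form a known command, so the unrelated phonemes at the end of the stream are never consumed. It does not model the mid-sentence correction of claims 1 and 13 or the forward ASR pass of claim 6.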

Assignees

Soundhound Inc, Soundhound Ai Ip Llc

Classifications

  • G10L15/08 (Primary)

    Speech classification or search · CPC title

  • Discriminating between voiced and unvoiced parts of speech signals (G10L25/90 takes precedence) · CPC title

  • Word spotting · CPC title

  • Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title

  • using context dependencies, e.g. language models · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12424215B2 cover?
Methods and systems for pre-wakeword speech processing are disclosed. Speech audio, comprising command speech spoken before a wakeword, may be stored in a buffer in oldest to newest order. Upon detection of the wakeword, reverse acoustic models and language models, such as reverse automatic speech recognition (R-ASR) can be applied to the buffered audio, in newest to oldest order, starting from…
Who is the assignee on this patent?
Soundhound Inc, Soundhound Ai Ip Llc
What technology area does this patent fall under?
Primary CPC classification G10L15/08. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 23, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).