Technologies for improved keyword spotting

US10217458B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10217458-B2
Application number  US-201615274498-A
Country             US
Kind code           B2
Filing date         Sep 23, 2016
Priority date       Sep 23, 2016
Publication date    Feb 26, 2019
Grant date          Feb 26, 2019

Abstract

Official abstract text for this publication.

Technologies for improved keyword spotting are disclosed. A compute device may capture speech data from a user of the compute device, and perform automatic speech recognition on the captured speech data. The automatic speech recognition algorithm is configured to both spot keywords as well as provide a full transcription of the captured speech data. The automatic speech recognition algorithm may preferentially match the keywords compared to similar words. The recognized keywords may be used to improve parsing of the transcribed speech data or to improve an assistive agent in holding a dialog with a user of the compute device.
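
To make the idea of preferential keyword matching concrete, here is a minimal sketch in Python. It is not the patented method: the patent describes the preference as built into the recognizer's language model, whereas this example only rescores a hypothetical list of finished hypotheses. The keyword set, the bonus value, the scores, and the function name `rescore` are all assumptions chosen for illustration.

```python
# Hypothetical keyword list and boost; neither value comes from the patent.
KEYWORDS = {"call", "weather"}
KEYWORD_BONUS = 2.0  # assumed log-domain bonus added per matched keyword


def rescore(hypotheses, keywords=KEYWORDS, bonus=KEYWORD_BONUS):
    """Pick the best hypothesis after boosting those that contain keywords.

    hypotheses: list of (words, log_score) pairs, higher score is better.
    Returns (best_words, spotted_keywords).
    """
    def boosted(item):
        words, log_score = item
        return log_score + bonus * sum(w in keywords for w in words)

    best_words, _ = max(hypotheses, key=boosted)
    spotted = [w for w in best_words if w in keywords]
    return best_words, spotted


# Two acoustically similar candidates; the second contains a keyword.
hypotheses = [
    (["whether", "in", "boston"], -10.0),
    (["weather", "in", "boston"], -10.5),
]
best, spotted = rescore(hypotheses)
print(best, spotted)
```

With these assumed numbers the keyword-bearing hypothesis wins even though its raw score is slightly lower, and the result still carries the non-keyword words, which mirrors the abstract's point that the output is a full transcript with keywords marked.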

First claim

Opening claim text (preview).

The invention claimed is:

1. A compute device for automatic speech recognition, the compute device comprising: an automatic speech recognition algorithm trainer to acquire a statistical language model for an automatic speech recognition algorithm, wherein the statistical language model comprises a large-vocabulary language model that has been modified such that the large-vocabulary language model preferentially matches words present in a plurality of keywords; a speech data capturer to receive speech data of a user of the compute device; and an automatic speech recognizer to perform the automatic speech recognition algorithm on the speech data to produce an output transcript, wherein the output transcript comprises one or more keywords of the plurality of keywords and one or more words not in the plurality of keywords.

2. The compute device of claim 1, wherein the large-vocabulary language model that has been modified to preferentially match words present in the plurality of keywords comprises a first hidden Markov model to match words present in a large vocabulary and a second hidden Markov model to match words present in the plurality of keywords.

3. The compute device of claim 2, wherein weightings of the plurality of keywords are higher than corresponding weightings of the rest of the statistical language model such that the statistical language model preferentially matches the plurality of keywords.

4. The compute device of claim 2, wherein the statistical language model is formed by a linear interpolation of the large-vocabulary language model and a keyword language model.

5. The compute device of claim 1, further comprising a speech parser to: identify, based on the one or more keywords, a context of a portion of the output transcript; and parse the output transcript based on the context of the portion of the output transcript.

6. The compute device of claim 1, wherein to acquire the statistical language model for the automatic speech recognition algorithm comprises to train a statistical language model for a large vocabulary and augment the statistical language model with a keyword language model such that the statistical language model preferentially matches the plurality of keywords.

7. The compute device of claim 6, wherein the statistical language model has been trained using domain-specific training data.

8. The compute device of claim 1, further comprising an assistive agent to update a belief state of the assistive agent in response to a match of the one or more keywords.

9. The compute device of claim 8, wherein to update the belief state in response to matching the one or more keywords comprises to search a word lattice of the automatic speech recognition algorithm and to find a better match of the word lattice to the speech data based on the one or more keywords.

10. The compute device of claim 1, wherein at least one of the keywords of the plurality of keywords is a keyphrase comprising two or more words.

11. A method for automatic speech recognition by a compute device, the method comprising: acquiring, by the compute device, a statistical language model for an automatic speech recognition algorithm, wherein the statistical language model comprises a large-vocabulary language model that has been modified such that the large-vocabulary language model preferentially matches words present in a plurality of keywords; receiving, by the compute device, speech data of a user of the compute device; and performing, by the compute device, the automatic speech recognition algorithm on the speech data to produce an output transcript, wherein the output transcript comprises one or more keywords of the plurality of keywords and one or more words not in the plurality of keywords.

12. The method of claim 11, wherein the large-vocabulary language model that has been modified to preferentially match words present in the plurality of keywords comprises a first hidden Markov model to match words present in a large vocabulary and a second hidden Markov model to match words present in the plurality of keywords.

13. The method of claim 12, wherein weightings of the plurality of keywords are higher than corresponding weightings of the rest of the statistical language model such that the statistical language model preferentially matches the plurality of keywords.

14. The method of claim 12, wherein the statistical language model is formed by a linear interpolation of the large-vocabulary language model and a keyword language model.

15. The method of claim 11, wherein acquiring the statistical language model for the automatic speech recognition algorithm comprises training a statistical language model for a large vocabulary and augmenting the statistical language model with a keyword language model such that the statistical language model preferentially matches the plurality of keywords.

16. The method of claim 11, further comprising updating, by an assistive agent of the compute device, a belief state of the assistive agent in response to matching the one or more keywords.

17. The method of claim 16, wherein updating, by the assistive agent, the belief state in response to matching the one or more keywords comprises searching a word lattice of the automatic speech recognition algorithm and finding a better match of the word lattice to the speech data based on the one or more keywords.

18. One or more non-transitory, computer-readable media comprising a plurality of instructions thereon that, when executed, causes a compute device to: acquire a statistical language model for an automatic speech recognition algorithm, wherein the statistical language model comprises a large-vocabulary language model that has been modified such that the large-vocabulary language model preferentially matches words present in a plurality of keywords; receive speech data of a user of the compute device; and perform the automatic speech recognition algorithm on the speech data to produce an output transcript, wherein the output transcript comprises one or more keywords of the plurality of keywords and one or more words not in the plurality of keywords.

19. The one or more non-transitory, computer-readable media of claim 18, wherein the large-vocabulary language model that has been modified to preferentially match words present in the plurality of keywords comprises a first hidden Markov model to match words present in a large vocabulary and a second hidden Markov model to match words present in the plurality of keywords.

20. The one or more non-transitory, computer-readable media of claim 19, wherein weightings of the plurality of keywords are higher than corresponding weightings of the rest of the statistical language model such that the statistical language model preferentially matches the plurality of keywords.

21. The one or more non-transitory, computer-readable media of claim 19, wherein the statistical language model is formed by a linear interpolation of the large-vocabulary language model and a keyword language model.

22. The one or more non-transitory, computer-readable media of claim 18, wherein the plurality of instructions further causes the compute device to: identify, based on the one or more keywords, a context of a portion of the output transcript; and parse the output transcript based on the context of the portion of the output transcript.

23. The one or more non-transitory, computer-readable media of claim 18, wherein to acquire the statistica
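
Several claims (4, 14, and 21) describe forming the statistical language model by linear interpolation of the large-vocabulary language model and a keyword language model. A minimal sketch of that idea follows, assuming simple unigram models stored as dictionaries and a single mixing weight `lam`; the claims do not specify the model order, the weight, or these probabilities, so every value and name here is hypothetical. The combined probability is P(w) = lam * P_kw(w) + (1 - lam) * P_lv(w).

```python
def interpolate(large_vocab_lm, keyword_lm, lam):
    """Linearly interpolate two unigram models given as word -> probability.

    P(w) = lam * P_kw(w) + (1 - lam) * P_lv(w)
    Words missing from a model are treated as having probability 0 there.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    vocab = set(large_vocab_lm) | set(keyword_lm)
    return {
        w: lam * keyword_lm.get(w, 0.0) + (1.0 - lam) * large_vocab_lm.get(w, 0.0)
        for w in vocab
    }


# Assumed toy distributions; each sums to 1.
large_vocab_lm = {"whether": 0.5, "weather": 0.25, "boston": 0.25}
keyword_lm = {"weather": 1.0}

mixed = interpolate(large_vocab_lm, keyword_lm, lam=0.5)
print(mixed["weather"], mixed["whether"])
```

Because both inputs are proper distributions and the weights sum to 1, the mixture is also a proper distribution. In this toy case the keyword "weather" ends up more probable than the similar-sounding "whether", which is the preferential matching the claims describe.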

Assignees

Intel Corp

Inventors

Classifications

  • Training of HMMs

  • of application context

  • G10L15/142 (Primary): Hidden Markov Models [HMMs]

  • Procedures used during a speech recognition process, e.g. man-machine dialogue

  • Parsing

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G10L15/142. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 26, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).