US9865264B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9865264-B2 |
| Application number | US-201615154944-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 14, 2016 |
| Priority date | Mar 15, 2013 |
| Publication date | Jan 9, 2018 |
| Grant date | Jan 9, 2018 |
Official abstract text for this publication.
Disclosed are computer-implemented methods and systems for dynamic selection of speech recognition systems for use in Chat Information Systems (CIS) based on multiple criteria and context of human-machine interaction. Specifically, once a first user audio input is received, it is analyzed so as to locate specific triggers, determine the context of the interaction or predict the subsequent user audio inputs. Based on at least one of these criteria, one of a free-dictation recognizer, pattern-based recognizer, address book based recognizer or dynamically created recognizer is selected for recognizing the subsequent user audio input. The methods described herein increase the accuracy of automatic recognition of user voice commands, thereby enhancing overall user experience of using CIS, chat agents and similar digital personal assistant systems.
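The selection step in the abstract can be read as a lookup from the kind of input the system expects next to one of the four recognizer types. The following is a minimal sketch of that reading, not the patented implementation: the names `ExpectedInput`, `RecognizerKind`, and `select_recognizer`, and the fallback to free dictation when nothing is predicted, are assumptions introduced here for illustration.

```python
from enum import Enum, auto
from typing import Optional


class ExpectedInput(Enum):
    """Hypothetical labels for the kind of user input predicted to come next."""
    FREE_SPEECH = auto()
    PATTERN = auto()
    CONTACT_NAME = auto()   # a name or nickname from a digital address book
    LIST_ITEM = auto()      # an item from a list storing items of the same type


class RecognizerKind(Enum):
    """The four recognizer types named in the abstract."""
    FREE_DICTATION = "free-dictation"
    PATTERN_BASED = "pattern-based"
    ADDRESS_BOOK = "address-book-based"
    DYNAMIC = "dynamically-created"


# One recognizer type per predicted input type.
_SELECTION = {
    ExpectedInput.FREE_SPEECH: RecognizerKind.FREE_DICTATION,
    ExpectedInput.PATTERN: RecognizerKind.PATTERN_BASED,
    ExpectedInput.CONTACT_NAME: RecognizerKind.ADDRESS_BOOK,
    ExpectedInput.LIST_ITEM: RecognizerKind.DYNAMIC,
}


def select_recognizer(expected: Optional[ExpectedInput]) -> RecognizerKind:
    """Return the recognizer type for the predicted input.

    Assumption: fall back to free dictation when no prediction is available.
    """
    return _SELECTION.get(expected, RecognizerKind.FREE_DICTATION)
```

A table keeps the policy easy to audit; a real system would presumably also weigh the triggers and interaction context that the abstract mentions.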
Opening claim text (preview).
The invention claimed is:
1. A method for speech recognition in a Chat Information System (CIS), the method comprising: receiving, by a processor operatively coupled with a memory, a first audio input, the first audio input captured by a microphone of a device of a user; recognizing, by a first speech recognizer of a plurality of speech recognizers, at least a part of the first audio input to generate a first recognized input, wherein each of the plurality of speech recognizers is configured to generate a plurality of outputs provided with corresponding confidence levels, the processor being configured to select an output from the plurality of outputs based on the confidence levels; providing, by the processor, a response to the first recognized input utilizing the CIS, the response being provided for presentation to the user via the device of the user; determining, by the processor, a response type of the response provided utilizing the CIS, the response type predicting a type of input of the user that will follow the response; receiving, by the processor, a second audio input that follows the response; based on the determined response type of the response provided utilizing the CIS, selecting, by the processor, a second speech recognizer, of the plurality of speech recognizers, for use in recognizing the second audio input that follows the response; recognizing, by the second speech recognizer, at least a part of the second audio input to generate a second recognized input; and providing, by the processor, a second response based on the second recognized input utilizing the CIS, the second response being provided for presentation to the user via the device.
2. The method of claim 1, wherein the selecting of the second speech recognizer includes selecting, by the processor, a free-dictation recognizer, when the response type predicts that the type of the input of the user that will follow the response includes a free speech of the user.
3. The method of claim 1, wherein the selecting of the second speech recognizer includes selecting, by the processor, a pattern-based recognizer, when the response type predicts that the type of the input of the user that will follow the response includes a pattern-based speech of the user.
4. The method of claim 1, wherein the selecting of the second speech recognizer includes selecting, by the processor, an address book based recognizer, when the response type predicts that the type of the input of the user that will follow the response includes a name or nickname from a digital address book.
5. The method of claim 1, wherein the selecting of the second speech recognizer includes selecting, by the processor, a dynamically created recognizer, when the response type predicts that the type of the input of the user that will follow the response includes an item from a list storing items of the same type.
6. A Chat Information System (CIS), the CIS comprising: a machine-readable medium storing instructions; one or more hardware processors executing the stored instructions to: receive a first audio input, the first audio input captured by a microphone; recognize, using a first speech recognizer of a plurality of speech recognizers, at least a part of the first audio input to generate a first recognized input, wherein each of the plurality of speech recognizers is configured to generate a plurality of outputs provided with corresponding confidence levels, one or more of the processors being configured to select an output from the plurality of outputs based on the confidence levels; provide a response to the first recognized input utilizing the CIS, the response being provided for presentation to the user via the device of the user; determine a response type of the response provided utilizing the CIS, the response type predicting a type of input of the user that will follow the response; receive a second audio input that follows the response; based on the determined response type of the response provided utilizing the CIS, select a second speech recognizer, of the plurality of speech recognizers, for use in recognizing the second audio input that follows the response; recognize, using the second speech recognizer, at least a part of the second audio input to generate a second recognized input; and provide a second response based on the second recognized input utilizing the CIS, the second response being provided for presentation to the user via the device.
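Claim 1 has each recognizer produce several outputs with confidence levels, from which the processor selects one, and then uses the type of the system's response to choose the recognizer for the following turn. The sketch below illustrates that two-turn flow with stub recognizers. Everything named here (`Hypothesis`, `best_output`, `run_two_turns`, the stub recognizers, and the response-type strings) is hypothetical, and taking the maximum confidence is only one plausible reading of "based on the confidence levels".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Hypothesis:
    text: str
    confidence: float  # assumed to lie in [0, 1]


# A recognizer maps raw audio to several scored hypotheses.
Recognizer = Callable[[bytes], List[Hypothesis]]
# The CIS maps recognized text to (response text, response type).
Responder = Callable[[str], Tuple[str, str]]


def best_output(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Pick the highest-confidence hypothesis (one possible selection rule)."""
    if not hypotheses:
        raise ValueError("recognizer returned no hypotheses")
    return max(hypotheses, key=lambda h: h.confidence)


def run_two_turns(
    first_audio: bytes,
    second_audio: bytes,
    first_recognizer: Recognizer,
    recognizers: Dict[str, Recognizer],
    respond: Responder,
) -> Tuple[str, str]:
    """Recognize, respond, pick the next recognizer by response type, repeat."""
    first = best_output(first_recognizer(first_audio))
    response, response_type = respond(first.text)
    # The response type predicts the next input, so it selects the recognizer.
    second_recognizer = recognizers[response_type]
    second = best_output(second_recognizer(second_audio))
    second_response, _ = respond(second.text)
    return response, second_response


# Stub recognizers and CIS for demonstration only; they ignore the audio.
def free_dictation(audio: bytes) -> List[Hypothesis]:
    return [Hypothesis("call someone", 0.9), Hypothesis("tall someone", 0.4)]


def address_book(audio: bytes) -> List[Hypothesis]:
    return [Hypothesis("Ann", 0.8), Hypothesis("Dan", 0.6)]


def respond(text: str) -> Tuple[str, str]:
    if text == "call someone":
        return "Who should I call?", "contact_name"
    return f"Calling {text}.", "free_speech"


replies = run_two_turns(
    b"", b"", free_dictation,
    {"contact_name": address_book, "free_speech": free_dictation},
    respond,
)
print(replies)  # ('Who should I call?', 'Calling Ann.')
```

Because the first response asks for a contact, the second utterance is routed to the address book stub rather than to free dictation, which is the behavior claims 2 through 5 refine for each recognizer type.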
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
to the speaker · CPC title
of application context · CPC title
Feature extraction for speech recognition; Selection of recognition unit · CPC title
Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title