Selective speech recognition for chat and digital personal assistant systems

US9865264B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9865264-B2
Application number: US-201615154944-A
Country: US
Kind code: B2
Filing date: May 14, 2016
Priority date: Mar 15, 2013
Publication date: Jan 9, 2018
Grant date: Jan 9, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed are computer-implemented methods and systems for dynamic selection of speech recognition systems for the use in Chat Information Systems (CIS) based on multiple criteria and context of human-machine interaction. Specifically, once a first user audio input is received, it is analyzed so as to locate specific triggers, determine the context of the interaction or predict the subsequent user audio inputs. Based on at least one of these criteria, one of a free-diction recognizer, pattern-based recognizer, address book based recognizer or dynamically created recognizer is selected for recognizing the subsequent user audio input. The methods described herein increase the accuracy of automatic recognition of user voice commands, thereby enhancing overall user experience of using CIS, chat agents and similar digital personal assistant systems.
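The trigger-based selection described in the abstract can be illustrated with a minimal sketch. It assumes the first input has already been transcribed to text. The trigger phrases, the `TRIGGERS` table, and the `select_recognizer` function are hypothetical names chosen for illustration; the abstract does not list concrete triggers or an API.

```python
import re
from enum import Enum


class RecognizerKind(Enum):
    """The four recognizer types named in the abstract."""
    FREE_DICTATION = "free-dictation"
    PATTERN_BASED = "pattern-based"
    ADDRESS_BOOK = "address-book"
    DYNAMIC = "dynamically-created"


# Hypothetical trigger phrases (assumption: not taken from the patent text).
TRIGGERS = {
    "call": RecognizerKind.ADDRESS_BOOK,
    "play": RecognizerKind.DYNAMIC,
    "set alarm": RecognizerKind.PATTERN_BASED,
}


def select_recognizer(first_input_text: str) -> RecognizerKind:
    """Pick a recognizer for the next utterance from triggers in the first one."""
    lowered = first_input_text.lower()
    for trigger, kind in TRIGGERS.items():
        # Match whole words only, so "call" does not fire inside "recall".
        if re.search(r"\b" + re.escape(trigger) + r"\b", lowered):
            return kind
    # No trigger found: fall back to unconstrained dictation.
    return RecognizerKind.FREE_DICTATION
```

A table lookup keeps the sketch short. A real system would more likely combine triggers with dialogue context, as the abstract suggests.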

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for speech recognition in a Chat Information System (CIS), the method comprising: receiving, by a processor operatively coupled with a memory, a first audio input, the first audio input captured by a microphone of a device of a user; recognizing, by a first speech recognizer of a plurality of speech recognizers, at least a part of the first audio input to generate a first recognized input, wherein each of the plurality of speech recognizers is configured to generate a plurality of outputs provided with corresponding confidence levels, the processor being configured to select an output from the plurality of outputs based on the confidence levels; providing, by the processor, a response to the first recognized input utilizing the CIS, the response being provided for presentation to the user via the device of the user; determining, by the processor, a response type of the response provided utilizing the CIS, the response type predicting a type of input of the user that will follow the response; receiving, by the processor, a second audio input that follows the response; based on the determined response type of the response provided utilizing the CIS, selecting, by the processor, a second speech recognizer, of the plurality of speech recognizers, for use in recognizing the second audio input that follows the response; recognizing, by the second speech recognizer, at least a part of the second audio input to generate a second recognized input; and providing, by the processor, a second response based on the second recognized input utilizing the CIS, the second response being provided for presentation to the user via the device.

2. The method of claim 1, wherein the selecting of the second speech recognizer includes selecting, by the processor, a free-dictation recognizer, when the response type predicts that the type of the input of the user that will follow the response includes a free speech of the user.

3. The method of claim 1, wherein the selecting of the second speech recognizer includes selecting, by the processor, a pattern-based recognizer, when the response type predicts that the type of the input of the user that will follow the response includes a pattern-based speech of the user.

4. The method of claim 1, wherein the selecting of the second speech recognizer includes selecting, by the processor, an address book based recognizer, when the response type predicts that the type of the input of the user that will follow the response includes a name or nickname from a digital address book.

5. The method of claim 1, wherein the selecting of the second speech recognizer includes selecting, by the processor, a dynamically created recognizer, when the response type predicts that the type of the input of the user that will follow the response includes an item from a list storing items of the same type.

6. A Chat Information System (CIS), the CIS comprising: a machine-readable medium storing instructions; one or more hardware processors executing the stored instructions to: receive a first audio input, the first audio input captured by a microphone; recognize, using a first speech recognizer of a plurality of speech recognizers, at least a part of the first audio input to generate a first recognized input, wherein each of the plurality of speech recognizers is configured to generate a plurality of outputs provided with corresponding confidence levels, one or more of the processors being configured to select an output from the plurality of outputs based on the confidence levels; provide a response to the first recognized input utilizing the CIS, the response being provided for presentation to the user via the device of the user; determine a response type of the response provided utilizing the CIS, the response type predicting a type of input of the user that will follow the response; receive a second audio input that follows the response; based on the determined response type of the response provided utilizing the CIS, select a second speech recognizer, of the plurality of speech recognizers, for use in recognizing the second audio input that follows the response; recognize, using the second speech recognizer, at least a part of the second audio input to generate a second recognized input; and provide a second response based on the second recognized input utilizing the CIS, the second response being provided for presentation to the user via the device.
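The two-turn flow of claim 1, together with the response-type rules of claims 2 to 5, might be sketched as follows. This is an illustrative reading under stated assumptions, not the patented implementation. The string labels, the `ChatSession` class, the `Response` type, and the stub recognizers are all hypothetical, and real recognizers would decode audio rather than return fixed hypotheses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[str, float]  # (recognized text, confidence level)
Recognizer = Callable[[bytes], List[Hypothesis]]

# Response type -> recognizer, following claims 2-5 (labels are assumptions).
RESPONSE_TYPE_TO_RECOGNIZER = {
    "free_speech": "free_dictation",
    "pattern_speech": "pattern_based",
    "contact_name": "address_book",
    "list_item": "dynamic",
}


@dataclass
class Response:
    text: str
    response_type: str  # predicts the type of the user's next input


def best_output(hypotheses: List[Hypothesis]) -> str:
    """Select the output with the highest confidence level."""
    return max(hypotheses, key=lambda h: h[1])[0]


class ChatSession:
    def __init__(self, recognizers: Dict[str, Recognizer],
                 respond: Callable[[str], Response],
                 first_kind: str = "free_dictation") -> None:
        self.recognizers = recognizers
        self.respond = respond
        self.kind = first_kind  # recognizer to use for the next audio input

    def handle(self, audio: bytes) -> Response:
        text = best_output(self.recognizers[self.kind](audio))
        response = self.respond(text)
        # Choose the recognizer for the input that follows this response.
        self.kind = RESPONSE_TYPE_TO_RECOGNIZER.get(
            response.response_type, "free_dictation")
        return response


# Stub recognizers and dialogue logic, for demonstration only.
def free_dictation(audio: bytes) -> List[Hypothesis]:
    return [("call someone", 0.9), ("cold sun", 0.3)]


def address_book(audio: bytes) -> List[Hypothesis]:
    return [("Alex", 0.6), ("Alice", 0.8)]


def respond(text: str) -> Response:
    if text == "call someone":
        return Response("Who should I call?", "contact_name")
    return Response(f"Calling {text}", "free_speech")


session = ChatSession(
    {"free_dictation": free_dictation, "address_book": address_book}, respond)
first = session.handle(b"first audio input")
second = session.handle(b"second audio input")
```

In this sketch the first response asks for a contact, so the second audio input is routed to the address book stub, whose highest-confidence hypothesis is used for the second response.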

Assignees

Google Inc, Google LLC

Inventors

Classifications

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • to the speaker · CPC title

  • of application context · CPC title

  • Feature extraction for speech recognition; Selection of recognition unit · CPC title

  • G10L15/32 · Primary

    Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9865264B2 cover?
Disclosed are computer-implemented methods and systems for dynamic selection of speech recognition systems for the use in Chat Information Systems (CIS) based on multiple criteria and context of human-machine interaction. Specifically, once a first user audio input is received, it is analyzed so as to locate specific triggers, determine the context of the interaction or predict the subsequent u…
Who is the assignee on this patent?
Google Inc, Google LLC
What technology area does this patent fall under?
Primary CPC classification G10L15/32. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 9, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).