Behavior adjustment using speech recognition system

US2016155437A1 · US · A1

Patent metadata
Publication number: US-2016155437-A1
Application number: US-201414557751-A
Country: US
Kind code: A1
Filing date: Dec 2, 2014
Priority date: Dec 2, 2014
Publication date: Jun 2, 2016
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus are described for inducing a user of a speech recognition system to adjust their own behavior. For example, in one implementation, a speech recognition system that allows children to control electronic devices can improve the child's speech development, by encouraging the child to speak more clearly. To do so, the speech recognition system can generate a phonetic representation of a term spoken by the child, and can determine whether the phonetic representation matches a particular canonical pronunciation of the particular term that is deemed age-appropriate for the child. Upon determining that the particular canonical pronunciation that matches the phonetic representation of the term spoken by the child is not age-appropriate, the speech recognition system can select and implement a variety of remediation strategies for inducing the child to repeat the term using a pronunciation that is considered age-appropriate.
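The matching step in the abstract can be pictured with a small sketch. It is only an illustration under stated assumptions, not the implementation the patent describes: the phoneme symbols, the dictionary entries, the `max_age` values, and the use of `difflib.SequenceMatcher` as the similarity measure are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class CanonicalPronunciation:
    term: str
    phonemes: Tuple[str, ...]
    # Oldest age at which this pronunciation is still considered appropriate.
    # None means the pronunciation is appropriate at any age (assumption).
    max_age: Optional[int]


# Hypothetical phonetic dictionary: several canonical pronunciations per term.
PHONETIC_DICTIONARY = [
    CanonicalPronunciation("car", ("K", "AA", "R"), None),
    CanonicalPronunciation("car", ("K", "AA"), 3),
    CanonicalPronunciation("car", ("T", "AA"), 2),
]


def best_match(phonemes: Sequence[str]) -> CanonicalPronunciation:
    """Pick the dictionary entry most similar to the spoken phoneme sequence."""
    spoken = tuple(phonemes)
    return max(
        PHONETIC_DICTIONARY,
        key=lambda entry: SequenceMatcher(None, entry.phonemes, spoken).ratio(),
    )


def is_age_appropriate(entry: CanonicalPronunciation, user_age: int) -> bool:
    """A pronunciation is flagged only when the user is older than its maximum age."""
    return entry.max_age is None or user_age <= entry.max_age


match = best_match(["T", "AA"])
needs_remediation = not is_age_appropriate(match, user_age=4)
```

In this sketch a four-year-old who says "T AA" is matched to the entry tagged with a maximum age of 2, so the system would go on to choose a remediation strategy rather than simply carrying out the command.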

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving audio data corresponding to a user speaking a particular term; generating a phonetic representation of the particular term based on the audio data; determining that the phonetic representation matches a particular canonical pronunciation of a particular term, wherein the particular canonical pronunciation is associated with an indication of age-appropriateness; obtaining data that indicates an age of the user; determining, based on a comparison of (i) the data that indicates the age of the user and (ii) indication of age-appropriateness that is associated with the particular canonical pronunciation of the particular term, that the pronunciation of the particular term by the user is not age-appropriate; based on determining that the pronunciation of the particular term by the user is not age-appropriate, selecting a remediation strategy for inducing the user to speak the particular term using an age-appropriate pronunciation.

2. The computer-implemented method of claim 1, comprising: selecting, from among a plurality of canonical pronunciations stored in a phonetic dictionary, the particular canonical pronunciation as a best match of the phonetic representation generated of the particular term.

3. The computer-implemented method of claim 2, comprising: storing, in the phonetic dictionary, a plurality of canonical pronunciations associated with the particular term, wherein the plurality of canonical pronunciations includes the particular canonical pronunciation selected for the particular term, and wherein two or more of the plurality of canonical pronunciations include an indication of age-appropriateness.

4. The computer-implemented method of claim 1, wherein the indication of age-appropriateness comprises a maximum age, and wherein determining that the pronunciation of the particular term by the user is not age-appropriate comprises determining that the age of the user is greater than the maximum age.

5. The computer-implemented method of claim 1, wherein the remediation strategy for inducing the user to speak the particular term using an age-appropriate pronunciation involves prompting the user to speak the particular term again.

6. The computer-implemented method of claim 1, wherein the remediation strategy for inducing the user to speak the particular term using an age-appropriate pronunciation involves outputting audio data corresponding to a pronunciation of the particular term that is not age-appropriate.

7. The computer-implemented method of claim 6, wherein outputting audio data corresponding to a pronunciation of the particular term that is not age-appropriate comprises outputting the received audio data corresponding to the user speaking the particular term.

8. The computer-implemented method of claim 6, wherein outputting audio data corresponding to a pronunciation of the particular term that is not age-appropriate comprises generating a text-to-speech output using the particular canonical representation that matches the phonetic representation.

9. The computer-implemented method of claim 1, wherein the remediation strategy for inducing the user to speak the particular term using an age-appropriate pronunciation involves (i) selecting another canonical pronunciation of the particular term that is determined to be age-appropriate, and (ii) outputting audio data corresponding to the selected other canonical pronunciation.

10. The computer-implemented method of claim 1, wherein the remediation strategy for inducing the user to speak the particular term using an age-appropriate pronunciation involves initiating an action associated with the particular term despite the determination that the pronunciation of the particular term by the user is not age-appropriate.

11. The computer-implemented method of claim 1, comprising: before selecting a remediation strategy, obtaining biometric data associated with the user; and determining that the biometric data satisfies a predetermined emotional threshold, wherein the remediation strategy is selected based on determining that the biometric data satisfies the predetermined emotional threshold.

12. The computer-implemented method of claim 1, wherein the remediation strategy for inducing the user to speak the particular term using an age-appropriate pronunciation involves (i) detecting another person within a predetermined distance of the user, and (ii) sending a message to the other person indicating that the pronunciation of the particular term by the user is not age-appropriate.

13. The computer-implemented method of claim 1, comprising: after selecting the remediation strategy, receiving additional audio data corresponding to the user speaking the particular term again; generating a phonetic representation of the particular term based on the additional audio data; determining that the phonetic representation of the particular term in the additional audio data matches an age-appropriate canonical pronunciation of the particular term; and based on determining the phonetic representation of the particular term in the additional audio data matches an age-appropriate canonical pronunciation of the particular term, initiating an action associated with the particular term.

14. A system, comprising: one or more computers programmed to perform operations comprising: receiving audio data corresponding to a user speaking a particular term; generating a phonetic representation of the particular term based on the audio data; determining that the phonetic representation matches a particular canonical pronunciation of a particular term, wherein the particular canonical pronunciation is associated with an indication of age-appropriateness; obtaining data that indicates an age of the user; determining, based on a comparison of (i) the data that indicates the age of the user and (ii) indication of age-appropriateness that is associated with the particular canonical pronunciation of the particular term, that the pronunciation of the particular term by the user is not age-appropriate; based on determining that the pronunciation of the particular term by the user is not age-appropriate, selecting a remediation strategy for inducing the user to speak the particular term using an age-appropriate pronunciation.

15. The system of claim 14, wherein the operations further comprise selecting from among a plurality of canonical pronunciations stored in a phonetic dictionary, the particular canonical pronunciation as a best match of the phonetic representation generated of the particular term.

16. The system of claim 14, wherein the operations further comprise storing in the phonetic dictionary, a plurality of canonical pronunciations associated with the particular term, wherein the plurality of canonical pronunciations includes the particular canonical pronunciation selected for the particular term, and wherein two or more of the plurality of canonical pronunciations include an indication of age-appropriateness.

17. The system of claim 14, wherein the indication of age-appropriateness comprises a maximum age, and wherein determining that the pronunciation of the particular term by the user is not age-appropriate comprises determining that the age of the user is greater than the maximum age.

18. A computer storage medium encoded with a computer program, the program comprising instructions that when executed by data processing apparatus cause the data processing apparatus to
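The dependent claims list several remediation strategies but do not say how one is chosen over another. The sketch below is one possible reading, offered only as an illustration: the `Remediation` names, the `stress_level` score standing in for biometric data, the 0.8 threshold, and the rule that a frustrated user gets the action anyway are all assumptions made here, not details taken from the claims.

```python
from enum import Enum, auto
from typing import Optional, Tuple


class Remediation(Enum):
    PROMPT_REPEAT = auto()           # ask the user to say the term again (claim 5)
    PLAY_BACK_USER_AUDIO = auto()    # replay the user's own pronunciation (claims 6-7)
    PLAY_APPROPRIATE_MODEL = auto()  # play an age-appropriate pronunciation (claim 9)
    PERFORM_ACTION_ANYWAY = auto()   # carry out the command regardless (claim 10)
    NOTIFY_NEARBY_PERSON = auto()    # message a person near the user (claim 12)


def select_remediation(
    stress_level: float, attempts: int, stress_threshold: float = 0.8
) -> Remediation:
    """Hypothetical selection rule; the claims leave the choice open."""
    # Assumption: biometric data showing frustration means the system backs off.
    if stress_level >= stress_threshold:
        return Remediation.PERFORM_ACTION_ANYWAY
    # Assumption: first ask for a repeat, then model the target pronunciation.
    if attempts == 0:
        return Remediation.PROMPT_REPEAT
    return Remediation.PLAY_APPROPRIATE_MODEL


def handle_utterance(
    user_age: int, max_age: Optional[int], stress_level: float, attempts: int
) -> Tuple[bool, Optional[Remediation]]:
    """Return (initiate_action, remediation) for one matched pronunciation.

    max_age is the age limit attached to the matched canonical pronunciation;
    None means the pronunciation is appropriate at any age (assumption).
    """
    if max_age is None or user_age <= max_age:
        return True, None  # age-appropriate: initiate the action (claim 13)
    strategy = select_remediation(stress_level, attempts)
    return strategy is Remediation.PERFORM_ACTION_ANYWAY, strategy


first_try = handle_utterance(user_age=5, max_age=2, stress_level=0.1, attempts=0)
second_try = handle_utterance(user_age=5, max_age=None, stress_level=0.1, attempts=1)
```

Here the first attempt returns a prompt to repeat the term without performing the action, and the second attempt, now matched to an age-appropriate pronunciation, lets the action proceed.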

Assignees

Google Inc

Inventors

Classifications

  • G10L15/02 (Primary)

    Feature extraction for speech recognition; Selection of recognition unit · CPC title

  • Phonemes, fenemes or fenones being the recognition units · CPC title

  • for comparison or discrimination · CPC title

  • Speaking (with audible presentation of the material to be studied G09B5/04) · CPC title

  • G10L15/22 (Primary)

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016155437A1 cover?
Methods, systems, and apparatus are described for inducing a user of a speech recognition system to adjust their own behavior. For example, in one implementation, a speech recognition system that allows children to control electronic devices can improve the child's speech development, by encouraging the child to speak more clearly. To do so, the speech recognition system can generate a phonetic…
Who is the assignee on this patent?
Google Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/02. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 2, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).