Household agent learning

US9786281B1 · US · B1

Patent metadata
Field: Value
Publication number: US-9786281-B1
Application number: US-201213565725-A
Country: US
Kind code: B1
Filing date: Aug 2, 2012
Priority date: Aug 2, 2012
Publication date: Oct 10, 2017
Grant date: Oct 10, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A user profile for a plurality of users may be built for speech recognition purposes and for acting as an agent of the user. In some embodiments, a speech processing device automatically receives an utterance from a user. The utterance may be analyzed using signal processing to identify data associated with the user. The utterance may also be analyzed using speech recognition to identify additional data associated with the user. The identified data may be stored in a profile of the user. Data in the user profile may be used to select an acoustic model and/or a language model for speech recognition or to take actions on behalf of the user.
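
The flow described in the abstract (identify data about a speaker, store it in that speaker's profile, then use the profile to pick an acoustic model and a language model) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `UserProfile`, `ProfileStore`, `select_models`, the model-name strings, and the choice of accent and first hobby as selection keys are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Hypothetical per-user record; field names are illustrative only."""
    user_id: str
    acoustic_info: dict = field(default_factory=dict)  # e.g. accent, age band
    language_info: dict = field(default_factory=dict)  # e.g. name, hobbies


class ProfileStore:
    """In-memory stand-in for the 'electronic data store' of user profiles."""

    def __init__(self):
        self._profiles = {}

    def update(self, user_id, acoustic_info, language_info):
        # Create the profile on first sight of a user, then merge new data in.
        profile = self._profiles.setdefault(user_id, UserProfile(user_id))
        profile.acoustic_info.update(acoustic_info)
        profile.language_info.update(language_info)
        return profile


def select_models(profile,
                  default_acoustic="am-generic",
                  default_language="lm-generic"):
    """Pick model names from profile data, falling back to generic models.

    Assumption: models are keyed by accent and by the first listed hobby.
    A real system could use any profile attribute.
    """
    accent = profile.acoustic_info.get("accent")
    acoustic_model = f"am-{accent}" if accent else default_acoustic
    hobbies = profile.language_info.get("hobbies", [])
    language_model = f"lm-{hobbies[0]}" if hobbies else default_language
    return acoustic_model, language_model


store = ProfileStore()
# Data from a first utterance: signal processing yields the accent,
# speech recognition on the transcript yields the hobby.
store.update("alice", {"accent": "scottish"}, {})
alice = store.update("alice", {}, {"hobbies": ["sailing"]})
print(select_models(alice))
```

Keeping acoustic and language information in separate dictionaries mirrors the abstract's distinction between data found by signal processing and data found by speech recognition.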

First claim

Opening claim text (preview).

The invention claimed is:

1. A device comprising: a profile building component in communication with an electronic data store; a speech recognition component; and a sensor configured to detect movement of a user independent of a direction of the user's gaze and without detecting physical contact between the user and the device; wherein the profile building component is configured to: receive, from the sensor, an indication that presence of the user was detected; begin listening for utterances from the user in response to receiving the indication; detect a first voice signal corresponding to a first utterance of the user; determine an identity of the user using the first voice signal; process the first voice signal to determine acoustic information about the user, wherein the acoustic information comprises at least one of an age, a gender, an accent type, a native language, or a type of speech pattern of the user; perform speech recognition on the first voice signal to obtain a transcript; process the transcript to determine language information relating to the user, wherein the language information comprises at least one of a name, hobbies, habits, or preferences of the user; store, in a user profile associated with the identity of the user, the acoustic information and the language information; determine acoustic model information using at least one of the first voice signal, the acoustic information, or the language information; and determine language model information using at least one of the transcript, the acoustic information, or the language information; and wherein the speech recognition component is configured to: receive a second voice signal corresponding to a second utterance of the user; determine the identity of the user using the second voice signal; perform speech recognition on the second voice signal using at least one of the acoustic model information or the language model information to obtain a word sequence that indicates that a third utterance corresponding to a language characteristic will be uttered by a second user different than the user at a time after a current time; and select a second user acoustic model corresponding to the language characteristic for performing speech recognition at the time after the current time.

2. The device of claim 1, wherein speech recognition component is further configured to: determine event information from the speech recognition results, wherein the event information includes date information; and provide a reminder to the user concerning the event using the date information.

3. The device of claim 1, wherein the profile building component is further configured to: receive a third voice signal corresponding to a fourth utterance of a second user; determine an identity of the second user using the third voice signal; process the third voice signal to determine second acoustic information and second language information; and store, in a second user profile associated with the identity of the second user, the second acoustic information and the second language information.

4. A device comprising: a profile building component in communication with an electronic data store; a sensor configured to detect presence of a user independent of a direction of the user's gaze and without detecting physical contact between the user and the device; and a speech recognition component; wherein the profile building component is configured to: receive, from the sensor, an indication that presence of the user was detected; begin to listen for utterances from the user in response to receiving the indication; receive a first voice signal corresponding to a first utterance of a user; determine an identity of the user using the first voice signal; process the first voice signal to determine user information and a word sequence that indicates that a second utterance corresponding to a language characteristic is likely to be uttered by a second user different than the user at a time after a current time; store the user information in a user profile associated with the identity of the user; and select a second user acoustic model corresponding to the language characteristic for performing speech recognition.

5. The device of claim 4, wherein at least one of the profile building component or the speech recognition component is further configured to process the first voice signal to determine user information by performing signal processing on the first voice signal to determine acoustic information.

6. The device of claim 5, wherein the acoustic information comprises at least one of an age, a gender, an accent type, a native language, or a type of speech pattern of the user.

7. The device of claim 4, wherein at least one of the profile building component or the speech recognition component is configured to process the first voice signal to determine user information by performing speech recognition on the first voice signal and obtain the user information from the speech recognition results.

8. The device of claim 7, wherein the user information comprises at least one of a profession, hobbies, habits, preferences, a temporary condition, health, a schedule, an agenda, an itinerary, appointments, tastes, or plans of the user.

9. The device of claim 4, wherein at least one of the profile building component or the speech recognition component is further configured to: receive a second voice signal corresponding to a third utterance of a second user; determine an identity of the second user using the second voice signal; process the second voice signal to determine second user information; and store the second user information in a second user profile associated with the identity of the second user.

10. The device of claim 4, wherein at least one of the profile building component or the speech recognition component is further configured to: perform speech recognition on the first voice signal to obtain speech recognition results; determine acoustic model information using at least one of the first voice signal and the user information; and determine language model information using at least one of the speech recognition results and the user information.

11. The device of claim 10, wherein at least one of the profile building component or the speech recognition component is further configured to: receive a second voice signal corresponding to a third utterance from the user; determine the identity of the user using the second voice signal; and perform speech recognition on the second voice signal using the acoustic model information and the language model information to obtain second speech recognition results.

12. The device of claim 4, wherein at least one of the profile building component or the speech recognition component is further configured to: perform speech recognition on the first voice signal to obtain speech recognition results; take an action using the speech recognition results.

13. The device of claim 4, wherein at least one of the profile building component or the speech recognition component is further configured to: perform speech recognition on the first voice signal to obtain speech recognition results; determine event information from the speech recognition results, wherein the event information includes date information; and provide a reminder to the user concerning the event using the date information.

14. A non-transitory computer-readable medium comprising one or more modules configured to execute in one or more processors of a computing device, the one or more modules being further configured
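
Claims 2 and 13 describe determining event information, including a date, from speech recognition results and then providing a reminder. The sketch below shows one hedged way this could look on a plain-text transcript. The regular expression, the `extract_event` and `reminder_for` helpers, the "one day before" lead time, and the sample sentence are all assumptions made for illustration; the claims do not specify how events are detected or when reminders fire.

```python
import re
from datetime import date, timedelta

MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Hypothetical pattern: "<event> [is] on <Month> <day>".
EVENT_PATTERN = re.compile(
    r"(?P<event>[\w' ]+?) (?:is )?on (?P<month>[A-Za-z]+) (?P<day>\d{1,2})",
    re.IGNORECASE,
)


def extract_event(transcript, year):
    """Return (event_name, event_date), or None if no dated event is found."""
    match = EVENT_PATTERN.search(transcript)
    if match is None:
        return None
    month = MONTHS.get(match.group("month").lower())
    if month is None:
        return None
    try:
        when = date(year, month, int(match.group("day")))
    except ValueError:  # e.g. "February 31"
        return None
    return match.group("event").strip(), when


def reminder_for(event, days_before=1):
    """Return (reminder_date, message) for an extracted event."""
    name, when = event
    message = f"Reminder: {name} on {when.isoformat()}"
    return when - timedelta(days=days_before), message


event = extract_event("my dentist appointment is on March 14", year=2013)
print(reminder_for(event))
```

A production system would more likely rely on a trained entity extractor than on a single pattern. The regular expression only keeps the example self-contained.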

Classifications

  • Physics · mapped topic

  • G10L15/26 · Primary

    Speech to text systems (G10L15/08 takes precedence) · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Speaker identification or verification techniques · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9786281B1 cover?
A user profile for a plurality of users may be built for speech recognition purposes and for acting as an agent of the user. In some embodiments, a speech processing device automatically receives an utterance from a user. The utterance may be analyzed using signal processing to identify data associated with the user. The utterance may also be analyzed using speech recognition to identify additi…
Who is the assignee on this patent?
Adams Jeffrey P, Salvador Stan W, Kneser Reinhard, and 1 more
What technology area does this patent fall under?
Primary CPC classification G10L15/26. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 10, 2017 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).