Learning intended user actions

US2016239259A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016239259-A1
Application number	US-201514748296-A
Country	US
Kind code	A1
Filing date	Jun 24, 2015
Priority date	Feb 16, 2015
Publication date	Aug 18, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system are provided. The method includes receiving, by a microphone and camera, user utterances indicative of user commands and associated user gestures for the user utterances. The method further includes parsing, by a hardware-based recognizer, sample utterances and the user utterances into verb parts and noun parts. The method also includes recognizing, by a hardware-based recognizer, the user utterances and the associated user gestures based on the sample utterances and descriptions of associated supporting gestures for the sample utterances. The recognizing step includes comparing the verb parts and the noun parts from the user utterances individually and as pairs to the verb parts and the noun parts of the sample utterances. The method additionally includes selectively performing a given one of the user commands responsive to a recognition result.
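
The parse-and-compare step in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the toy vocabulary (`VERBS`, `NOUNS`), the `Parsed` type, the `parse` and `match_kind` helpers, and the command names are all hypothetical, and a real recognizer would presumably use a proper part-of-speech tagger and gesture input from the camera.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Parsed:
    verb: Optional[str]
    noun: Optional[str]


# Hypothetical toy vocabulary standing in for a real part-of-speech tagger.
VERBS = {"move", "delete", "open"}
NOUNS = {"file", "window", "picture"}


def parse(utterance: str) -> Parsed:
    """Split an utterance into a verb part and a noun part (first hit of each)."""
    words = utterance.lower().split()
    verb = next((w for w in words if w in VERBS), None)
    noun = next((w for w in words if w in NOUNS), None)
    return Parsed(verb, noun)


def match_kind(user: Parsed, sample: Parsed) -> str:
    """Compare the parts as a pair first, then individually."""
    verb_ok = user.verb is not None and user.verb == sample.verb
    noun_ok = user.noun is not None and user.noun == sample.noun
    if verb_ok and noun_ok:
        return "pair"
    if verb_ok:
        return "verb"
    if noun_ok:
        return "noun"
    return "none"


# Sample utterances keyed by the (hypothetical) command they stand for.
samples = {
    "move_content": parse("move the picture"),
    "delete_content": parse("delete the file"),
}

user = parse("please move that picture over there")
results = {cmd: match_kind(user, s) for cmd, s in samples.items()}
print(results)  # {'move_content': 'pair', 'delete_content': 'none'}
```

Returning the kind of match rather than a bare boolean keeps the individual and pairwise comparisons distinguishable, which is what a later step would need in order to decide whether a partial match is enough to act on.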

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising receiving, by a microphone and camera, user utterances indicative of user commands and associated user gestures for the user utterances; parsing, by a hardware-based recognizer, sample utterances and the user utterances into verb parts and noun parts; recognizing, by a hardware-based recognizer, the user utterances and the associated user gestures based on the sample utterances and descriptions of associated supporting gestures for the sample utterances, wherein said recognizing step comprises comparing the verb parts and the noun parts from the user utterances individually and as pairs to the verb parts and the noun parts of the sample utterances; and selectively performing a given one of the user commands responsive to a recognition result.

2. The method of claim 1, wherein said recognizing step comprises forming triples of a verb, a noun, and a gesture from the user utterances of the user commands and the associated user gestures for the user utterances.

3. The method of claim 2, wherein said recognizing step comprises: at least one of, comparing at least one of the verb and the noun in a triple to at least one of a verb and a noun from one or more of the sample utterances, and comparing at least one synonym of at least one of the verb and the noun from the one or more of the sample utterances; and determining whether the gesture in the triple fits a description of a corresponding one or more of the associated supporting gestures.

4. The method of claim 2, wherein said recognizing step compares the verb and the noun to the gesture as a pair and individually.

5. The method of claim 4, wherein the given one of the user commands is selectively performed in an absence of one of the verb or the noun corresponding thereto, responsive to a match between an existing one of the verb or the noun and a lack of contrary intent evidence that the existing one of the verb or the noun is unrelated to the gesture.

6. The method of claim 1, further comprising: learning from multiple recognition sessions by acquiring user accepted examples and user rejected examples of the user utterances and the associated user gestures; and selectively performing a given one of the user commands responsive to the user accepted examples and the user rejected examples.

7. The method of claim 6, further comprising generating respective confidence values for at least one of the noun, the verb, the gesture, and a combination thereof including at least the gesture, responsive to at least one of a number of user accepted examples and a number of user rejected examples involving the gesture and at least one of the noun and the verb for a particular one of the user commands.

8. The method of claim 7, wherein said recognizing step comprises recognizing multiple possible intended actions, and the method further comprises arbitrating between the possible intended actions based on the respective confidence values corresponding thereto.

9. The method of claim 6, further comprising generating respective error values for at least one of the noun, the verb, the gesture, and a combination thereof including at least the gesture, responsive to at least one of a number of user accepted examples and a number of user rejected examples involving the gesture and at least one of the noun and the verb for a particular one of the user commands.

10. The method of claim 6, wherein said learning step comprises acquiring at least one of user spoken words and user performed gestures potentially applicable to one or more of the user commands, for storing in a memory device as at least one of new sample utterances and new descriptions of associated sample gestures for the new sample utterances.

11. The method of claim 6, wherein said learning step: acquires a user accepted example of at least one particular user utterance and at least one particular associated user gesture responsive to the user allowing a particular one of the user commands, represented by the at least one particular user utterance and the at least one particular associated user gesture, to be ultimately performed; and acquires a user rejected example of the at least one particular user utterance and the at least one particular associated user gesture responsive to the user preventing or undoing the particular one of the user commands represented by the at least one particular user utterance and the at least one particular associated user gesture.

12. The method of claim 6, wherein said learning step comprises generating statistical data to inform subsequent trials based on whether the user allows the given one of the user commands to proceed or intends to undo the given one of the user commands.

13. The method of claim 6, wherein said learning step comprises learning one or more ways in which the user expresses an intention to perform a particular one of the user commands using a combination of user gestures and deixis.

14. The method of claim 1, wherein the user commands comprise a command for moving content from a first location to a second location in a virtual environment.
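
As an illustration of claims 2 and 6 to 8, the sketch below keeps accepted and rejected counts per (command, triple) and arbitrates between candidate commands by confidence. The claims do not state how a confidence value is computed; the Laplace-smoothed acceptance ratio used here is an assumption, and `FeedbackLearner`, its methods, and the command names are hypothetical.

```python
from collections import defaultdict


class FeedbackLearner:
    """Tracks user-accepted and user-rejected examples per (command, triple)."""

    def __init__(self):
        # counts[(command, triple)] = [accepted, rejected]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, command, triple, accepted):
        """Store one example: accepted if the user let the command stand,
        rejected if the user prevented or undid it."""
        self.counts[(command, triple)][0 if accepted else 1] += 1

    def confidence(self, command, triple):
        """Assumed formula: Laplace-smoothed acceptance ratio.
        Unseen combinations score 0.5."""
        acc, rej = self.counts.get((command, triple), (0, 0))
        return (acc + 1) / (acc + rej + 2)

    def arbitrate(self, candidates, triple):
        """Pick the candidate command with the highest confidence."""
        return max(candidates, key=lambda c: self.confidence(c, triple))


learner = FeedbackLearner()
triple = ("move", "picture", "point")  # (verb, noun, gesture)

# Three sessions in which the user let "move_content" proceed.
for _ in range(3):
    learner.record("move_content", triple, accepted=True)

# One accepted and one undone "copy_content" for the same triple.
learner.record("copy_content", triple, accepted=True)
learner.record("copy_content", triple, accepted=False)

best = learner.arbitrate(["move_content", "copy_content"], triple)
print(best, learner.confidence("move_content", triple))  # move_content 0.8
```

The smoothing keeps a single rejection from driving a rarely seen combination to zero confidence; any other monotone function of the two counts would fit the claim language equally well.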

Assignees

Inventors

Classifications

G10L15/22
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G06F3/167 (Primary)
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
G10L15/24
Speech recognition using non-acoustical features · CPC title
G06F3/017
Gesture based interaction, e.g. based on a set of recognized hand gestures (interaction based on gestures traced on a digitiser G06F3/04883) · CPC title
G10L15/1822
Parsing for meaning understanding · CPC title

Patent family

Related publications grouped by family.

Patent family 56621154

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016239259A1 cover?: A method and system are provided. The method includes receiving, by a microphone and camera, user utterances indicative of user commands and associated user gestures for the user utterances. The method further includes parsing, by a hardware-based recognizer, sample utterances and the user utterances into verb parts and noun parts. The method also includes recognizing, by a hardware-based recog…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G06F3/167. Mapped technology areas include Physics.
When was this patent published?: Publication date Aug 18, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Digital assistant voice input integration

Speech recognition candidate selection based on non-acoustic input

Personalized content tagging
