US11567633B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11567633-B2 |
| Application number | US-202117170696-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 8, 2021 |
| Priority date | Feb 8, 2021 |
| Publication date | Jan 31, 2023 |
| Grant date | Jan 31, 2023 |
Official abstract text for this publication.
A computer-implemented method for determining focus of a user is provided. User input is received. An intention image of a scene including a plurality of interactive objects is generated. The intention image includes pixels encoded with intention values determined based on the user input. An intention value indicates a likelihood that the user intends to focus on the pixel. An intention score is determined for each interactive object based on the intention values of pixels that correspond to the interactive object. An interactive object of the plurality of interactive objects is determined to be a focused object that has the user's focus based on the intention scores of the plurality of interactive objects.
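The scoring step in the abstract can be illustrated with a minimal sketch. It assumes the intention image is stored as two parallel 2D grids, one holding intention values and one holding interactive object identifiers, and that identifier 0 marks background pixels. The function names, the `BACKGROUND` constant, and the sample values are hypothetical and are not taken from the patent. The score is a plain sum over each object's pixels, and the highest-scoring object is treated as the focused object.

```python
from collections import defaultdict

BACKGROUND = 0  # assumption: identifier 0 means "no interactive object here"


def score_objects(intention_values, object_ids):
    """Sum the intention values of the pixels belonging to each object."""
    scores = defaultdict(float)
    for value_row, id_row in zip(intention_values, object_ids):
        for value, obj_id in zip(value_row, id_row):
            if obj_id != BACKGROUND:
                scores[obj_id] += value
    return dict(scores)


def pick_focused(scores):
    """Return the identifier with the highest score, or None if no object was scored."""
    return max(scores, key=scores.get) if scores else None


# Hypothetical 4x4 intention image: object 1 is top-left, object 2 fills the right half.
values = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.6, 0.1, 0.0],
    [0.0, 0.0, 0.2, 0.1],
    [0.0, 0.0, 0.1, 0.0],
]
ids = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [0, 0, 2, 2],
    [0, 0, 2, 2],
]

scores = score_objects(values, ids)
focused = pick_focused(scores)
```

In this made-up example the user input concentrates on the top-left region, so object 1 accumulates the larger sum even though object 2 covers more pixels.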
Opening claim text (preview).
The invention claimed is:
1. A computer-implemented method for determining focus of a user, the method comprising: visually presenting, via a display, a presentation image of a scene including a plurality of interactive objects; receiving user input from a plurality of user input modalities while the presentation image is being visually presented, via the display; generating an intention image of the scene including the plurality of interactive objects, the intention image including pixels that are encoded with intention values determined from the user input received from the plurality of user input modalities based on intention attributes of the plurality of interactive objects, wherein intention attributes for an interactive object define rules for calculating the intention values of pixels corresponding to the interactive object based on the user input from the plurality of user input modalities, wherein different interaction objects have different intention attributes that define different rules for how an intention value is calculated based on the user input, such that different rules produce different intention values for a same pixel based on the user input, and wherein an intention value indicates a likelihood that the user intends to focus on the pixel while the presentation image is being visually presented via the display; determining an intention score for each interactive object based the intention values of pixels that correspond to the interactive object; determining that an interactive object of the plurality of interactive objects is a focused object that has the user's focus based on the intention scores of the plurality of interactive objects; based on determining that the interactive object is the focused object, visually presenting, via the display, an updated presentation image changed relative to the presentation image to indicate that the interactive object is the focused object.
2. The computer-implemented method of claim 1, wherein the plurality of user input modalities includes two or more of eye position and rotation; left-hand position and rotation; right-hand position and rotation; voice input; a position of a mouse cursor; a position of one or more touch points on a touch screen; a three degree of freedom position of a motion controller; and a six degree of freedom position and orientation of a motion controller.
3. The computer-implemented method of claim 1, further comprising normalizing the intention scores for the plurality of interactive objects according to object size.
4. The computer-implemented method of claim 1, further comprising smoothing the intention scores for the plurality of interactive objects based on a plurality of determined instances of the intention scores from a plurality of intention images.
5. The computer-implemented method of claim 1, wherein the intention score of each interactive object is determined by summing the intention values of pixels that correspond to the interactive object, and wherein an interactive object having a highest intention score of the intention scores of the plurality of interactive objects is determined to be the focused object that has the user's focus.
6. The computer-implemented method of claim 1, wherein a visual appearance of the interactive object is changed in the updated presentation image relative to the presentation image to indicate that the interactive object is the focused object.
7. The computer-implemented method of claim 1, further comprising, determining an ambiguity of focus between two or more interactive objects based on intention scores of at least the two or more interactive objects, and based on said determining, visually presenting, via the display, a disambiguation prompt to determine the user's intended focus target of the two or more interactive objects.
8. The computer-implemented method of claim 1, wherein the intention values of the pixels of the intention image are determined further based on contextual information including prior user interaction with an interactive object.
9. A computing system comprising: a logic processor; and a storage device holding instructions executable by the logic processor to: visually present, via a display, a presentation image of a scene including a plurality of interactive objects; receive user input from a plurality of user input modalities via user input componentry while the presentation image is being visually presented via the display; generate an intention image of the scene including the plurality of interactive objects, the intention image including pixels that are encoded with intention values determined from the user input received from the plurality of user input modalities based on intention attributes of the plurality of interactive objects, wherein intention attributes for an interactive object define rules for calculating the intention values of pixels corresponding to the interactive object based on the user input from the plurality of user input modalities, wherein different interaction objects have different intention attributes that define different rules for how an intention value is calculated based on the user input, such that different rules produce different intention values for a same pixel based on the user input, and wherein an intention value indicates a likelihood that the user intends to focus on the pixel while the presentation image is being visually presented, via the display; determine an intention score for each interactive object based on a sum of intention values of pixels that correspond to the interactive object; determine that an interactive object of the plurality of interactive objects is a focused object that has the user's focus based on the intention scores of the plurality of interactive objects; and based on determining that the interactive object is the focused object, visually present, via the display, an updated presentation image changed relative to the presentation image to indicate that the interactive object is the focused object.
10. The computing system of claim 9, further comprising: a plurality of intention shaders each associated with a different interactive object of the plurality of interactive objects, each of the plurality of intention shader being configured to determine intention values of pixels that correspond to an associated interactive object using intention attributes for the interactive object.
11. The computing system of claim 9, wherein the intention image includes a plurality of channels, and wherein each of the plurality of intention shaders is configured to, for a pixel of the intention image, 1) encode the intention value into a first channel, and 2) encode an interactive object identifier of an interactive object to which the pixel corresponds into a second channel.
12. The computing system of claim 9, wherein the plurality of user input modalities includes two or more of eye position and rotation; left-hand position and rotation; right-hand position and rotation; voice input; a position of a mouse cursor; a position of one or more touch points on a touch screen; a three degree of freedom position of a motion controller; and a six degree of freedom position and orientation of a motion controller.
13. The computing system of claim 9, wherein the storage device further holds instructions executable by the logic processor to normalize the intention scores for the plurality of interactive objects according to object size.
14. The computing system of claim 9, wherein the storage device further holds instructions executable by the logic processor to smooth the intention scores for the plurality of inte
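The dependent claims on normalization by object size, smoothing across intention images, and ambiguity detection do not specify formulas in this preview, so the following is only a sketch under stated assumptions: normalization divides each score by the object's pixel count, smoothing uses an exponential moving average, and two objects count as ambiguous when their scores fall within a relative margin of the best score. All function names, the `alpha` weight, and the `margin` threshold are hypothetical choices made for illustration.

```python
def normalize_by_size(scores, pixel_counts):
    """Divide each score by the object's pixel count (assumed meaning of 'object size')."""
    return {
        obj_id: score / pixel_counts[obj_id]
        for obj_id, score in scores.items()
        if pixel_counts.get(obj_id, 0) > 0
    }


def smooth(previous, current, alpha=0.5):
    """Blend scores from the latest intention image with earlier ones.

    Assumption: an exponential moving average; the claims only say 'smoothing'.
    """
    obj_ids = set(previous) | set(current)
    return {
        obj_id: alpha * current.get(obj_id, 0.0) + (1 - alpha) * previous.get(obj_id, 0.0)
        for obj_id in obj_ids
    }


def ambiguous_candidates(scores, margin=0.1):
    """Return objects whose scores are within `margin` (relative) of the best score.

    An empty list means there is a clear winner and no prompt is needed.
    """
    if not scores:
        return []
    best = max(scores.values())
    if best <= 0:
        return []
    close = [obj_id for obj_id, score in scores.items() if best - score <= margin * best]
    return sorted(close) if len(close) > 1 else []


# Hypothetical usage with made-up scores.
normalized = normalize_by_size({1: 3.0, 2: 0.6}, {1: 4, 2: 8})
smoothed = smooth({1: 0.0}, {1: 1.0}, alpha=0.5)
candidates = ambiguous_candidates({1: 1.0, 2: 0.95, 3: 0.2}, margin=0.1)
```

A non-empty result from `ambiguous_candidates` would be the point at which an application could show a disambiguation prompt listing the returned objects.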
Interaction with lists of selectable items, e.g. menus · CPC title
Interaction with a metaphor-based environment or interaction object displayed as three-dimensional [3D], e.g. changing the user viewpoint with respect to the environment or object · CPC title
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
Eye tracking input arrangements (G06F3/015 takes precedence) · CPC title
Gesture based interaction, e.g. based on a set of recognized hand gestures (interaction based on gestures traced on a digitiser G06F3/04883) · CPC title