Voice command execution from auxiliary input
US-2020202849-A1 · Jun 25, 2020 · US
US11429186B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11429186-B2 |
| Application number | US-202016951940-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 18, 2020 |
| Priority date | Nov 18, 2020 |
| Publication date | Aug 30, 2022 |
| Grant date | Aug 30, 2022 |
Official abstract text for this publication.
One example provides a computing device comprising instructions executable to receive information regarding one or more entities in a scene, to receive a plurality of eye tracking samples, each eye tracking sample corresponding to a gaze direction of a user, and, based at least on the eye tracking samples, determine a time-dependent attention value for each entity of the one or more entities at different locations in a use environment, the time-dependent attention value determined using a leaky integrator. The instructions are further executable to receive a user input indicating an intent to perform a location-dependent action, associate the user input with a selected entity based at least upon the time-dependent attention value for each entity, and perform the location-dependent action based at least upon a location of the selected entity.
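The abstract names a leaky integrator but gives no formula or parameters. The following is a minimal sketch of one common form, a first-order leaky integrator, and is not taken from the patent: the class name `LeakyAttention`, the time constant `tau`, and the 0-to-1 value range are all assumptions made for illustration. The attention value of the gazed-at entity rises toward 1, while every other entity's value decays exponentially toward 0.

```python
import math
from dataclasses import dataclass, field


@dataclass
class LeakyAttention:
    """Hypothetical per-entity attention tracker (illustrative only)."""

    tau: float = 1.0  # assumed decay time constant, in seconds
    values: dict = field(default_factory=dict)

    def update(self, gazed_entity, entities, dt):
        """Advance every entity's attention value by one eye tracking sample.

        gazed_entity: the entity the current gaze direction hits, or None.
        entities: all entities currently known in the scene.
        dt: time elapsed since the previous sample, in seconds.
        """
        decay = math.exp(-dt / self.tau)
        for entity in entities:
            # Input is 1 while the user looks at the entity, 0 otherwise.
            target = 1.0 if entity == gazed_entity else 0.0
            previous = self.values.get(entity, 0.0)
            # Exact discrete step of dv/dt = (target - v) / tau.
            self.values[entity] = target + (previous - target) * decay
        return dict(self.values)


# Usage: look at the lamp for one second, then at the table for one second.
tracker = LeakyAttention(tau=1.0)
scene = ["lamp", "table"]
for _ in range(10):
    tracker.update("lamp", scene, dt=0.1)
for _ in range(10):
    tracker.update("table", scene, dt=0.1)
# The lamp's value has decayed but is still above zero, so a command issued
# shortly after the user looked away could still be associated with it.
```

A larger `tau` keeps an entity eligible for longer after the user looks away; a smaller one makes selection follow the current gaze more closely.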
Opening claim text (preview).
The invention claimed is:
1. A computing device comprising: a logic subsystem; and a storage subsystem holding instructions executable by the logic machine to: receive information regarding one or more entities in a scene, receive, via an eye-tracking sensor, a plurality of eye tracking samples, each eye tracking sample corresponding to a gaze direction of a user, based at least on the eye tracking samples, determine a time-dependent attention value for each entity of the one or more entities, each time-dependent attention value decaying over a period of time when a user looks away from a corresponding entity, receive a user input indicating an intent to perform a location-dependent action, and associate the user input with a selected entity based at least upon the time-dependent attention value for the selected entity.
2. The computing device of claim 1, wherein the computing device comprises a head-mounted computing device comprising one or more of the eye-tracking sensor, a depth camera, and a microphone.
3. The computing device of claim 1, wherein the instructions are executable to receive information on the one or more entities by receiving, from a depth camera, a depth image of a scene, and based on the depth image, identifying one or more entities in the scene.
4. The computing device of claim 1, wherein the instructions are executable to determine the time-dependent attention value for each entity of the one or more entities using a leaky integrator.
5. The computing device of claim 1, wherein instructions executable to associate the user input to the selected entity comprises instructions executable to: input, to a trained machine-learning model, the user input, the one or more entities, and the time-dependent attention values, receive, from the trained machine-learning model, a likelihood for the association of the user input with the selected entity, and associate the user input with the selected entity based at least on the likelihood.
6. The computing device of claim 1, wherein the instructions are executable to assign a timestamp for the user input and associate the user input with the selected entity by comparing the timestamp for the user input to a timestamp for the time-dependent attention value for the selected entity.
7. The computing device of claim 6, wherein the selected entity is a first selected entity, wherein the user input comprises a first location-dependent term and a second location-dependent term, and wherein the instructions are executable to associate the second location-dependent term with a second selected entity based upon a time-dependent attention value for the second selected entity.
8. The computing device of claim 6, wherein the instructions are further executable to store, for each entity, a plurality of time-dependent attention values, each time-dependent attention value for the entity corresponding to a different timestamp.
9. The computing device of claim 1, wherein the selected entity comprises a real-world object or a virtual object.
10. The computing device of claim 1, wherein the location-dependent action comprises placing a virtual object.
11. The computing device of claim 1, wherein the selected entity comprises a virtual object representing an application, and the location-dependent action comprises controlling the application.
12. On a computing device, a method comprising: receiving a depth image of a scene; based on the depth image, identifying one or more entities in the scene; receiving a plurality of eye tracking samples, each eye tracking sample corresponding to a gaze direction of a user; based at least on the eye tracking samples, determining time-dependent attention value for each entity of the one or more entities, the time-dependent attention values determined using a leaky integrator such that each time-dependent attention value decays over a period of time when a user looks away from a corresponding entity; receiving a user input indicating an intent to perform a location-dependent action; associating the user input with a selected entity of the one or more entities based upon the time-dependent attention value for each entity; and performing the location-dependent action based at least upon the selected entity.
13. The method of claim 12, wherein the user input comprises one or more of a gesture input and a speech input.
14. The method of claim 12, wherein associating the user input to the selected entity comprises: inputting, to a trained machine-learning model, the user input, the one or more entities, and the time-dependent attention values, receiving, from the trained machine-learning model, a likelihood for the association of the user input with the selected entity, and associating the user input with the selected entity based at least on the likelihood.
15. The method of claim 12, wherein the selected entity comprises one or more of a real-world object and a virtual object.
16. The method of claim 12, wherein the selected entity comprises a virtual object, and wherein the location-dependent action comprises one or more of moving the virtual object and controlling an application represented by the virtual object.
17. The method of claim 12, wherein the depth image is received from a depth camera remote to the computing system.
18. The method of claim 12, wherein associating the user input to the selected entity comprises comparing time-dependent attention values for the one or more entities to a timestamp of the user input.
19. A computing device comprising: a logic machine; and a storage subsystem holding instructions executable by the logic machine to receive a plurality of eye tracking samples, each eye tracking sample corresponding to a gaze direction of a user, based at least on the eye tracking samples, determine time-dependent attention values for each entity of one or more entities at different locations in a use environment, the time-dependent attention values determined using a leaky integrator such that each time-dependent attention value decays over a period of time when a user looks away from a corresponding entity, determine a selected entity of the one or more entities based upon the time-dependent attention values for each entity, and determine a location for placing the virtual object based upon the selected entity.
20. The computing device of claim 19, wherein the instructions are further executable to receive image data capturing the use environment, and identify the one or more entities based upon the image data.
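Claims 6 to 8 and 18 describe storing timestamped attention values and comparing them against the timestamp of a user input, including inputs with two location-dependent terms. The sketch below shows one way this could work and is an assumption throughout, not the claimed implementation: the `AttentionHistory` class, its method names, the "latest snapshot at or before the input" lookup, and the highest-value selection rule are all hypothetical.

```python
from bisect import bisect_right


class AttentionHistory:
    """Hypothetical store of timestamped per-entity attention values."""

    def __init__(self):
        self._times = []      # snapshot timestamps, assumed to be increasing
        self._snapshots = []  # one {entity: attention value} dict per timestamp

    def record(self, timestamp, values):
        """Store a copy of the attention values observed at `timestamp`."""
        self._times.append(timestamp)
        self._snapshots.append(dict(values))

    def select_entity(self, timestamp, exclude=()):
        """Return the entity with the highest attention value in the latest
        snapshot at or before `timestamp`, or None if nothing qualifies."""
        index = bisect_right(self._times, timestamp) - 1
        if index < 0:
            return None
        candidates = {
            entity: value
            for entity, value in self._snapshots[index].items()
            if entity not in exclude
        }
        if not candidates:
            return None
        return max(candidates, key=candidates.get)


# Usage: a command such as "put that there", where each location-dependent
# term carries its own (assumed) timestamp from the speech recognizer.
history = AttentionHistory()
history.record(0.0, {"cup": 0.9, "shelf": 0.1})
history.record(1.0, {"cup": 0.4, "shelf": 0.8})

first = history.select_entity(0.2)                     # "that" -> "cup"
second = history.select_entity(1.1, exclude={first})   # "there" -> "shelf"
```

Excluding the first selected entity when resolving the second term is a design choice of this sketch; the claims only require that the second term be associated with a second selected entity based on its attention value.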
Inference or reasoning models · CPC title
Machine learning · CPC title
Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry · CPC title
Eye tracking input arrangements (G06F3/015 takes precedence) · CPC title
Gesture based interaction, e.g. based on a set of recognized hand gestures (interaction based on gestures traced on a digitiser G06F3/04883) · CPC title