Location-based entity selection using gaze tracking

US11429186B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11429186-B2
Application number  US-202016951940-A
Country             US
Kind code           B2
Filing date         Nov 18, 2020
Priority date       Nov 18, 2020
Publication date    Aug 30, 2022
Grant date          Aug 30, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One example provides a computing device comprising instructions executable to receive information regarding one or more entities in a scene, to receive a plurality of eye tracking samples, each eye tracking sample corresponding to a gaze direction of a user and, based at least on the eye tracking samples, determine a time-dependent attention value for each entity of the one or more entities at different locations in a use environment, the time-dependent attention value determined using a leaky integrator. The instructions are further executable to receive a user input indicating an intent to perform a location-dependent action, associate the user input with a selected entity based at least upon the time-dependent attention value for each entity, and perform the location-dependent action based at least upon a location of the selected entity.
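The abstract names a leaky integrator but this page does not give its formula. The following is a minimal sketch of one common form, assuming a first-order model dA/dt = (drive - A) / tau, where drive is 1 while the gaze sample lands on the entity and 0 otherwise. The class name `LeakyAttention`, the `time_constant_s` parameter, and the entity labels are hypothetical and do not come from the patent.

```python
import math
from dataclasses import dataclass, field


@dataclass
class LeakyAttention:
    """Per-entity attention values driven by gaze samples (illustrative only)."""

    time_constant_s: float = 1.0  # assumed decay constant, not from the patent
    values: dict = field(default_factory=dict)

    def update(self, gazed_entity, entities, dt):
        """Advance every entity by dt seconds.

        gazed_entity is the entity hit by the current gaze sample, or None.
        """
        decay = math.exp(-dt / self.time_constant_s)
        for entity in entities:
            previous = self.values.get(entity, 0.0)
            drive = 1.0 if entity == gazed_entity else 0.0
            # Exact discretization of dA/dt = (drive - A) / tau over one step.
            self.values[entity] = previous * decay + drive * (1.0 - decay)
        return self.values


# Usage: one second of 60 Hz samples on the lamp, then half a second on the table.
attention = LeakyAttention(time_constant_s=1.0)
entities = ["lamp", "table"]
for _ in range(60):
    attention.update("lamp", entities, 1 / 60)
for _ in range(30):
    attention.update("table", entities, 1 / 60)
print(attention.values)
```

With this form, the value for an entity rises toward 1 while the user looks at it and decays toward 0 after the user looks away, which matches the decay behavior described in the claims. The time constant controls how long a recently viewed entity stays a plausible target.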

First claim

Opening claim text (preview).

The invention claimed is:

1. A computing device comprising: a logic subsystem; and a storage subsystem holding instructions executable by the logic machine to: receive information regarding one or more entities in a scene, receive, via an eye-tracking sensor, a plurality of eye tracking samples, each eye tracking sample corresponding to a gaze direction of a user, based at least on the eye tracking samples, determine a time-dependent attention value for each entity of the one or more entities, each time-dependent attention value decaying over a period of time when a user looks away from a corresponding entity, receive a user input indicating an intent to perform a location-dependent action, and associate the user input with a selected entity based at least upon the time-dependent attention value for the selected entity.

2. The computing device of claim 1, wherein the computing device comprises a head-mounted computing device comprising one or more of the eye-tracking sensor, a depth camera, and a microphone.

3. The computing device of claim 1, wherein the instructions are executable to receive information on the one or more entities by receiving, from a depth camera, a depth image of a scene, and based on the depth image, identifying one or more entities in the scene.

4. The computing device of claim 1, wherein the instructions are executable to determine the time-dependent attention value for each entity of the one or more entities using a leaky integrator.

5. The computing device of claim 1, wherein instructions executable to associate the user input to the selected entity comprises instructions executable to: input, to a trained machine-learning model, the user input, the one or more entities, and the time-dependent attention values, receive, from the trained machine-learning model, a likelihood for the association of the user input with the selected entity, and associate the user input with the selected entity based at least on the likelihood.

6. The computing device of claim 1, wherein the instructions are executable to assign a timestamp for the user input and associate the user input with the selected entity by comparing the timestamp for the user input to a timestamp for the time-dependent attention value for the selected entity.

7. The computing device of claim 6, wherein the selected entity is a first selected entity, wherein the user input comprises a first location-dependent term and a second location-dependent term, and wherein the instructions are executable to associate the second location-dependent term with a second selected entity based upon a time-dependent attention value for the second selected entity.

8. The computing device of claim 6, wherein the instructions are further executable to store, for each entity, a plurality of time-dependent attention values, each time-dependent attention value for the entity corresponding to a different timestamp.

9. The computing device of claim 1, wherein the selected entity comprises a real-world object or a virtual object.

10. The computing device of claim 1, wherein the location-dependent action comprises placing a virtual object.

11. The computing device of claim 1, wherein the selected entity comprises a virtual object representing an application, and the location-dependent action comprises controlling the application.

12. On a computing device, a method comprising: receiving a depth image of a scene; based on the depth image, identifying one or more entities in the scene; receiving a plurality of eye tracking samples, each eye tracking sample corresponding to a gaze direction of a user; based at least on the eye tracking samples, determining time-dependent attention value for each entity of the one or more entities, the time-dependent attention values determined using a leaky integrator such that each time-dependent attention value decays over a period of time when a user looks away from a corresponding entity; receiving a user input indicating an intent to perform a location-dependent action; associating the user input with a selected entity of the one or more entities based upon the time-dependent attention value for each entity; and performing the location-dependent action based at least upon the selected entity.

13. The method of claim 12, wherein the user input comprises one or more of a gesture input and a speech input.

14. The method of claim 12, wherein associating the user input to the selected entity comprises: inputting, to a trained machine-learning model, the user input, the one or more entities, and the time-dependent attention values, receiving, from the trained machine-learning model, a likelihood for the association of the user input with the selected entity, and associating the user input with the selected entity based at least on the likelihood.

15. The method of claim 12, wherein the selected entity comprises one or more of a real-world object and a virtual object.

16. The method of claim 12, wherein the selected entity comprises a virtual object, and wherein the location-dependent action comprises one or more of moving the virtual object and controlling an application represented by the virtual object.

17. The method of claim 12, wherein the depth image is received from a depth camera remote to the computing system.

18. The method of claim 12, wherein associating the user input to the selected entity comprises comparing time-dependent attention values for the one or more entities to a timestamp of the user input.

19. A computing device comprising: a logic machine; and a storage subsystem holding instructions executable by the logic machine to receive a plurality of eye tracking samples, each eye tracking sample corresponding to a gaze direction of a user, based at least on the eye tracking samples, determine time-dependent attention values for each entity of one or more entities at different locations in a use environment, the time-dependent attention values determined using a leaky integrator such that each time-dependent attention value decays over a period of time when a user looks away from a corresponding entity, determine a selected entity of the one or more entities based upon the time-dependent attention values for each entity, and determine a location for placing the virtual object based upon the selected entity.

20. The computing device of claim 19, wherein the instructions are further executable to receive image data capturing the use environment, and identify the one or more entities based upon the image data.
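Claims 6 to 8 describe storing timestamped attention values per entity and comparing them with the timestamp of the user input, including inputs with two location-dependent terms. The sketch below is one hedged way to illustrate that idea. It assumes each entity keeps a sorted history of (timestamp, value) pairs and that the entity with the highest stored value at or before the term's timestamp is selected. The function names, the sample data, and the "put that there" utterance are hypothetical, and the patent may use a different comparison.

```python
from bisect import bisect_right


def attention_at(history, timestamp):
    """Return the latest stored value at or before timestamp, or 0.0 if none.

    history is a list of (timestamp, value) pairs sorted by timestamp.
    """
    index = bisect_right(history, (timestamp, float("inf")))
    return history[index - 1][1] if index else 0.0


def select_entity(histories, timestamp):
    """Pick the entity whose attention value is highest at the given time."""
    return max(histories, key=lambda name: attention_at(histories[name], timestamp))


# Hypothetical histories: the user looked at the lamp first, then at the table.
histories = {
    "lamp": [(0.0, 0.1), (1.0, 0.8), (2.0, 0.3)],
    "table": [(0.0, 0.0), (1.0, 0.1), (2.0, 0.7)],
}

# Hypothetical timestamps for the two terms in "put that there".
term_times = {"that": 1.1, "there": 2.2}
selection = {term: select_entity(histories, t) for term, t in term_times.items()}
print(selection)
```

Keeping a history rather than only the current value lets each term in a spoken command resolve against where the user was looking when that term was uttered, even if recognition completes later.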

Assignees

Inventors

Classifications

  • Inference or reasoning models

  • Machine learning

  • Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry

  • G06F3/013 (Primary)

    Eye tracking input arrangements (G06F3/015 takes precedence)

  • Gesture based interaction, e.g. based on a set of recognized hand gestures (interaction based on gestures traced on a digitiser G06F3/04883)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11429186B2 cover?
One example provides a computing device comprising instructions executable to receive information regarding one or more entities in a scene, to receive a plurality of eye tracking samples, each eye tracking sample corresponding to a gaze direction of a user and, based at least on the eye tracking samples, determine a time-dependent attention value for each entity of the one or mo…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F3/013. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 30, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).