Visual hierarchy design governed user interface modification via augmented reality
US-2021048938-A1 · Feb 18, 2021 · US
US11868523B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11868523-B2 |
| Application number | US-202117305219-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 1, 2021 |
| Priority date | Jul 1, 2021 |
| Publication date | Jan 9, 2024 |
| Grant date | Jan 9, 2024 |
Official abstract text for this publication.
Techniques of tracking a user's gaze include identifying a region of a display at which a gaze of a user is directed, the region including a plurality of pixels. By determining a region rather than a point, when the regions correspond to elements of a user interface, the improved technique enables a system to activate the element to which the determined region corresponds. In some implementations, the system makes the determination using a classification engine including a convolutional neural network; such an engine takes as input images of the user's eye and outputs a list of probabilities that the gaze is directed to each of the regions.
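The step from a list of per-region probabilities to an activated user interface element can be sketched as follows. This is a minimal illustration, not the patented implementation: the raw scores stand in for the output of the convolutional network, and the names `softmax`, `select_region`, `activate_for_gaze`, and the example element handlers are all hypothetical.

```python
import math
from typing import Callable, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw per-region scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_region(probabilities: Sequence[float]) -> int:
    """Return the index of the region with the highest probability."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


def activate_for_gaze(
    scores: Sequence[float],
    elements: Sequence[Callable[[], str]],
) -> str:
    """Pick the most likely region and call the UI element mapped to it.

    Assumption: element i of `elements` corresponds to region i.
    """
    if len(scores) != len(elements):
        raise ValueError("need exactly one score per region")
    region = select_region(softmax(scores))
    return elements[region]()


# Hypothetical four-region display; each handler stands in for a UI element.
example_elements = [
    lambda: "open_menu",
    lambda: "select_item",
    lambda: "go_back",
    lambda: "dismiss",
]
# Hypothetical scores, as if produced by a classifier for one eye image.
example_scores = [0.2, 2.5, -1.0, 0.7]
activated = activate_for_gaze(example_scores, example_elements)
```

Because the output is a classification over regions rather than a regressed gaze point, the selection step reduces to taking the most probable entry, which is the behavior recited in claims 3 and 4.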
Opening claim text (preview).
What is claimed is:
1. A method comprising: receiving image data representing at least one image of an eye of a user looking at a display at an instant of time, the display including a plurality of regions and being configured to operate in an augmented reality (AR) application, each of the plurality of regions including a plurality of pixels and corresponding to a respective element of a user interface; identifying, based on the image data, a region of the plurality of regions of the display at which a gaze of a user is directed at the instant of time, the identifying including inputting the at least one image of the eye of the user into a classification engine configured to classify the gaze as being directed to one of the plurality of regions; and activating an element of the user interface to which the identified region corresponds.
2. The method as in claim 1, wherein the classification engine includes a first branch representing a convolutional neural network (CNN).
3. The method as in claim 1, wherein the classification engine is configured to produce, as an output, a vector having a number of elements equal to a number of regions of the plurality of regions, each element of the vector including a number corresponding to a respective region of the plurality of regions, the number representing a likelihood of the gaze of the user being directed to the region to which the number corresponds.
4. The method as in claim 3, wherein the classification engine includes a softmax layer configured to produce, as an output of the classification engine, as the likelihood corresponding to each region of the plurality of regions, a probability between zero and unity, and wherein identifying the region further includes: selecting, as the identified region, a region of the plurality of regions having a probability greater than the probability of each of the other regions of the plurality of regions.
5. The method as in claim 3, wherein identifying the region further includes: generating image cluster data representing a set of image clusters corresponding the plurality of regions on the display, and wherein the classification engine includes a loss function based on distances from the set of image clusters.
6. The method as in claim 1, further comprising: training the classification engine, the training being based on a mapping between images of the eye of the user and a region identifier identifying a region of the plurality of regions at which the gaze of the user is directed.
7. The method as in claim 1, wherein the display is a transparent display embedded in smartglasses.
8. The method as in claim 7, wherein the classification engine further includes a second branch representing a neural network, and wherein the method further comprises: outputting, from the second branch and based on the image data, a pose of the eye of the user with respect to a camera mounted on the smartglasses.
9. The method as in claim 8, wherein the classification engine includes an attention layer, and wherein identifying the region further includes: causing the attention layer to adjust probabilities of the gaze being directed to the regions of the display based on the outputted pose of the eye.
10. The method as in claim 1, wherein the user is a first user, wherein the classification engine further includes a second branch representing a neural network, and wherein the method further comprises: inputting into the second branch a parameter value indicating a difference between the first user and a second user; and causing the second branch to adjust probabilities of the gaze being directed to the regions of the display based on the parameter value.
11. The method as in claim 1, wherein the user is a first user, wherein the classification engine further includes a second branch representing a neural network, and wherein the method further comprises: inputting into the second branch a parameter value indicating a geometrical configuration of the plurality of regions; and causing the second branch to adjust probabilities of the gaze being directed to the regions of the display based on the parameter value.
12. The method as in claim 1, wherein the user is a first user, wherein the classification engine further includes a second branch representing a neural network, and wherein the method further comprises: inputting into the second branch a parameter value indicating a temporal smoothness of the image data; and causing the second branch to adjust probabilities of the gaze being directed to the regions of the display based on the parameter value.
13. A computer program product comprising a nontransitory storage medium, the computer program product including instructions that, when executed by processing circuitry, causes the processing circuitry to perform a method, the method comprising: receiving image data representing at least one image of an eye of a user looking at a display at an instant of time, the display including a plurality of regions and being configured to operate in an augmented reality (AR) application, each of the plurality of regions including a plurality of pixels and corresponding to a respective element of a user interface; identifying, based on the image data, a region of the plurality of regions of the display at which a gaze of a user is directed at the instant of time, the identifying including inputting the at least one image of the eye of the user into a classification engine configured to classify the gaze as being directed to one of the plurality of regions; and activating an element of the user interface to which the identified region corresponds.
14. The computer program product as in claim 13, wherein the classification engine includes a first branch representing a convolutional neural network (CNN).
15. The computer program product as in claim 13, wherein the classification engine is configured to produce, as an output, a number corresponding to each of the plurality of regions, the number representing a likelihood of the gaze of the user being directed to the region to which the number corresponds.
16. The computer program product as in claim 13, wherein the method further comprises: training the classification engine, the training being based on a mapping between images of the eye of the user and a region identifier identifying a region of the plurality of regions at which the gaze of the user is directed.
17. The computer program product as in claim 13, wherein the display is a transparent display embedded in smartglasses.
18. The computer program product as in claim 17, wherein the classification engine further includes a second branch representing a neural network, and wherein the method further comprises: outputting, from the second branch and based on the image data, a pose of the eye of the user with respect to a camera mounted on the smartglasses.
19. The computer program product as in claim 18, wherein the classification engine includes an attention layer, and wherein identifying the region further includes: causing the attention layer to adjust probabilities of the gaze being directed to the regions of the display based on the outputted pose of the eye.
20. An electronic apparatus, the electronic apparatus comprising: memory; and controlling circuitry coupled to the memory, the controlling circuitry being configured to: receive image data representing at least one image of an eye of a user looking at a display at an instant of
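
Claims 6 and 16 recite training on a mapping between eye images and region identifiers. One way such labeled pairs might be assembled is sketched below. The sketch assumes rectangular, non-overlapping regions and a calibration step in which the user looks at a known target pixel while an eye image is captured; neither assumption comes from the claims, and `Region`, `region_for_pixel`, and `build_training_pairs` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Region:
    """A rectangular block of pixels tied to one UI element (assumed shape)."""
    region_id: int
    left: int    # first pixel column in the region
    top: int     # first pixel row in the region
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        """True if pixel (x, y) falls inside this region."""
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def region_for_pixel(regions: Sequence[Region], x: int, y: int) -> Optional[int]:
    """Return the identifier of the region containing (x, y), or None."""
    for region in regions:
        if region.contains(x, y):
            return region.region_id
    return None


def build_training_pairs(
    samples: Sequence[Tuple[str, Tuple[int, int]]],
    regions: Sequence[Region],
) -> List[Tuple[str, int]]:
    """Turn (eye image name, target pixel) samples into (image, region id) pairs.

    Samples whose target pixel lies outside every region are dropped.
    """
    pairs = []
    for image_name, (x, y) in samples:
        region_id = region_for_pixel(regions, x, y)
        if region_id is not None:
            pairs.append((image_name, region_id))
    return pairs


# Hypothetical layout: two side-by-side regions of 100 x 50 pixels each.
layout = [Region(0, 0, 0, 100, 50), Region(1, 100, 0, 100, 50)]
# Hypothetical calibration samples: file names and target pixels are made up.
calibration = [("eye_001.png", (10, 10)), ("eye_002.png", (150, 20))]
training_pairs = build_training_pairs(calibration, layout)
```

Labeling by region identifier rather than by pixel coordinate matches the classification framing of the claims: every target pixel inside a region yields the same label.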
Eye tracking input arrangements (G06F3/015 takes precedence) · CPC title
with means for monitoring data relating to the user, e.g. head-tracking, eye-tracking · CPC title
characterised by optical features (G02B27/0172 takes precedence) · CPC title
characterised by optical features · CPC title
Clustering techniques · CPC title