Eye tracking and gaze estimation using off-axis camera

US12487664B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12487664-B2
Application number: US-202217674724-A
Country: US
Kind code: B2
Filing date: Feb 17, 2022
Priority date: Aug 19, 2019
Publication date: Dec 2, 2025
Grant date: Dec 2, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques related to the computation of gaze vectors of users of wearable devices are disclosed. A neural network may be trained through first and second training steps. The neural network may include a set of feature encoding layers and a plurality of sets of task-specific layers that each operate on an output of the set of feature encoding layers. During the first training step, a first image of a first eye may be provided to the neural network, eye segmentation data may be generated using the neural network, and the set of feature encoding layers may be trained. During the second training step, a second image of a second eye may be provided to the neural network, network output data may be generated using the neural network, and the plurality of sets of task-specific layers may be trained.
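The two-step schedule described in the abstract can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not the patent's implementation: the shared encoder and each task head are single linear layers, "images" are random 6-value vectors, segmentation is reduced to a 4-value region score vector, cornea center is assumed to be 3 values, and the loss is plain squared error. The names `MultiTaskNet`, `train_step`, `fake_image`, and the head names are hypothetical. The point is only to show the encoder being updated in the first step and held fixed in the second.

```python
import copy
import random


def matvec(W, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sgd_update(W, grad_out, layer_in, lr):
    """In-place SGD step for a linear layer: dL/dW[i][j] = grad_out[i] * layer_in[j]."""
    for i, g in enumerate(grad_out):
        for j, v in enumerate(layer_in):
            W[i][j] -= lr * g * v


class MultiTaskNet:
    """Toy stand-in: one shared linear encoder feeding several linear heads."""

    def __init__(self, n_in, n_feat, head_sizes, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.encoder = init(n_feat, n_in)
        self.heads = {name: init(n_out, n_feat) for name, n_out in head_sizes.items()}

    def train_step(self, x, targets, train_encoder, lr=0.02):
        """Update the heads named in `targets`; update the encoder only if asked."""
        feat = matvec(self.encoder, x)
        grad_feat = [0.0] * len(feat)
        loss = 0.0
        for name, target in targets.items():
            W = self.heads[name]
            err = [o - t for o, t in zip(matvec(W, feat), target)]
            loss += sum(e * e for e in err)  # squared error (an assumption)
            grad_out = [2.0 * e for e in err]
            # Back-propagate into the shared features before changing the head.
            for i, g in enumerate(grad_out):
                for k, w in enumerate(W[i]):
                    grad_feat[k] += g * w
            sgd_update(W, grad_out, feat, lr)
        if train_encoder:
            sgd_update(self.encoder, grad_feat, x, lr)
        return loss


rng = random.Random(1)


def fake_image():
    """Hypothetical stand-in for an eye image: 6 random values."""
    return [rng.uniform(-1.0, 1.0) for _ in range(6)]


net = MultiTaskNet(
    n_in=6,
    n_feat=4,
    head_sizes={"pupil_2d": 2, "segmentation": 4, "cornea_center": 3},
)

# First training step: segmentation output trains the encoder (and its own head).
enc_before = copy.deepcopy(net.encoder)
for _ in range(50):
    net.train_step(fake_image(), {"segmentation": [1.0, 0.0, 0.0, 0.0]}, train_encoder=True)
enc_after_step1 = copy.deepcopy(net.encoder)

# Second training step: encoder held fixed, pupil and cornea heads trained.
heads_before = copy.deepcopy(net.heads)
for _ in range(50):
    net.train_step(
        fake_image(),
        {"pupil_2d": [0.1, -0.2], "cornea_center": [0.0, 0.1, 0.5]},
        train_encoder=False,
    )

print("encoder changed in step 1:", enc_after_step1 != enc_before)
print("encoder unchanged in step 2:", net.encoder == enc_after_step1)
```

In a real framework the same effect is usually obtained by excluding the encoder's parameters from the optimizer, or marking them as not requiring gradients, during the second step; the explicit `train_encoder` flag here is just the simplest way to make the freeze visible.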

First claim

Opening claim text (preview).

What is claimed is:

1. A method of training a neural network, the method comprising: performing a first training step including: providing a first image of a first eye to the neural network as input, the neural network having a set of feature encoding layers connected to a plurality of sets of task-specific layers, the plurality of sets of task-specific layers including at least three sets of task-specific layers that operate on an output generated by the set of feature encoding layers, the plurality of sets of task-specific layers including: a first set of task-specific layers that output two-dimensional (2D) pupil data, a second set of task-specific layers that output eye segmentation data that includes a segmentation of an eye into a plurality of regions including one or more of a background region, a sclera region, a pupil region, or an iris region, and a third of task-specific layers that output cornea center data; generating, using the set of feature encoding layers and the second set of task-specific layers of the neural network and based on the first image of the first eye as input, eye segmentation data for the first eye that includes a segmentation of the first eye into the plurality of regions; and training the set of feature encoding layers using the eye segmentation data for the first eye by modifying weights associated with the set of feature encoding layers; and performing a second training step including: providing a second image of a second eye to the neural network as input; generating, using the set of feature encoding layers, the first set of task specific layers, and the third set of task-specific layers of the neural network and based on the second image of the second eye as input, network output data including 2D pupil data corresponding to the second eye and cornea center data corresponding to the second eye; and training the plurality of sets of task-specific layers using the network output data by modifying weights associated with the plurality of sets of task-specific layers; wherein the neural network is trained such that the set of feature encoding layers are trained during the first training step but are held fixed during the second training step.

2. The method of claim 1, wherein the first training step is performed during a first time duration and the second training step is performed during a second time duration that is after the first time duration.

3. The method of claim 1, wherein the plurality of regions includes one or more of a background region, a sclera region, a pupil region, or an iris region.

4. The method of claim 1, wherein performing the first training step further includes: training the second set of task-specific layers using the eye segmentation data for the first eye.

5. The method of claim 1, wherein performing the first training step further includes: receiving eye segmentation ground truth (GT) data; and comparing the eye segmentation data to the eye segmentation GT data.

6. The method of claim 1, wherein the network output data includes glint detection data corresponding to the second eye.

7. The method of claim 1, wherein the network output data includes a blink prediction corresponding to the second eye.

8. The method of claim 1, wherein the network output data includes an eye expression classification corresponding to the second eye.

9. The method of claim 1, wherein the network output data includes eye segmentation data for the second eye that includes a second segmentation of the second eye into the plurality of regions.

10. A non-transitory computer-readable medium comprising instructions that, when executed by one or more processors, cause the one or more processors to perform operations for training a neural network, wherein the operations comprise: performing a first training step including: providing a first image of a first eye to the neural network as input, the neural network having a set of feature encoding layers connected to a plurality of sets of task-specific layers, the plurality of sets of task-specific layers including at least three sets of task-specific layers that operate on an output generated by the set of feature encoding layers, the plurality of sets of task-specific layers including: a first set of task-specific layers that output two-dimensional (2D) pupil data, a second set of task-specific layers that output eye segmentation data that includes a segmentation of an eye into a plurality of regions including one or more of a background region, a sclera region, a pupil region, or an iris region, and a third of task-specific layers that output cornea center data; generating, using the set of feature encoding layers and the second set of task-specific layers of the neural network and based on the first image of the first eye as input, eye segmentation data for the first eye that includes a segmentation of the first eye into the plurality of regions; and training the set of feature encoding layers using the eye segmentation data for the first eye by modifying weights associated with the set of feature encoding layers; and performing a second training step including: providing a second image of a second eye to the neural network as input; generating, using the set of feature encoding layers, the first set of task specific layers, and the third set of task-specific layers of the neural network and based on the second image of the second eye as input, network output data including 2D pupil data corresponding to the second eye and cornea center data corresponding to the second eye; and training the plurality of sets of task-specific layers using the network output data by modifying weights associated with the plurality of sets of task-specific layers; wherein the neural network is trained such that the set of feature encoding layers are trained during the first training step but are held fixed during the second training step.

11. The non-transitory computer-readable medium of claim 10, wherein the first training step is performed during a first time duration and the second training step is performed during a second time duration that is after the first time duration.

12. A system comprising: one or more processors; and a non-transitory computer-readable medium comprising instructions that, when executed by the one or more processors, cause the one or more processors to perform operations for training a neural network, wherein the operations comprise: performing a first training step including: providing a first image of a first eye to the neural network as input, the neural network having a set of feature encoding layers connected to a plurality of sets of task-specific layers, the plurality of sets of task-specific layers including at least three sets of task-specific layers that operate on an output generated by the set of feature encoding layers, the plurality of sets of task-specific layers including: a first set of task-specific layers that output two-dimensional (2D) pupil data, a second set of task-specific layers that output eye segmentation data that includes a segmentation of an eye into a plurality of regions including one or more of a background region, a sclera region, a pupil region, or an iris region, and a third of task-specific layers that output cornea center data; generating, using the set of feature encoding layers and the second set of task-specific layers of the neural network and based on the first image of the first eye as input, eye segmentation data for the first eye that includes a segmentation of the first eye into the plurality of regions; and training the set of feature encoding lay…

Assignees

Magic Leap Inc

Inventors

Classifications

  • comprising image capture systems, e.g. camera · CPC title

  • characterised by optical features · CPC title

  • with means for monitoring data relating to the user, e.g. head-tracking, eye-tracking · CPC title

  • Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Preprocessing; Feature extraction · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12487664B2 cover?
Techniques related to the computation of gaze vectors of users of wearable devices are disclosed. A neural network may be trained through first and second training steps. The neural network may include a set of feature encoding layers and a plurality of sets of task-specific layers that each operate on an output of the set of feature encoding layers. During the first training step, a first imag…
Who is the assignee on this patent?
Magic Leap Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/013. Mapped technology areas include Physics.
When was this patent published?
Published Dec 2, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).