US2018018451A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2018018451-A1 |
| Application number | US-201715497927-A |
| Country | US |
| Kind code | A1 |
| Filing date | Apr 26, 2017 |
| Priority date | Jul 14, 2016 |
| Publication date | Jan 18, 2018 |
| Grant date | — |
Official abstract text for this publication.
Systems and methods for iris authentication are disclosed. In one aspect, a deep neural network (DNN) with a triplet network architecture can be trained to learn an embedding (e.g., another DNN) that maps from the higher dimensional eye image space to a lower dimensional embedding space. The DNN can be trained with segmented iris images or images of the periocular region of the eye (including the eye and portions around the eye such as eyelids, eyebrows, eyelashes, and skin surrounding the eye). With the triplet network architecture, an embedding space representation (ESR) of a person's eye image can be closer to the ESRs of the person's other eye images than it is to the ESR of another person's eye image. In another aspect, to authenticate a user as an authorized user, an ESR of the user's eye image can be sufficiently close to an ESR of the authorized user's eye image.
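The abstract describes the goal of triplet training (same-person ESRs end up closer together than different-person ESRs) but does not give a loss formula. As a minimal sketch, assuming the commonly used hinge-style triplet loss on squared Euclidean distances, the objective could look like the following. The function names, the `margin` value, and the 2-D example vectors are illustrative assumptions, not taken from the publication.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (an assumed form, not quoted from the patent).

    anchor and positive are ESRs of eye images from the same person;
    negative is an ESR of an eye image from a different person.
    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin`.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D ESRs, for illustration only.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]   # same person, close to the anchor
negative = [0.0, 1.0]   # different person, far from the anchor

loss_well_separated = triplet_loss(anchor, positive, negative)
loss_violated = triplet_loss(anchor, negative, positive)  # roles swapped
```

With these values the well-separated triplet contributes no loss, while the swapped triplet is penalized, which is the pressure that pulls same-person ESRs together during training.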
Opening claim text (preview).
What is claimed is:

1. A wearable display system comprising: a display; an image capture device configured to capture a first image of an eye of a user; non-transitory memory configured to store: an embedding for processing the first image of the eye, wherein the embedding is learned using a deep neural network with a triplet network architecture, wherein the deep neural network is configured to learn the embedding from eye images of a plurality of persons, and wherein a distance in the embedding space representation for eye images from the same person is smaller than a distance in the embedding space representation for eye images from different persons, a classifier for processing the processed first image of the eye, and executable instructions; and a hardware processor in communication with the display, the image capture device, and the non-transitory memory, the hardware processor programmed by the executable instructions to: receive the first image of the eye; process the first image of the eye using the embedding to generate an embedding space representation; process the embedding space representation using the classifier to calculate a likelihood score that the first image of the eye is an image of an eye of an authorized user; and grant or deny the user access to the wearable display system based on the likelihood score.

2. The wearable display system of claim 1, wherein the brightness normalization layer comprises a local contrast normalization layer, a local response normalization layer, or a combination thereof.

3. The wearable display system of claim 1, wherein the deep neural network comprises a plurality of layers, and wherein the plurality of layers comprises a pooling layer, a brightness normalization layer, a convolutional layer, an inception-like layer, a rectified linear layer, a softsign layer, or any combination thereof.

4. The wearable display system of claim 1, wherein the embedding space representation has unit length.

5. The wearable display system of claim 1, wherein the classifier generates the likelihood score based on the Euclidean distance.

6. The wearable display system of claim 1, wherein the classifier is a binary classifier, a logistic regression classifier, a support vector machine classifier, a Bayesian classifier, a softmax classifier, or any combination thereof.

7. The wearable display system of claim 1, wherein the hardware processor is programmed by the executable instructions to: segment the first image of the eye to generate a second image of an iris of the eye, and wherein to process the first image of the eye, the hardware processor is programmed by the executable instructions to: process the second image of the iris of the eye using the embedding to generate the embedding space representation.

8. A system for training a deep neural network for iris authentication, comprising: computer-readable memory storing executable instructions; and one or more hardware-based hardware processors programmed by the executable instructions to at least: access a deep neural network comprising a plurality of layers, wherein each layer of the plurality of layers is connected to at least another layer of the plurality of layers; provide the deep neural network with a training set comprising eye images of a plurality of persons; compute embedding space representations of the plurality of eye images using the deep neural network, wherein the embedding space representations of the plurality of eye images of the same person are within a threshold; and update the deep neural network based on the distances between the embedding space representations of eye images of the same persons and different persons.

9. The system of claim 8, wherein the plurality of layers comprises a pooling layer, a brightness normalization layer, a convolutional layer, an inception-like layer, a rectified linear layer, a softsign layer, or any combination thereof.

10. The system of claim 8, wherein the deep neural network comprises a triplet network architecture.

11. The system of claim 10, wherein the training set comprises triplets of eye images, and wherein the deep neural network is learned using the triplets of eye images.

12. The system of claim 11, where two eye images of the triplet are from the same person and the third eye image of the triplet is from a different person.

13. A head mounted display system comprising: a display; an image capture device configured to capture a first image of an eye of a user; non-transitory memory configured to store executable instructions; and a hardware processor in communication with the display, the image capture device, and the non-transitory memory, the hardware processor programmed by the executable instructions to: receive the first image of the eye; process the first image of the eye to generate a representation of the first image of the eye in polar coordinates; process the representation of the first image of the eye in polar coordinates using a deep neural network to generate an embedding space representation; and process the embedding space representation using a classifier to generate a likelihood score that the image of the eye is an image of the authorized user's eye.

14. The head mounted display system of claim 13, wherein the deep neural network is learned using a triplet network.

15. The head mounted display system of claim 14, wherein the triplet network is configured to learn the deep neural network from eye images of a plurality of persons, and wherein a distance in the embedding space representation for eye images from the same person is smaller than a distance in the embedding space representation for eye images from different persons.

16. The head mounted display system of claim 13, wherein the hardware processor is programmed by the executable instructions to: grant or deny the user access to the head mounted display system based on the likelihood score.

17. The head mounted display system of claim 13, wherein the hardware processor is programmed by the executable instructions to: segment the first image of the eye to generate a second image of an iris of the eye, and wherein to process the first image of the eye, the hardware processor is programmed by the executable instructions to: process the second image of the iris of the eye using the deep neural network to generate the embedding space representation.

18. The head mounted display system of claim 13, wherein the first image of the eye comprises mostly of the iris and the retina of the eye.

19. The head mounted display system of claim 13, wherein the first image of the eye comprises mostly of the retina of the eye.

20. The head mounted display system of claim 13, wherein the embedding space representation is an n-dimensional vector, and wherein the majority of the elements of the embedding space representation are statistically independent.
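The claims combine a unit-length embedding space representation, a classifier that scores on Euclidean distance, and a grant-or-deny decision based on the likelihood score. The sketch below shows one way those pieces could fit together. It is an illustration under stated assumptions, not the claimed implementation: the logistic mapping from distance to score, the `scale`, `midpoint`, and `threshold` values, and all function names are hypothetical, and the embedding network itself is replaced by hand-written example vectors.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def likelihood_score(probe_esr, enrolled_esr, scale=10.0, midpoint=0.5):
    """Map the distance between two unit-length ESRs to a score in (0, 1).

    Assumed logistic form: a smaller distance gives a score nearer 1.
    `scale` and `midpoint` are illustrative, not values from the patent.
    """
    d = euclidean(l2_normalize(probe_esr), l2_normalize(enrolled_esr))
    return 1.0 / (1.0 + math.exp(scale * (d - midpoint)))


def grant_access(probe_esr, enrolled_esr, threshold=0.5):
    """Grant access when the likelihood score reaches the threshold."""
    return likelihood_score(probe_esr, enrolled_esr) >= threshold


# Hypothetical ESRs standing in for the output of the embedding.
enrolled = [3.0, 4.0]     # authorized user's enrolled eye image
same_user = [3.1, 3.9]    # new image of the same eye
other_user = [-4.0, 3.0]  # image of a different person's eye

same_user_granted = grant_access(same_user, enrolled)
other_user_granted = grant_access(other_user, enrolled)
```

Normalizing both vectors first bounds the distance between 0 and 2, so a single fixed midpoint and threshold can be applied to every comparison.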
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Combinations of networks · CPC title
using biometrical features, e.g. fingerprint, retina-scan (cryptographic mechanisms or cryptographic arrangements for entity authentication using biological data H04L9/3231) · CPC title
using biometric data, e.g. fingerprints, iris scans or voiceprints · CPC title
Learning methods · CPC title