Eye contact correction in real time using neural network based machine learning

US10423830B2 · US · B2

Patent metadata
Publication number: US-10423830-B2
Application number: US-201615136618-A
Country: US
Kind code: B2
Filing date: Apr 22, 2016
Priority date: Apr 22, 2016
Publication date: Sep 24, 2019
Grant date: Sep 24, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques related to eye contact correction to provide a virtual user gaze aligned with a camera while the user views a display are discussed. Such techniques may include encoding an eye region of a source image using a pretrained neural network to generate compressed features, applying a pretrained classifier to the features to determine a motion vector field for the eye region, and warping and inserting the eye region into the source image to generate an eye contact corrected image.
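The final step in the abstract, warping the eye region with a motion vector field and inserting it back into the source image, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the function names `warp_eye_region` and `insert_eye_region`, the use of plain nested lists for images, the backward-warping convention (each vector points from an output pixel to the source pixel it copies), nearest-pixel rounding, and clamping at the region border.

```python
def warp_eye_region(eye, mv_field):
    """Warp a 2D eye region with a per-pixel motion vector field.

    Assumption (not from the patent): mv_field[y][x] = (dy, dx) points from
    output pixel (y, x) to the source pixel it copies, i.e. backward warping
    with nearest-pixel rounding and clamping at the borders.
    """
    h, w = len(eye), len(eye[0])
    warped = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = mv_field[y][x]
            src_y = min(max(y + round(dy), 0), h - 1)
            src_x = min(max(x + round(dx), 0), w - 1)
            warped[y][x] = eye[src_y][src_x]
    return warped


def insert_eye_region(source, warped_eye, top, left):
    """Return a copy of `source` with `warped_eye` pasted at (top, left).

    Assumes the region fits entirely inside the source image.
    """
    out = [row[:] for row in source]
    for y, row in enumerate(warped_eye):
        out[top + y][left:left + len(row)] = row
    return out


# Usage: shift every pixel's content one column to the left.
eye = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
shift = [[(0, 1)] * 4 for _ in range(3)]
corrected = insert_eye_region([[0] * 6 for _ in range(5)],
                              warp_eye_region(eye, shift), top=1, left=1)
```

A production system would more likely interpolate between pixels and blend the region edges; the sketch only shows how a motion vector field drives the pixel movement.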

First claim

Opening claim text (preview).

What is claimed is:

1. A machine implemented method for providing eye contact correction comprising: receiving input image data of an already detected eye region of a source image; encoding, via a pretrained neural network, the image data of the eye region comprising: inputting the image data of the eye region into the neural network and without a remainder of a face forming the eye region from the source image, and outputting compressed features at a layer of the neural network and corresponding to the eye region of the source image and being decodable to form image data that can be compared to image data of the source image; applying a pretrained classifier to the compressed features to provide classifier output that is a motion vector field comprising multiple motion vectors that each indicate motion of pixel image data from a single pixel location from one image with an eye gaze in one direction and to a pixel location of a second image with an eye gaze in a different direction than the one direction, and for the eye region of the source image to move image data on a version of the source image; and warping the eye region of the source image based on the motion vector field and integrating the warped eye region into a remaining portion of the source image to generate an eye contact corrected image.

2. The method of claim 1, wherein the pretrained neural network comprises a plurality of layers including at least one convolutional neural network layer.

3. The method of claim 1, wherein the pretrained neural network comprises four layers, the four layers comprising, in order: a first convolutional neural network layer, a second convolutional neural network layer, a first fully connected layer, and a second fully connected layer, wherein the second fully connected layer provides the compressed features.

4. The method of claim 1, wherein the pretrained classifier comprises a pretrained random forest classifier having a leaf corresponding to the motion vector field.

5. The method of claim 1, further comprising: providing face detection and face landmark detection on the source image; and cropping the source image based on the face detection and the face landmark detection to generate the eye region.

6. The method of claim 1, further comprising: encoding and transmitting the final image to a remote device for presentment to a user.

7. The method of claim 1, wherein said encoding of the eye region and said applying the pretrained classifier to the compressed features to determine the motion vector field for the eye region of the source image are selectively provided based on a camera and a display having a first relative position therebetween, the method further comprising, when the camera and the display have a second relative position therebetween: encoding, via a second pretrained neural network, the eye region of the source image to generate second compressed features corresponding to the eye region of the source image; applying a second pretrained classifier to the second compressed features to determine a second motion vector field for the eye region of the source image; and warping the eye region of the source image based on the second motion vector field and integrating the warped eye region into the remaining portion of the source image to generate the eye contact corrected image.

8. The method of claim 1, further comprising: receiving a plurality of pairs of training eye region images, wherein first images of the pairs of training eye region images have a gaze angle difference with respect to second images of the pairs of training eye region images; and training the pretrained neural network based on an encode of the first images by the pretrained neural network to generate training stage compressed features, a decode of the training stage compressed features to generate resultant first images corresponding to the first images, and a scoring of the first images and the resultant first images.

9. The method of claim 8, wherein the scoring of the first images and the resultant first images comprises an error between the first images and the resultant first images.

10. The method of claim 8, wherein the scoring of the first images and the resultant first images comprises: vertically filtering the first images and the resultant first images to generate vertically filtered first images and vertically filtered resultant first images, respectively; and determining an error between the vertically filtered first images and the vertically filtered resultant first images.

11. The method of claim 8, further comprising: generating a likelihood map for each pixel of each of the first images of the pairs of training eye region images, wherein each likelihood map comprises a sum of absolute differences for each of a plurality of candidate motion vectors corresponding to the pixel; and training a training stage pretrained classifier based on the likelihood maps and the training stage compressed features.

12. The method of claim 11, further comprising: compressing the training stage pretrained classifier by parameterized surface fitting to generate the pretrained classifier.

13. A system for providing eye contact correction comprising: a memory configured to store a source image; and a processor coupled to the memory, the processor to encode, via a pretrained neural network, an eye region of a source image, the encoding comprising: receiving input image data of an already detected eye region of the source image; inputting the image data of the eye region into the neural network and without a remainder of a face forming the eye region from the source image, and outputting compressed features at a layer of the neural network and corresponding to the eye region of the source image and being decodable to form image data that can be compared to image data of the source image, the processor to apply a pretrained classifier to the compressed features to provide classifier output that is a motion vector field comprising multiple motion vectors that each indicate motion of pixel image data from a pixel location from one image with an eye gaze in one direction and to a pixel location of a second image with an eye gaze in a different direction than the one direction, and for the eye region of the source image to move image data on a version of the source image, and to warp the eye region of the source image based on the motion vector field and integrate the warped eye region into a remaining portion of the source image to generate an eye contact corrected image.

14. The system of claim 13, wherein the pretrained neural network comprises a plurality of layers including at least one convolutional neural network layer.

15. The system of claim 13, wherein the pretrained classifier comprises a pretrained random forest classifier having a leaf corresponding to the motion vector field.

16. The system of claim 13, wherein to encode the eye region and to apply the pretrained classifier to the compressed features to determine the motion vector field for the eye region of the source image are selectively provided based on a camera and a display having a first relative position therebetween, the processor further, when the camera and the display have a second relative position therebetween, to encode, via a second pretrained neural network, the eye region of the source image to generate second compressed features corresponding to the eye region of the source image, to apply a second pretrained classifier to the second compressed features to determine a second motion vector field for the
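Claim 11 describes a per-pixel likelihood map holding a sum of absolute differences (SAD) for each candidate motion vector. The sketch below shows one way such a map could be computed for a single pixel of a training pair. It is an illustration under stated assumptions, not the patent's procedure: the helper names `patch` and `sad_likelihood_map`, the square comparison window of radius `r`, the border clamping, and the use of nested lists for images are all hypothetical choices.

```python
def patch(img, cy, cx, r):
    """Flattened (2r+1) x (2r+1) window centred at (cy, cx), clamped at borders."""
    h, w = len(img), len(img[0])
    return [img[min(max(cy + j, 0), h - 1)][min(max(cx + i, 0), w - 1)]
            for j in range(-r, r + 1)
            for i in range(-r, r + 1)]


def sad_likelihood_map(first, second, y, x, candidates, r=1):
    """Map each candidate (dy, dx) to the SAD between the window around
    (y, x) in `first` and the window around (y + dy, x + dx) in `second`.

    Assumption: a lower SAD means the candidate is a more likely motion
    vector for this pixel. The window radius `r` is a hypothetical parameter.
    """
    ref = patch(first, y, x, r)
    return {
        (dy, dx): sum(abs(a - b)
                      for a, b in zip(ref, patch(second, y + dy, x + dx, r)))
        for dy, dx in candidates
    }


# Usage: a single bright pixel that moves one column to the right.
first = [[0] * 5 for _ in range(5)]
second = [[0] * 5 for _ in range(5)]
first[2][2] = 9
second[2][3] = 9
scores = sad_likelihood_map(first, second, 2, 2, [(0, 0), (0, 1), (0, -1)])
best = min(scores, key=scores.get)  # candidate with the smallest SAD
```

In this toy pair, the candidate that follows the bright pixel has the smallest SAD, which is the kind of signal a classifier could then be trained against.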

Assignees

Intel Corp

Inventors

Classifications

  • G06V10/82 (Primary)

    using neural networks · CPC title

  • Eye characteristics, e.g. of the iris · CPC title

  • Preprocessing; Feature extraction · CPC title

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Classification techniques · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10423830B2 cover?
Techniques related to eye contact correction to provide a virtual user gaze aligned with a camera while the user views a display are discussed. Such techniques may include encoding an eye region of a source image using a pretrained neural network to generate compressed features, applying a pretrained classifier to the features to determine a motion vector field for the eye region, and warping a…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 24, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).