Autoencoding generative adversarial network for augmenting training data usable to train predictive models
US-2021256353-A1 · Aug 19, 2021 · US
US12436608B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12436608-B2 |
| Application number | US-202218072237-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 30, 2022 |
| Priority date | Dec 2, 2021 |
| Publication date | Oct 7, 2025 |
| Grant date | Oct 7, 2025 |
Official abstract text for this publication.
An electronic device and method with gaze estimating are disclosed. The method includes obtaining target information of an image, the image including an eye, obtaining a target feature map representing information on the eye in the image based on the target information, and estimating a gaze for the eye in the image based on the target feature map. The target information includes either attention information on the image, or a distance between pixels in the image, or both.
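The abstract's three steps (obtain target information, derive a target feature map, estimate gaze) can be illustrated for the attention case. What follows is a minimal sketch, not the patented implementation. It assumes the attention information is a spatial weight map computed as a softmax over per-pixel channel means, a scoring rule the source does not specify. The names `spatial_weight_map` and `apply_weight_map` and the toy feature map are hypothetical.

```python
import math

def spatial_weight_map(feature_map):
    """Return an H x W weight map that sums to 1.

    Assumption: each pixel is scored by the mean of its channels, and the
    scores are normalised with a softmax over all spatial positions.
    """
    scores = [[sum(px) / len(px) for px in row] for row in feature_map]
    peak = max(max(row) for row in scores)  # subtract the max for numerical stability
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]

def apply_weight_map(feature_map, weight_map):
    """Scale every channel of each pixel by that pixel's weight."""
    return [
        [[w * c for c in px] for px, w in zip(f_row, w_row)]
        for f_row, w_row in zip(feature_map, weight_map)
    ]

# Toy 2 x 2 feature map with 2 channels per pixel (illustrative values only).
feature_map = [
    [[1.0, 3.0], [0.0, 0.0]],
    [[2.0, 2.0], [4.0, 6.0]],
]
weights = spatial_weight_map(feature_map)
target = apply_weight_map(feature_map, weights)  # stands in for the "target feature map"
```

A gaze regressor would then consume `target`; that stage is left out because the abstract does not describe how the estimate is computed.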
Opening claim text (preview).
What is claimed is:
1. A method performed by an electronic device, the method comprising: obtaining target information of an image, the image comprising an eye; obtaining a target feature map representing information on the eye in the image, by extracting features from a first feature map of at least two frame images and the target information based on an offset between pixels of a face in the image and a first front image obtained by offsetting the pixels of the face in the image and applying a facial mask covering a region other than the face in the image to the image; and performing gaze estimation for the eye in the image based on the target feature map, wherein the target information comprises either attention information on the image, or a distance between pixels in the image, or both, wherein the attention information comprises temporal relationship information between the at least two frame images and frontal facial features of the face or a head, and wherein the frontal facial features are determined based on obtaining a facial map and the facial mask of the image.
2. The method of claim 1, wherein the obtaining of the target feature map comprises: obtaining the target feature map of the image based on the first feature map of the at least two frame images and the temporal relationship information between the at least two frame images.
3. The method of claim 1, wherein the obtaining of the target feature map comprises: obtaining the target feature map based on a second feature map of a specific portion of the image and the frontal facial features, wherein the specific portion comprises one or at least two of eye, mouth, nose, ear, and eyebrow portions of the face or the head.
4. The method of claim 1, wherein the obtaining of the target feature map comprises: obtaining a third feature map of the image based on the frontal facial features and a second feature map of a portion of the image; and obtaining the target feature map based on a third feature map of the at least two frame images and the temporal relationship information between the at least two frame images.
5. The method of claim 4, wherein the frontal facial features are determined based on: obtaining the first front image based on the image, the facial map, and the facial mask; and obtaining the frontal facial features based on the first front image, wherein the facial map comprises the offset of each pixel of the face in the image.
6. The method of claim 5, wherein the obtaining of the first front image comprises: obtaining, based on the image, the facial map, and the facial mask, a second front image comprising a region of facial data, the region of facial data surrounding a hole region that lacks facial data; obtaining a hole mask of the second front image and a third front image, based on the second front image; and obtaining the first front image based on the second front image, the hole mask, and the third front image, wherein the hole mask masks an image region other than the hole region in the second front image, and the third front image comprises an image region corresponding to a position of the hole region in the second front image.
7. The method of claim 1, wherein the target information comprises the distance between pixels, and wherein the obtaining of the target feature map comprises: obtaining the target feature map based on a fourth feature map of the image and relative distance information between the pixels.
8. The method of claim 1, wherein the target information comprises weight information, and wherein the obtaining of the target information comprises obtaining a first weight map of the image based on a fifth feature map of the image, and wherein the obtaining of the target feature map comprises obtaining the target feature map based on the first weight map and the fifth feature map.
9. The method of claim 1, wherein the attention information comprises weight information, and wherein the obtaining of the target information comprises obtaining a second weight map based on a position of the eye in the image, and wherein the obtaining of the target feature map comprises obtaining the target feature map based on the second weight map and a sixth feature map of the image, and wherein the sixth feature map is obtained by extracting features from the image through at least two convolutional layers.
10. The method of claim 9, wherein the obtaining of the target feature map comprises: obtaining a seventh feature map based on the second weight map and an intermediate feature map; and obtaining the target feature map based on the sixth feature map and the seventh feature map, wherein the intermediate feature map is a feature map output by a target layer among the at least two convolutional layers.
11. The method of claim 1, wherein the performing of the gaze estimation comprises performing the gaze estimation on the image based on the target feature map and target pose information, and wherein the target pose information is pose information of a target portion in the image.
12. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 1.
13. An electronic device, comprising: a processor; and a memory comprising instructions executable by the processor, wherein, when the instructions are executed by the processor, the processor is configured to obtain target information of an image comprising an eye, obtain a target feature map representing information on the eye in the image by extracting features from a first feature map of at least two frame images and the target information based on an offset between pixels of a face in the image and a first front image obtained by offsetting the pixels of the face in the image and applying a facial mask covering a region other than the face in the image to the image, and perform gaze estimation for the eye in the image based on the target feature map, wherein the target information comprises either attention information on the image, or a distance between pixels in the image, or both, wherein the attention information comprises temporal relationship information between the at least two frame images and frontal facial features of the face or a head, and wherein the frontal facial features are determined based on obtaining a facial map and the facial mask of the image.
14. The electronic device of claim 13, wherein when the instructions are executed by the processor, the processor is further configured to: obtain the target feature map of the image based on the first feature map of the at least two frame images and the temporal relationship information between the at least two frame images.
15. The electronic device of claim 13, wherein the processor is configured to: obtain the target feature map based on a second feature map derived from a specific portion of the image and based on the frontal facial features, wherein the specific portion comprises eye, mouth, nose, ear, or eyebrow portions of the face or head.
16. The electronic device of claim 13, wherein when the instructions are executed by the processor, the processor is further configured to: obtain a third feature map of the image based on the frontal facial features and a second feature map of a specific portion of the image; and obtain the target feature map based on a third feature map of the at least two frame images and the temporal relationship information between the at least two frame images.
17. The electronic de
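Claim 6 describes assembling the first front image from a second front image that contains a hole region, a hole mask, and a third front image that supplies content at the hole's position. The sketch below shows one way such a composite could work. It is an illustration under stated assumptions, not the claimed method: missing pixels are marked with `None`, the hole mask is encoded as 1 inside the hole and 0 elsewhere, and the helper names `hole_mask_of` and `composite_front_image` are hypothetical. How the third front image is produced is not covered here.

```python
def hole_mask_of(second_front):
    """Return 1 where the second front image lacks facial data, else 0.

    Assumption: missing pixels are represented as None.
    """
    return [[1 if px is None else 0 for px in row] for row in second_front]

def composite_front_image(second_front, hole_mask, third_front):
    """Take pixels from third_front inside the hole and from second_front elsewhere."""
    return [
        [fill if in_hole else known
         for known, in_hole, fill in zip(s_row, m_row, t_row)]
        for s_row, m_row, t_row in zip(second_front, hole_mask, third_front)
    ]

# Toy grayscale images (illustrative values only).
second_front = [[10, None], [None, 40]]  # frontalised face with two missing pixels
third_front = [[0, 20], [30, 0]]         # content available at the hole positions
hole_mask = hole_mask_of(second_front)
first_front = composite_front_image(second_front, hole_mask, third_front)
```

With these toy values the composite keeps 10 and 40 from the second front image and takes 20 and 30 from the third.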
Face · CPC title
relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking · CPC title
Proximity, similarity or dissimilarity measures · CPC title
Local features and components; Facial parts (eye characteristics G06V40/18); Occluding parts, e.g. glasses; Geometrical relationships · CPC title
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title