Periocular and audio synthesis of a full face image
US-2018137678-A1 · May 17, 2018 · US
US12379778B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12379778-B2 |
| Application number | US-202418627695-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 5, 2024 |
| Priority date | Nov 14, 2020 |
| Publication date | Aug 5, 2025 |
| Grant date | Aug 5, 2025 |
Official abstract text for this publication.
System and method that utilize windowing for efficient capturing of facial landmarks include an inward-facing head-mounted camera that captures images of a region on a user's face utilizing a sensor supporting changing of its region of interest (ROI). The system also includes a computer that detects, based on the images, a type of facial expression expressed by the user, which belongs to a group comprising first and second facial expressions. Responsive to detecting that the user expresses the first facial expression, the computer reads from the camera a first ROI that covers a first subset of facial landmarks relevant to the first facial expression. Responsive to detecting that the user expresses the second facial expression, the computer reads from the camera a second ROI that covers a second subset of facial landmarks relevant to the second facial expression, with the first and second ROIs being different.
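The expression-dependent windowing described in the abstract can be illustrated with a minimal Python sketch. It assumes a lookup table from expression type to ROI; the expression names, the ROI coordinates, and the helpers `read_roi` and `read_for_expression` are hypothetical and not taken from the patent. A cropped 2D list stands in for a windowed sensor read, since a real sensor would apply the ROI in hardware.

```python
from typing import Dict, List, Tuple

# ROI as (top, left, height, width) in sensor pixels.
# All names and coordinates below are hypothetical illustration values.
ROI = Tuple[int, int, int, int]

ROI_BY_EXPRESSION: Dict[str, ROI] = {
    "smile": (60, 20, 40, 80),   # assumed: landmarks around the mouth
    "frown": (0, 10, 30, 100),   # assumed: landmarks around the brows
}


def read_roi(frame: List[List[int]], roi: ROI) -> List[List[int]]:
    """Return only the pixels inside the ROI (stand-in for a windowed read)."""
    top, left, height, width = roi
    return [row[left:left + width] for row in frame[top:top + height]]


def read_for_expression(frame: List[List[int]], expression: str) -> List[List[int]]:
    """Pick the ROI mapped to the detected expression and read only that window."""
    return read_roi(frame, ROI_BY_EXPRESSION[expression])


# A 100-row by 120-column frame stands in for the full imaged region.
frame = [[0] * 120 for _ in range(100)]
smile = read_for_expression(frame, "smile")
frown = read_for_expression(frame, "frown")
```

In this sketch each detected expression yields a different, smaller window, so fewer pixels are read than in a full-frame capture.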
Opening claim text (preview).
We claim:
1. A system configured to utilize windowing for efficient capturing of facial landmarks, comprising: an inward-facing head-mounted camera configured to capture images of a region on a user's face utilizing a sensor that supports changing of its region of interest (ROI); and a computer configured to: detect, based on the images, a type of facial expression expressed by the user, which belongs to a group comprising first and second facial expressions; responsive to detecting that the user expresses the first facial expression, read from the camera a first ROI that covers a first subset of facial landmarks relevant to the first facial expression; and responsive to detecting that the user expresses the second facial expression, read from the camera a second ROI that covers a second subset of facial landmarks relevant to the second facial expression; wherein the first and second ROIs are different.
2. The system of claim 1, wherein the computer is further configured to select the first subset as follows: calculate first relevance scores for facial landmarks extracted from a first subset of the images, select a first proper subset of the facial landmarks whose relevance scores reach a first threshold, and set the first ROI to cover the first proper subset of the facial landmarks.
3. The system of claim 2, wherein the computer is further configured to select the second subset as follows: calculate second relevance scores for facial landmarks extracted from a second subset of the images, select a second proper subset of the facial landmarks whose relevance scores reach a second threshold, and set the second ROI to cover the second proper subset of the facial landmarks.
4. The system of claim 1, wherein the computer is further configured to select the first and second ROIs based on a pre-calculated function and/or a lookup table that maps between types of facial expressions and their corresponding ROIs.
5. The system of claim 1, wherein total power consumed from head-mounted components for a process of rendering an avatar based on the first and second ROIs is lower than total power that would have been consumed from the head-mounted components for a process of rendering the avatar based on images of the region.
6. The system of claim 1, wherein the system further comprises a head-mounted acoustic sensor configured to take audio recordings of the user and a head-mounted movement sensor configured to measure movements of the user's head; and the computer is further configured to (i) generate feature values based on data read from the camera, the audio recordings, and the movements, and (ii) utilize a machine learning-based model to render an avatar of the user based on the feature values.
7. The system of claim 1, wherein each of the first and second ROIs covers less than half of the region; and wherein the computer is further configured to detect changes in locations of the facial landmarks in the first and second subsets due to facial movements and/or movements of the camera relative to the user's face, and to update each of the first and second ROIs according to the changes.
8. The system of claim 1, wherein the sensor further supports at least two different binning values for at least two different ROIs, respectively, and the computer is further configured to (i) select, based on performance metrics of facial expression analysis configured to detect the type of facial expression expressed by the user, first and second resolutions for the first and second ROIs, respectively, and (ii) set different binning values for the first and second ROIs according to the first and second resolutions.
9. The system of claim 1, wherein the sensor further supports changing its binning value, wherein the computer is further configured to calculate relevance scores for facial landmarks extracted from overlapping sub-regions having at least two different binning values; wherein the sub-regions are subsets of the region, and a relevance score per facial landmark at a binning value increases as accuracy of facial expression detection based on the facial landmark at the binning value increases and power consumption used for the facial expression detection decreases; and set the binning values according to a function that optimizes the relevance scores.
10. The system of claim 9, wherein the computer is configured to increase the relevance scores in proportion to an expected magnitude of movement of the facial landmarks, in order to prefer a higher binning for facial expressions causing larger movements of their respective facial landmarks.
11. The system of claim 1, wherein the camera is physically coupled to a frame configured to be worn on the user's head, the camera is located less than 15 cm away from the user's face, and the computer is further configured to render an avatar of the user based on data read from the camera.
12. The system of claim 11, wherein the system is further configured to reduce power consumption of its head-mounted components by checking quality of predictions of locations of facial landmarks using a model, and if the locations of the facial landmarks are closer than a threshold to their expected locations, then a bitrate at which the camera is read is reduced.
13. The system of claim 12, wherein the computer is further configured to identify that the locations of the facial landmarks are not closer than the threshold to their expected locations, and then increase the bitrate at which the camera is read.
14. A method comprising: capturing images of a region on a user's face utilizing an inward-facing head-mounted camera comprising a sensor that supports changing of its region of interest (ROI); detecting, based on the images, a type of facial expression expressed by the user, which belongs to a group comprising first and second facial expressions; responsive to detecting that the user expresses the first facial expression, reading from the camera a first ROI that covers a first subset of facial landmarks relevant to the first facial expression; and responsive to detecting that the user expresses the second facial expression, reading from the camera a second ROI that covers a second subset of facial landmarks relevant to the second facial expression; wherein the first and second ROIs are different.
15. The method of claim 14, further comprising: calculating first relevance scores for facial landmarks extracted from a first subset of the images, selecting a first proper subset of the facial landmarks whose relevance scores reach a first threshold, and setting the first ROI to cover the first proper subset of the facial landmarks.
16. The method of claim 14, wherein the sensor further supports at least two different binning values for at least two different ROIs, respectively; and further comprising (i) selecting, based on performance metrics of facial expression analysis for detecting the type of facial expression expressed by the user, first and second resolutions for the first and second ROIs, respectively, and (ii) setting different binning values for the first and second ROIs according to the first and second resolutions.
17. The method of claim 14, further comprising reducing power consumption of head-mounted components related to the camera by checking quality of predictions of locations of facial landmarks using a model, and if the locations of the facial landmarks are closer than a threshold to their expected locations, then reducing bitrate at which the camera is read.
18. A non-transitory computer readable medium storing one or more comp
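Two of the claimed steps lend themselves to a short illustration: selecting an ROI from thresholded relevance scores (claims 2 and 15) and adjusting the camera read bitrate based on how close landmark locations are to their expected locations (claims 12, 13, and 17). The Python sketch below is a simplified reading, not the patented implementation. The function names, the bounding-box ROI shape, the halve-or-double bitrate rule, and all numeric values are assumptions introduced here; the claims do not specify them.

```python
import math
from typing import Dict, Tuple

Point = Tuple[int, int]  # (x, y) landmark location in pixels


def roi_from_relevance(landmarks: Dict[str, Point],
                       scores: Dict[str, float],
                       threshold: float) -> Tuple[int, int, int, int]:
    """Bounding box (x_min, y_min, x_max, y_max) covering the landmarks
    whose relevance score reaches the threshold."""
    kept = [landmarks[name] for name, s in scores.items() if s >= threshold]
    if not kept:
        raise ValueError("no landmark reaches the threshold")
    xs = [p[0] for p in kept]
    ys = [p[1] for p in kept]
    return (min(xs), min(ys), max(xs), max(ys))


def next_bitrate(expected: Dict[str, Point], observed: Dict[str, Point],
                 current: int, threshold_px: float,
                 low: int, high: int) -> int:
    """Halve the bitrate (assumed rule) when every landmark is closer than
    threshold_px to its expected location; otherwise double it.
    The result is clamped to the range [low, high]."""
    worst = max(math.dist(expected[n], observed[n]) for n in expected)
    if worst < threshold_px:
        return max(low, current // 2)
    return min(high, current * 2)


# Hypothetical landmarks and scores for illustration only.
landmarks = {"mouth_left": (30, 80), "mouth_right": (70, 82), "brow_left": (25, 20)}
scores = {"mouth_left": 0.9, "mouth_right": 0.8, "brow_left": 0.2}
roi = roi_from_relevance(landmarks, scores, threshold=0.5)

reduced = next_bitrate({"a": (0, 0)}, {"a": (1, 1)}, current=8,
                       threshold_px=2.0, low=2, high=16)
raised = next_bitrate({"a": (0, 0)}, {"a": (3, 4)}, current=8,
                      threshold_px=2.0, low=2, high=16)
```

Here the low-scoring brow landmark is excluded, so the ROI covers only the two mouth landmarks, and the bitrate drops when the landmark lands near its expected location and rises when it does not.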
Determining signal validity, reliability or quality (preventing, reducing or removing noise induced by motion artefacts A61B5/7207; noise originating from a therapeutic or surgical apparatus A61B5/7217) · CPC title
using photoplethysmograph signals, e.g. generated by infrared radiation (A61B5/14552 takes precedence) · CPC title
by combining or binning pixels · CPC title
by using two or more images to influence resolution, frame rate or aspect ratio · CPC title
for reducing power consumption by affecting camera operations, e.g. sleep mode, hibernation mode or power off of selective parts of the camera · CPC title