Face-based frame rate upsampling for video calls

US11869274B2 · US · B2

Patent metadata
Field | Value
Publication number | US-11869274-B2
Application number | US-202217707661-A
Country | US
Kind code | B2
Filing date | Mar 29, 2022
Priority date | Aug 7, 2019
Publication date | Jan 9, 2024
Grant date | Jan 9, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method includes receiving a set of video frames that correspond to a video, including a first video frame and a second video frame that each include a face, wherein the second video frame is subsequent to the first video frame. The method further includes performing face tracking on the first video frame to identify a first face resampling keyframe and performing face tracking on the second video frame to identify a second face resampling keyframe. The method further includes deriving an interpolation amount. The method further includes determining a first interpolated face frame based on the first face resampling keyframe and the interpolation amount. The method further includes determining a second interpolated face frame based on the second face resampling keyframe and the interpolation amount. The method further includes rendering an interpolated first face and an interpolated second face. The method further includes displaying a final frame.
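
The abstract does not say how the interpolation amount is derived. Claim 4 ties it to the duration between the two face resampling keyframes and the current render time, and claim 6 says the backgrounds are interpolated with alpha blending. The Python sketch below shows one plausible reading of those two steps. The function names, the timestamps in seconds, the clamping to [0, 1], and the flat list-of-floats pixel representation are all assumptions made for illustration; none of them come from the patent text.

```python
def interpolation_amount(t_first: float, t_second: float, render_time: float) -> float:
    """Fraction of the way from the first keyframe to the second.

    Assumed formula: (render_time - t_first) / (t_second - t_first),
    clamped to [0, 1]. The claims only say the amount is derived from the
    keyframe duration and the current render time.
    """
    duration = t_second - t_first
    if duration <= 0:
        # Degenerate case (identical or out-of-order timestamps):
        # fall back to showing the second keyframe.
        return 1.0
    amount = (render_time - t_first) / duration
    return max(0.0, min(1.0, amount))


def alpha_blend(first, second, amount):
    """Blend two equally sized sequences of channel values.

    amount = 0 returns the first input, amount = 1 returns the second.
    """
    return [(1.0 - amount) * a + amount * b for a, b in zip(first, second)]


if __name__ == "__main__":
    # Keyframes at 1.0 s and 1.5 s, rendering at 1.125 s.
    t = interpolation_amount(1.0, 1.5, 1.125)
    print(t)
    print(alpha_blend([0.0, 100.0], [200.0, 100.0], t))
```

Under these assumptions, a render time a quarter of the way between the keyframes weights the first background by 0.75 and the second by 0.25.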

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving a set of video frames that correspond to a video, the set of video frames including a first video frame and a second video frame that each include a face, wherein the second video frame is subsequent to the first video frame; performing face tracking on the first video frame to identify a first face resampling keyframe; performing face tracking on the second video frame to identify a second face resampling keyframe; deriving an interpolation amount; interpolating a first background of the first face resampling keyframe and a second background of the second face resampling keyframe based on the interpolation amount; rendering an interpolated first face and an interpolated second face; and displaying a final frame that is based on the interpolated first background, the interpolated second background, the interpolated first face, and the interpolated second face.

2. The method of claim 1, further comprising: blending the first background with the second background to obtain a blended background; blending the interpolated first face with the interpolated second face to obtain a blended interpolated face; and generating the final frame by placing a smooth face on top of the blended interpolated face and the blended background.

3. The method of claim 1, wherein the first face resampling keyframe includes a first head transform matrix and first face landmark vertices and further comprising determining a first interpolated face frame by: using the first head transform matrix to extract a translation vector, a rotation quaternion, and a scale vector; linearly interpolating the translation vector; using a linear interpolation to interpolate the rotation quaternion to generate an interpolated rotation quaternion; linearly interpolating the scale vector to generate an interpolated scale vector; composing an interpolated translation-rotation-scale matrix based on the interpolated translation vector, the interpolated rotation quaternion, and the interpolated scale vector; and calculating an interpolated position for the interpolated first face using the interpolated translation-rotation-scale matrix.

4. The method of claim 1, wherein the interpolation amount is derived from (a) a duration between the first face resampling keyframe and the second face resampling keyframe and (b) a current render time.

5. The method of claim 1, wherein the second face resampling keyframe includes a second head transform matrix and second face landmark vertices, and further comprising determining a second interpolated face frame by calculating a respective displacement for each vertex in the second face landmark vertices.

6. The method of claim 1, wherein the interpolating the first background and the second background is done with alpha blending.

7. The method of claim 1, wherein the rendering includes at least one of feathering of edges of the interpolated first face and the interpolated second face or fading between a first interpolated face frame and a second interpolated face frame based on the interpolation amount.

8. The method of claim 1, wherein: the first face resampling keyframe includes a first head transform matrix and first face landmark vertices; performing face tracking on the first video frame further includes determining first texture coordinates for the first face resampling keyframe and a timestamp; and the first texture coordinates are applied to the first face landmark vertices.

9. The method of claim 1, wherein the first face resampling keyframe is identified by performing red green blue (RGB) face tracking on the first video frame.

10. A non-transitory computer-readable medium with instructions stored thereon that, when executed by one or more computers, cause the one or more computers to perform operations, the operations comprising: receiving a set of video frames that correspond to a video, the set of video frames including a first video frame and a second video frame that each include a face, wherein the second video frame is subsequent to the first video frame; performing face tracking on the first video frame to identify a first face resampling keyframe; performing face tracking on the second video frame to identify a second face resampling keyframe; deriving an interpolation amount; interpolating a first background of the first face resampling keyframe and a second background of the second face resampling keyframe based on the interpolation amount; rendering an interpolated first face and an interpolated second face; and displaying a final frame that is based on the interpolated first background, the interpolated second background, the interpolated first face, and the interpolated second face.

11. The computer-readable medium of claim 10, wherein the operations further comprise: blending the first background with the second background to obtain a blended background; blending the interpolated first face with the interpolated second face to obtain a blended interpolated face; and generating the final frame by placing a smooth face on top of the blended interpolated face and the blended background.

12. The computer-readable medium of claim 10, wherein the first face resampling keyframe includes a first head transform matrix and first face landmark vertices and further comprising determining a first interpolated face frame by: using the first head transform matrix to extract a translation vector, a rotation quaternion, and a scale vector; linearly interpolating the translation vector; using a linear interpolation to interpolate the rotation quaternion to generate an interpolated rotation quaternion; linearly interpolating the scale vector to generate an interpolated scale vector; composing an interpolated translation-rotation-scale matrix based on the interpolated translation vector, the interpolated rotation quaternion, and the interpolated scale vector; and calculating an interpolated position for the interpolated first face using the interpolated translation-rotation-scale matrix.

13. The computer-readable medium of claim 10, wherein the interpolation amount is derived from (a) a duration between the first face resampling keyframe and the second face resampling keyframe and (b) a current render time.

14. The computer-readable medium of claim 10, wherein the second face resampling keyframe includes a second head transform matrix and second face landmark vertices, and the operations further include determining a second interpolated face frame by calculating a respective displacement for each vertex in the second face landmark vertices.

15. The computer-readable medium of claim 10, wherein interpolating the first background and the second background is done with alpha blending.

16. A system comprising: one or more processors; and a memory that stores instructions that, when executed by the one or more processors cause the one or more processors to perform operations comprising: receiving a set of video frames that correspond to a video, the set of video frames including a first video frame and a second video frame that each include a face, wherein the second video frame is subsequent to the first video frame; performing face tracking on the first video frame to identify a first face resampling keyframe; performing face tracking on the second video frame to identify a second face resampling keyframe; deriving an interpolation amount; interpolating a first background of the first face resampling keyframe and a second background of the second face resampling keyfr
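
Claims 3 and 12 describe interpolating the head pose by extracting a translation vector, a rotation quaternion, and a scale vector from the head transform matrix, interpolating each component, and recomposing a translation-rotation-scale matrix. The Python sketch below illustrates that sequence under several assumptions that are not stated in the claims: the interpolation runs between the head transforms of the first and second keyframes; matrices are 4x4, row-major, act on column vectors, and factor as M = T·R·S with positive scale and no shear; quaternions are ordered (w, x, y, z); and the linearly interpolated quaternion is renormalized so it remains a valid rotation. All function names are hypothetical.

```python
import math


def lerp(a, b, amount):
    """Component-wise linear interpolation between two vectors."""
    return [(1.0 - amount) * x + amount * y for x, y in zip(a, b)]


def quat_from_matrix(r):
    """Convert a 3x3 rotation matrix to a unit quaternion (w, x, y, z)."""
    trace = r[0][0] + r[1][1] + r[2][2]
    if trace > 0:
        k = 2.0 * math.sqrt(1.0 + trace)  # k equals 4 * w
        return [k / 4, (r[2][1] - r[1][2]) / k,
                (r[0][2] - r[2][0]) / k, (r[1][0] - r[0][1]) / k]
    # Otherwise pivot on the largest diagonal entry for numerical stability.
    i = max(range(3), key=lambda n: r[n][n])
    j, m = (i + 1) % 3, (i + 2) % 3
    k = 2.0 * math.sqrt(1.0 + r[i][i] - r[j][j] - r[m][m])  # 4 * q[1 + i]
    q = [0.0, 0.0, 0.0, 0.0]
    q[0] = (r[m][j] - r[j][m]) / k
    q[1 + i] = k / 4
    q[1 + j] = (r[j][i] + r[i][j]) / k
    q[1 + m] = (r[m][i] + r[i][m]) / k
    return q


def matrix_from_quat(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def decompose(m):
    """Split a 4x4 T*R*S matrix into translation, quaternion, and scale."""
    t = [m[0][3], m[1][3], m[2][3]]
    # Each column of the upper-left 3x3 block has length equal to its scale.
    s = [math.sqrt(sum(m[row][col] ** 2 for row in range(3))) for col in range(3)]
    r = [[m[i][j] / s[j] for j in range(3)] for i in range(3)]
    return t, quat_from_matrix(r), s


def compose(t, q, s):
    """Build a 4x4 matrix equal to T * R * S."""
    r = matrix_from_quat(q)
    rows = [[r[i][0] * s[0], r[i][1] * s[1], r[i][2] * s[2], t[i]] for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]


def nlerp(q0, q1, amount):
    """Linear quaternion interpolation followed by renormalization (assumed)."""
    if sum(a * b for a, b in zip(q0, q1)) < 0:
        q1 = [-c for c in q1]  # take the shorter arc
    q = lerp(q0, q1, amount)
    norm = math.sqrt(sum(c * c for c in q))
    return [c / norm for c in q]


def interpolate_head_transform(m_first, m_second, amount):
    """Interpolate two head transform matrices component by component."""
    t0, q0, s0 = decompose(m_first)
    t1, q1, s1 = decompose(m_second)
    return compose(lerp(t0, t1, amount), nlerp(q0, q1, amount), lerp(s0, s1, amount))


def transform_point(m, p):
    """Apply a 4x4 transform to a 3D point."""
    return [sum(m[i][j] * p[j] for j in range(3)) + m[i][3] for i in range(3)]
```

A face landmark vertex could then be positioned with `transform_point(interpolate_head_transform(m_first, m_second, amount), vertex)`. Interpolating translation, rotation, and scale separately, rather than blending the raw matrix entries, keeps the intermediate pose a rigid rotation with a well-defined scale.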

Assignees

Google LLC

Inventors

Classifications

  • G06V40/172 · Primary

    Classification, e.g. identification · CPC title

  • based on interpolation, e.g. bilinear interpolation (image demosaicing G06T3/4015; edge-driven or edge-based scaling G06T3/403) · CPC title

  • Physics · mapped topic

  • using two or more images, e.g. averaging or subtraction · CPC title

  • involving reference images or patches · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11869274B2 cover?
A method includes receiving a set of video frames that correspond to a video, including a first video frame and a second video frame that each include a face, wherein the second video frame is subsequent to the first video frame. The method further includes performing face tracking on the first video frame to identify a first face resampling keyframe and performing face tracking on the second v…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06V40/172. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 9, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).