What technology area does this patent fall under?

Primary CPC classification H04N9/8715. Mapped technology areas include Electricity.

When was this patent published?

Publication date Sep 15, 2020 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Media effects using predicted facial feature locations

US10778939B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10778939-B2
Application number	US-201715713596-A
Country	US
Kind code	B2
Filing date	Sep 22, 2017
Priority date	Sep 22, 2017
Publication date	Sep 15, 2020
Grant date	Sep 15, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An effects application receives a video of a face and detects a bounding box for each frame indicating the location and size of the face in each frame. In one or more reference frames, the application uses an algorithm to determine locations of facial features in the frame. The application then normalizes the feature locations relative to the bounding box and saves the normalized feature locations. In other frames (e.g., target frames), the application obtains the bounding box and then predicts the locations of the facial features based on the size and location of the bounding box and the normalized feature locations calculated in the reference frame. The predicted locations can be made available to an augmented reality function that overlays graphics in a video stream based on face tracking in order to apply a desired effect to the video.
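
The normalization and prediction steps described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `BoundingBox`, `normalize_features`, and `predict_features` are hypothetical, and the sketch assumes each box is given by its top-left corner plus width and height, with offsets measured from the left and top edges.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned face box (hypothetical representation)."""
    x: float       # left edge, in pixels
    y: float       # top edge, in pixels
    width: float
    height: float


def normalize_features(features: Dict[str, Point],
                       ref_box: BoundingBox) -> Dict[str, Point]:
    """Express each feature as a fraction of the reference box size."""
    return {
        name: ((fx - ref_box.x) / ref_box.width,
               (fy - ref_box.y) / ref_box.height)
        for name, (fx, fy) in features.items()
    }


def predict_features(normalized: Dict[str, Point],
                     target_box: BoundingBox) -> Dict[str, Point]:
    """Scale the normalized offsets by the target box and add its origin."""
    return {
        name: (target_box.x + nx * target_box.width,
               target_box.y + ny * target_box.height)
        for name, (nx, ny) in normalized.items()
    }


# Reference frame: features found by a (not shown) feature detection algorithm.
ref_box = BoundingBox(x=100, y=50, width=200, height=300)
ref_features = {"left_eye": (150.0, 125.0), "nose": (200.0, 200.0)}
normalized = normalize_features(ref_features, ref_box)

# Target frame: only the bounding box is detected; features are predicted.
target_box = BoundingBox(x=300, y=100, width=100, height=150)
predicted = predict_features(normalized, target_box)
```

In this example the face in the target frame is half the size of the reference face, so the predicted feature positions keep the same relative placement inside the smaller box without running the feature detector again.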

First claim

Opening claim text (preview).

The invention claimed is:
1. A method comprising: receiving a video comprising a sequence of video frames; detecting a face of a subject in a target video frame of the sequence of video frames; obtaining a target bounding box for the detected face in the target video frame, the target bounding box having edges aligning with detected outermost pixels of the detected face; obtaining a reference bounding box for the detected face in a reference video frame of the video; applying a feature detection algorithm to detect location of facial features for the detected face in the reference video frame; determining a horizontal offset of a first location of a given facial feature from a vertical edge of the reference bounding box; determining a normalized first location as a ratio of the horizontal offset to a width of the reference bounding box; determining a vertical offset of a first location of the given facial feature from a horizontal edge of the reference bounding box; determining a normalized second location as a ratio of the vertical offset to a height of the reference bounding box; applying a transformation to the target bounding box to estimate locations of the facial features for the detected face in the target video frame, the transformation based on the normalized first and second locations; applying an effect to alter the target video frame based on the estimated locations of the facial features for the detected face in the target video frame to generate an altered video frame; and outputting the altered video frame.
2. The method of claim 1, wherein applying the transformation comprises: determining a relative horizontal offset of a given facial feature within the target bounding box as a product of a width of the target bounding box and the normalized horizontal location; and offsetting the relative horizontal offset by a horizontal location of the target bounding box to determine an absolute horizontal position of the given facial feature; determining a relative vertical offset of the given facial feature within the target bounding box as a product of a height of the target bounding box and the normalized vertical location; and offsetting the relative vertical offset by a vertical location of the target bounding box to determine an absolute vertical position of the given facial feature.
3. The method of claim 1, wherein applying the effect comprises: warping a mask image such that alignment points on the mask image align with the estimated locations of the facial features; and overlaying the mask image on the target frame.
4. The method of claim 3, wherein applying the effect further comprises: predicting a rotation of the face based on a narrowing of a width of the target bounding box relative to a bounding box for prior video frame; and rotating the mask image based on the predicted rotation.
5. The method of claim 1, wherein the facial features comprise eyes and a nose.
6. A non-transitory computer-readable storage medium storing instructions that when executed by a processor cause the processor to perform steps including: receiving a video comprising a sequence of video frames; detecting a face of a subject in a target video frame of the sequence of video frames; obtaining a target bounding box for the detected face in the target video frame, the target bounding box having edges aligning with detected outermost pixels of the detected face; obtaining a reference bounding box for the detected face in a reference video frame of the video; applying a feature detection algorithm to detect location of facial features for the detected face in the reference video frame; determining a horizontal offset of a first location of a given facial feature from a vertical edge of the reference bounding box; determining a normalized first location as a ratio of the horizontal offset to a width of the reference bounding box; determining a vertical offset of a first location of the given facial feature from a horizontal edge of the reference bounding box; determining a normalized second location as a ratio of the vertical offset to a height of the reference bounding box; applying a transformation to the target bounding box to estimate locations of the facial features for the detected face in the target video frame, the transformation based on the normalized first and second locations; applying an effect to alter the target video frame based on the estimated locations of the facial features for the detected face in the target video frame to generate an altered video frame; and outputting the altered video frame.
7. The non-transitory computer-readable storage medium of claim 6, wherein applying the transformation comprises: determining a relative horizontal offset of a given facial feature within the target bounding box as a product of a width of the target bounding box and the normalized horizontal location; and offsetting the relative horizontal offset by a horizontal location of the target bounding box to determine an absolute horizontal position of the given facial feature; determining a relative vertical offset of the given facial feature within the target bounding box as a product of a height of the target bounding box and the normalized vertical location; and offsetting the relative vertical offset by a vertical location of the target bounding box to determine an absolute vertical position of the given facial feature.
8. The non-transitory computer-readable storage medium of claim 6, wherein applying the effect comprises: warping a mask image such that alignment points on the mask image align with the estimated locations of the facial features; and overlaying the mask image on the target frame.
9. The non-transitory computer-readable storage medium of claim 8, wherein applying the effect further comprises: predicting a rotation of the face based on a narrowing of a width of the target bounding box relative to a bounding box for prior video frame; and rotating the mask image based on the predicted rotation.
10. The non-transitory computer-readable storage medium of claim 6, wherein the facial features comprise eyes and a nose.
11. A computer device comprising: a processor; and a non-transitory computer-readable storage medium storing instructions that when executed by a processor cause the processor to perform steps including: receiving a video comprising a sequence of video frames; detecting a face of a subject in a target video frame of the sequence of video frames; obtaining a target bounding box for the detected face in the target video frame, the target bounding box having edges aligning with detected outermost pixels of the detected face; obtaining a reference bounding box for the detected face in a reference video frame of the video; applying a feature detection algorithm to detect location of facial features for the detected face in the reference video frame; determining a horizontal offset of a first location of a given facial feature from a vertical edge of the reference bounding box; determining a normalized first location as a ratio of the horizontal offset to a width of the reference bounding box; determining a vertical offset of a first location of the given facial feature from a horizontal edge of the reference bounding box; determining a normalized second location as a ratio of the vertical offset to a height of the reference bounding box; applying a transformation to the target bounding box to estimate locations of the facial features for the detected face in the target video frame, the transformation based on the normalized first and second locations; applying an effect to alter the target video f
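
Claims 4 and 9 recite predicting a rotation of the face from a narrowing of the bounding box width relative to a prior frame, but the claim text shown here does not give a formula. One simple way this could be approximated is sketched below, under loudly stated assumptions: the prior frame is taken to show a frontal face, and the projected width is assumed to shrink with the cosine of the yaw angle. The function name `estimate_yaw_degrees` is hypothetical, and the direction of rotation cannot be recovered from width alone.

```python
import math


def estimate_yaw_degrees(prior_width: float, target_width: float) -> float:
    """Estimate the magnitude of head yaw from bounding-box narrowing.

    Assumption (not from the patent text): the prior box shows a frontal
    face, and projected width scales as cos(yaw).
    """
    if prior_width <= 0 or target_width <= 0:
        raise ValueError("bounding box widths must be positive")
    # A box that did not narrow (or widened) is treated as no rotation.
    ratio = min(target_width / prior_width, 1.0)
    return math.degrees(math.acos(ratio))


# A box that narrows from 200 px to 100 px maps to roughly 60 degrees here.
yaw = estimate_yaw_degrees(prior_width=200.0, target_width=100.0)
```

An effect could then rotate the mask image by the estimated angle before overlaying it. A real system would also need another cue, such as asymmetry in feature positions, to decide whether the face turned left or right.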

Assignees

Facebook Inc

Inventors

Classifications

H04N9/8715 · Primary
involving the mixing of the reproduced video signal with a non-recorded signal, e.g. a text signal · CPC title
G06V40/171
Local features and components; Facial parts (eye characteristics G06V40/18); Occluding parts, e.g. glasses; Geometrical relationships · CPC title
G06V40/169
Holistic features and representations, i.e. based on the facial image taken as a whole · CPC title
G06V20/20
in augmented reality scenes · CPC title
G06T11/60
Creating or editing images; Combining images with text · CPC title

Patent family

Related publications grouped by family.

View patent family 65808606

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10778939B2 cover?: An effects application receives a video of a face and detects a bounding box for each frame indicating the location and size of the face in each frame. In one or more reference frames, the application uses an algorithm to determine locations of facial features in the frame. The application then normalizes the feature locations relative to the bounding box and saves the normalized feature locati…
Who is the assignee on this patent?: Facebook Inc