Automatic composition of video with dynamic background and composite frames selected based on frame and foreground object criteria

US10051206B2 · US · B2

Patent metadata
Publication number: US-10051206-B2
Application number: US-201615080292-A
Country: US
Kind code: B2
Filing date: Mar 24, 2016
Priority date: Sep 28, 2015
Publication date: Aug 14, 2018
Grant date: Aug 14, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processing device generates composite images from a sequence of images. The composite images may be used as frames of video. A foreground/background segmentation is performed at selected frames to extract a plurality of foreground object images depicting a foreground object at different locations as it moves across a scene. The foreground object images are stored to a foreground object list. The foreground object images in the foreground object list are overlaid onto subsequent video frames that follow the respective frames from which they were extracted, thereby generating a composite video.
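The flow in the abstract (extract foreground objects at selected frames, keep them in a list, overlay them on later frames) can be illustrated with a minimal sketch. Everything named here is hypothetical: `ForegroundObject`, `extract_foreground`, `overlay`, `compose`, and the `step` and `threshold` parameters do not come from the patent. The segmentation is a plain difference against a static background, used only as a stand-in for the predictive model the claims describe, and frames are small grayscale grids rather than real video.

```python
from dataclasses import dataclass
from typing import List, Optional

Frame = List[List[int]]                      # grayscale image as rows of pixel values
ObjectImage = List[List[Optional[int]]]      # None marks a subtracted background pixel


@dataclass
class ForegroundObject:
    source_index: int     # index of the frame the object was extracted from
    pixels: ObjectImage   # foreground pixels, background set to None


def extract_foreground(frame: Frame, background: Frame, threshold: int = 30) -> ObjectImage:
    """Toy segmentation (assumption): keep pixels that differ from a static background."""
    return [[p if abs(p - b) > threshold else None for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def overlay(frame: Frame, obj: ForegroundObject) -> Frame:
    """Copy the object's non-background pixels over the frame."""
    return [[o if o is not None else p for p, o in zip(frow, orow)]
            for frow, orow in zip(frame, obj.pixels)]


def compose(frames: List[Frame], background: Frame, step: int) -> List[Frame]:
    """Overlay every stored object onto each frame that follows its source frame."""
    foreground_list: List[ForegroundObject] = []
    output: List[Frame] = []
    for index, frame in enumerate(frames):
        composite = frame
        for obj in foreground_list:
            composite = overlay(composite, obj)
        output.append(composite)
        # Extract a new object only at selected frames (here: every `step` frames).
        if index % step == 0:
            foreground_list.append(
                ForegroundObject(index, extract_foreground(frame, background)))
    return output


if __name__ == "__main__":
    background = [[0, 0, 0, 0]]
    frames = [[[200, 0, 0, 0]], [[0, 200, 0, 0]], [[0, 0, 200, 0]], [[0, 0, 0, 200]]]
    for composite in compose(frames, background, step=2):
        print(composite)
```

In this sketch an object is appended to the list only after its own frame has been emitted, so each stored object first appears in the frames that follow its source frame, which matches the wording of the abstract.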

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for generating a composite output video from an input video having a sequence of frames, the method comprising: selecting from the sequence of frames, a range of frames for processing; training a predictive model based on a plurality of training video frames, the predictive model determining whether a pixel in a given video frame belongs to a background model or foreground object; performing, by a processing device, a foreground/background segmentation on each of the frames in the range of frames to extract a plurality of candidate foreground object images based on the predictive model, each of the candidate foreground object images comprising a representation of the foreground object depicted in a corresponding video frame with background pixels subtracted; selecting, based on an image metric, a selected foreground object image from the plurality of candidate foreground object images; storing the selected foreground object image to a foreground object list; overlaying the stored foreground object image in the foreground object list on a current video frame to generate a composite video frame; determining if a frame number of the current video frame is a multiple of a predefined integer X and responsive to the frame number of the current video frame being the multiple of the predefined integer X, updating the predictive model.

2. The method of claim 1, wherein selecting the selected foreground object image comprises: determining an image quality metric for each of the candidate foreground object images; and determining that the selected foreground object image has a highest quality metric.

3. The method of claim 1, wherein selecting the selected foreground object image comprises: determining a face detection likelihood on each of the candidate foreground object images; and determining that the selected foreground object image has a highest face detection likelihood.

4. The method of claim 1, wherein selecting the selected foreground object image comprises: determining a motion parameter for each of the candidate foreground object images; and determining that the selected foreground object image has a motion parameter best matching a predefined motion criteria.

5. The method of claim 1, wherein performing the foreground/background segmentation comprises: obtaining a preliminary foreground object image; applying a filter to reduce noise in the preliminary foreground object image to generate a filtered image; detecting a filled convex hull region in the preliminary foreground object image; adding extra pixels from the filtered image to the preliminary foreground object image to generate a temporary image; discarding pixels in the temporary image outside the filled convex hull region to generate a noisy convex hull image; and closing gaps in foreground regions of the noisy convex hull image to generate the foreground object image.

6. The method of claim 1, wherein the predictive model comprises an adaptive Gaussian Mixture Model.

7. A non-transitory computer-readable storage medium storing instructions for generating a composite output video from an input video having a sequence of frames, the instructions when executed by a processor causing the processor to perform steps comprising: selecting from the sequence of frames, a range of frames for processing; performing a foreground/background segmentation on each of the frames in the range of frames to extract a plurality of candidate foreground object images based on a predictive model, each of the candidate foreground object images comprising a representation of a foreground object depicted in a corresponding video frame with background pixels subtracted; selecting, based on an image metric, a selected foreground object image from the plurality of candidate foreground object images; storing the selected foreground object image to a foreground object list; and overlaying the stored foreground object image in the foreground object list on a current video frame to generate a composite video frame; wherein the performing of the foreground/background segmentation comprises: obtaining a preliminary foreground object image; applying a filter to reduce noise in the preliminary foreground object image to generate a filtered image; detecting a filled convex hull region in the preliminary foreground object image; adding extra pixels from the filtered image to the preliminary foreground object image to generate a temporary image; discarding pixels in the temporary image outside the filled convex hull region to generate a noisy convex hull image; and closing gaps in foreground regions of the noisy convex hull image to generate the foreground object image.

8. The non-transitory computer-readable storage medium of claim 7, wherein selecting the selected foreground object image comprises: determining an image quality metric for each of the candidate foreground object images; and determining that the selected foreground object image has a highest quality metric.

9. The non-transitory computer-readable storage medium of claim 7, wherein selecting the selected foreground object image comprises: determining a face detection likelihood on each of the candidate foreground object images; and determining that the selected foreground object image has a highest face detection likelihood.

10. The non-transitory computer-readable storage medium of claim 7, wherein selecting the selected foreground object image comprises: determining a motion parameter for each of the candidate foreground object images; and determining that the selected foreground object image has a motion parameter best matching a predefined motion criteria.

11. The non-transitory computer-readable storage medium of claim 7, wherein the instructions when executed further cause the processor to perform a step of: training the predictive model based on a plurality of training video frames, the predictive model to predict whether a pixel in a given video frame belongs to a background model or the foreground object.

12. The non-transitory computer-readable storage medium of claim 11, wherein the instructions when executed further cause the processor to perform the steps of: determining if a frame number of the current video frame is a multiple of a predefined integer X; and responsive to the frame number of the current video frame being the multiple of the predefined integer X, updating the predictive model.

13. The non-transitory computer-readable storage medium of claim 7, wherein the predictive model comprises an adaptive Gaussian Mixture Model.

14. A camera apparatus comprising: one or more processor apparatus; and a non-transitory computer-readable storage medium configured to store instructions for generating a composite output video from an input video having a sequence of frames, the instructions being configured to, when executed by the one or more processor apparatus, cause the camera apparatus to: select from the sequence of frames, a range of frames for processing; perform a foreground/background segmentation on each of the frames in the range of frames to extract a plurality of candidate foreground object images based on a predictive model, each of the candidate foreground object images comprising a representation of a foreground object depicted in a corresponding video frame with background pixels subtracted; select, based on an image metric, a selected foreground object image from the plurality of candidate foreground object images; store the selected foreground object image to a foreground object list; overlay the s
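Two steps recited in the method claim lend themselves to a short illustration: choosing one candidate foreground object image by an image metric, and updating the predictive model whenever the frame number is a multiple of a predefined integer X. The sketch below is an assumption-laden toy, not the claimed implementation. `RunningAverageModel` is a simple stand-in for the predictive model (the claims name an adaptive Gaussian Mixture Model as one option), `foreground_area` is a hypothetical image metric, and `process_range`, `rate`, and `threshold` are illustrative names.

```python
from typing import Callable, List, Optional, Sequence

Frame = List[List[int]]                      # grayscale image as rows of pixel values
ObjectImage = List[List[Optional[int]]]      # None marks a subtracted background pixel


class RunningAverageModel:
    """Stand-in background model (assumption; not an adaptive GMM)."""

    def __init__(self, training_frames: Sequence[Frame],
                 rate: float = 0.05, threshold: float = 30.0) -> None:
        rows, cols = len(training_frames[0]), len(training_frames[0][0])
        count = len(training_frames)
        # "Training": per-pixel mean over the training frames.
        self.mean = [[sum(f[r][c] for f in training_frames) / count
                      for c in range(cols)] for r in range(rows)]
        self.rate = rate
        self.threshold = threshold
        self.updates = 0

    def segment(self, frame: Frame) -> ObjectImage:
        """Keep pixels far from the background mean; subtract the rest."""
        return [[p if abs(p - m) > self.threshold else None
                 for p, m in zip(frow, mrow)]
                for frow, mrow in zip(frame, self.mean)]

    def update(self, frame: Frame) -> None:
        """Blend the current frame into the background mean."""
        self.mean = [[(1 - self.rate) * m + self.rate * p
                      for p, m in zip(frow, mrow)]
                     for frow, mrow in zip(frame, self.mean)]
        self.updates += 1


def foreground_area(image: ObjectImage) -> int:
    """Hypothetical image metric: number of foreground pixels."""
    return sum(p is not None for row in image for p in row)


def process_range(frames: Sequence[Frame], model: RunningAverageModel, x: int,
                  metric: Callable[[ObjectImage], float] = foreground_area) -> ObjectImage:
    """Segment each frame, update the model every x-th frame, return the best candidate."""
    candidates: List[ObjectImage] = []
    for frame_number, frame in enumerate(frames, start=1):
        candidates.append(model.segment(frame))
        if frame_number % x == 0:          # frame number is a multiple of X
            model.update(frame)
    return max(candidates, key=metric)     # candidate with the highest metric


if __name__ == "__main__":
    model = RunningAverageModel([[[0, 0, 0, 0]], [[0, 0, 0, 0]]])
    frames = [[[200, 0, 0, 0]], [[200, 200, 0, 0]], [[0, 0, 200, 0]], [[0, 0, 0, 200]]]
    print(process_range(frames, model, x=2), model.updates)
```

A different metric (for example a sharpness score, a face detection likelihood, or a motion parameter, as in the dependent claims) could be passed as `metric` without changing the loop.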

Assignees

Inventors

Classifications

  • Bracketing, i.e. taking a series of images with varying exposure conditions · CPC title

  • Stereoscopic video; Stereoscopic image sequence · CPC title

  • involving subtraction of images · CPC title

  • involving the use of two or more images · CPC title

  • involving foreground-background segmentation · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10051206B2 cover?
A processing device generates composite images from a sequence of images. The composite images may be used as frames of video. A foreground/background segmentation is performed at selected frames to extract a plurality of foreground object images depicting a foreground object at different locations as it moves across a scene. The foreground object images are stored to a foreground object list. …
Who is the assignee on this patent?
Gopro Inc
What technology area does this patent fall under?
Primary CPC classification H04N5/272. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Aug 14, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).