Content adaptive background foreground segmentation for video coding

US9584814B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9584814-B2
Application number  US-201414277897-A
Country             US
Kind code           B2
Filing date         May 15, 2014
Priority date       May 15, 2014
Publication date    Feb 28, 2017
Grant date          Feb 28, 2017

Abstract

Official abstract text for this publication.

Techniques related to content adaptive background-foreground segmentation for video coding.

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of background foreground segmentation for video coding, comprising: learning a background model of a base frame comprising: accumulating frame difference magnitudes between the base frame and each of a plurality of reference frames of a video sequence forming a scene; comparing individual difference magnitudes to a plurality of activity thresholds to determine whether a pixel or block of pixels associated with the difference magnitude is active or inactive; forming a plurality of cumulative segmentation masks that each are the amount of pixels or blocks of pixels with the same total of the number of frame reference-activity threshold combinations in which the pixels or block of pixels are active, wherein the frame reference-activity threshold combination is one reference frame and one activity threshold used with a difference magnitude and available from the plurality of activity thresholds and the plurality of reference frames; and determining the background threshold used to assign the blocks or pixels to a background or a foreground of the base frame by using the cumulative segmentation masks.

2. The method of claim 1 comprising determining a learning rate of the scene depending on the complexity of the scene and to select the plurality of reference frames.

3. The method of claim 2 comprising selecting the plurality of reference frames that are at least generally farther along in the video sequence from the base frame the less complex the scene.

4. The method of claim 1 wherein there are four or five reference frames selected for each defined level of complexity and eight activity thresholds.

5. The method of claim 1 wherein determining the background threshold comprises selecting, as the background threshold, a minimum cumulative segmentation mask between two maximum cumulative segmentation masks along a numerical ordering of the cumulative segmentation masks by number of frame reference-activity threshold combinations in which the pixels or block of pixels are active.

6. The method of claim 5 wherein determining the background threshold comprises using a histogram to observe the maximum cumulative segmentation masks as peaks on the histogram and the minimum cumulative segmentation mask as the lowest valley between the peaks on the histogram.

7. The method of claim 5 wherein blocks with cumulative segmentation mask totals above the background threshold are foreground blocks, and blocks with cumulative segmentation mask totals below the background threshold are background blocks.

8. The method of claim 1 comprising determining pixel-accurate segmentation on a frame comprising finding a minimum difference between (1) a binarized frame based on the cumulative segmentation mask values and the background threshold, and (2) the frame binarized using one of the frame reference-activity threshold combinations.

9. The method of claim 1 comprising: determining a learning rate of the scene depending on the complexity of the scene and to select the plurality of reference frames; selecting the plurality of reference frames that are at least generally farther along in the video sequence from the base frame the less complex the scene; wherein there are four or five reference frames selected for each defined level of complexity and eight activity thresholds; wherein determining the background threshold comprises selecting, as the background threshold, a minimum cumulative segmentation mask between two maximum cumulative segmentation masks along a numerical ordering of the cumulative segmentation masks by number of frame reference-activity threshold combinations in which the pixels or block of pixels are active; wherein determining the background threshold comprises using a histogram to observe the maximum cumulative segmentation masks as peaks on the histogram and the minimum cumulative segmentation mask as the lowest valley between the peaks on the histogram; wherein blocks with cumulative segmentation mask totals above the background threshold are foreground blocks, and blocks with cumulative segmentation mask totals below the background threshold are background blocks; and determining pixel-accurate segmentation on a frame comprising finding a minimum difference between (1) a binarized frame based on the cumulative segmentation mask values and the background threshold, and (2) the frame binarized using one of the frame reference-activity threshold combinations.

10. A method of background-foreground segmentation for video coding comprising: learning a background model of a base frame comprising: accumulating frame difference magnitudes between the base frame and each of a plurality of reference frames of a video sequence forming a scene; comparing individual difference magnitudes to a plurality of activity thresholds to determine whether a pixel or block of pixels associated with the difference magnitude is active or inactive; forming a plurality of cumulative segmentation masks that each are the amount of pixels or blocks of pixels with the same total of the number of frame reference-activity threshold combinations in which the pixels or block of pixels are active, wherein the frame reference-activity threshold combination is one reference frame and one activity threshold used with a difference magnitude and available from the plurality of activity thresholds and the plurality of reference frames; and determining the background threshold used to assign the blocks or pixels to a background or a foreground of the base frame by using the cumulative segmentation masks; determining a background-foreground segmentation threshold for a current frame separately from the background model; forming a current segmentation mask by comparing the segmentation threshold to a difference between the current frame and the background model; applying morphological opening and closing to adjust background or foreground assignment of pixels or blocks on the segmentation mask; determining new uncovered background; updating the segmentation mask with the new uncovered background; and updating the background model with the new uncovered background.

11. The method of claim 10 wherein determining the background-foreground segmentation threshold comprises performing linear regression.

12. The method of claim 10 wherein the current segmentation mask is in binarized form.

13. The method of claim 10 comprising: updating the segmentation mask comprising using recovered background to modify the segmentation mask; and cleaning the segmentation mask by removing spikes and blobs to form a final segmentation mask.

14. The method of claim 10 wherein updating the background model comprises updating a count of pixels in the background, and updating an average pixel value associated with each background pixel location.

15. The method of claim 10 wherein applying morphological opening and closing comprises using a 2×2 support region as a sliding window, and changing one of the locations in the support region between background and foreground and depending on the background or foreground assignment at at least one of the other locations in the support region.

16. The method of claim 10, wherein determining new uncovered background comprises: creating a region of interest (ROI) around the current foreground-background boundary; splitting ROI into parts; marking low energy areas inside of the ROI parts; and classifying low energy associated with background.

17. The method of claim 16 wherein marking low energy areas comprises forming a
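
The accumulation and thresholding steps recited in claims 1 and 5 to 7 can be sketched roughly as follows. This is only an illustrative reading of the claim language, not the patented implementation. Frames are reduced to flat lists of per-block mean values; the function names, the four threshold values, and the sample data are all hypothetical; peaks are taken as simple local maxima of the histogram; and blocks whose total equals the threshold are treated as background, since the claims only speak of totals above or below it.

```python
# Illustrative sketch only; names, thresholds, and data are assumptions.

def activity_counts(base, refs, thresholds):
    """Per block: number of (reference frame, activity threshold)
    combinations in which the block's difference magnitude is active."""
    counts = [0] * len(base)
    for ref in refs:
        for i, (b, r) in enumerate(zip(base, ref)):
            diff = abs(b - r)
            # Assumption: "active" means the difference exceeds the threshold.
            counts[i] += sum(1 for t in thresholds if diff > t)
    return counts


def cumulative_masks(counts, n_combos):
    """hist[k] = number of blocks active in exactly k combinations."""
    hist = [0] * (n_combos + 1)
    for c in counts:
        hist[c] += 1
    return hist


def background_threshold(hist):
    """Lowest valley between the two highest local peaks, or None."""
    def is_peak(k):
        left = hist[k - 1] if k > 0 else -1
        right = hist[k + 1] if k + 1 < len(hist) else -1
        return hist[k] > 0 and hist[k] >= left and hist[k] >= right

    peaks = [k for k in range(len(hist)) if is_peak(k)]
    if len(peaks) < 2:
        return None
    # Keep the two tallest peaks, then order them by position.
    lo, hi = sorted(sorted(peaks, key=lambda k: hist[k], reverse=True)[:2])
    if hi - lo < 2:
        return None  # no bin lies strictly between the two peaks
    return min(range(lo + 1, hi), key=lambda k: hist[k])


# Hypothetical data: one base frame and two reference frames, 10 blocks each.
base = [100, 100, 100, 100, 100, 100, 100, 140, 150, 160]
refs = [
    [100, 102, 101, 104, 106, 100, 110, 100, 100, 127],
    [101, 100, 103, 100, 102, 105, 107, 120, 90, 60],
]
thresholds = [4, 8, 16, 32]  # hypothetical activity thresholds

counts = activity_counts(base, refs, thresholds)
hist = cumulative_masks(counts, len(refs) * len(thresholds))
threshold = background_threshold(hist)
# Totals above the threshold are foreground; the rest are background.
is_foreground = [c > threshold for c in counts]

print(counts)     # [0, 0, 0, 0, 1, 1, 3, 7, 8, 8]
print(hist)       # [4, 2, 0, 1, 0, 0, 0, 1, 2]
print(threshold)  # 2
```

In this toy run the histogram has a tall peak at zero (static blocks) and a second peak at eight (blocks active in every combination), so the valley at 2 separates the last four blocks as foreground. Claim 4 recites four or five reference frames and eight activity thresholds; the smaller numbers here just keep the example short.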

Assignees

Intel Corp

Inventors

Classifications

  • Video; Image sequence

  • Image subtraction

  • with coding of regions that are present throughout a whole video segment, e.g. sprites, background or mosaic

  • using video object coding

  • Morphological image processing

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9584814B2 cover?
Techniques related to content adaptive background-foreground segmentation for video coding.
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification H04N19/146. Mapped technology areas include Electricity.
When was this patent published?
Publication date Feb 28, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).