Interactive image matting using neural networks

US11004208B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11004208-B2
Application number: US-201916365213-A
Country: US
Kind code: B2
Filing date: Mar 26, 2019
Priority date: Mar 26, 2019
Publication date: May 11, 2021
Grant date: May 11, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are disclosed for deep neural network (DNN) based interactive image matting. A methodology implementing the techniques according to an embodiment includes generating, by the DNN, an alpha matte associated with an image, based on user-specified foreground region locations in the image. The method further includes applying a first DNN subnetwork to the image, the first subnetwork trained to generate a binary mask based on the user input, the binary mask designating pixels of the image as background or foreground. The method further includes applying a second DNN subnetwork to the generated binary mask, the second subnetwork trained to generate a trimap based on the user input, the trimap designating pixels of the image as background, foreground, or uncertain status. The method further includes applying a third DNN subnetwork to the generated trimap, the third subnetwork trained to generate the alpha matte based on the user input.
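The data flow in the abstract can be sketched as three chained stages, each receiving the previous stage's output together with the user's foreground locations. The sketch below is only an illustration of that flow: the three functions are simple hand-written placeholders standing in for the trained subnetworks, and the label encoding (0.0, 0.5, 1.0), the click radius, and all function names are assumptions that do not come from the patent.

```python
from typing import List, Tuple

Grid = List[List[float]]
Click = Tuple[int, int]  # (row, col) of a user-specified foreground location

# Hypothetical label encoding, not taken from the patent.
BG, UNKNOWN, FG = 0.0, 0.5, 1.0


def stage1_binary_mask(image: Grid, clicks: List[Click], radius: int = 1) -> Grid:
    """Placeholder for the first subnetwork: foreground near any click."""
    h, w = len(image), len(image[0])
    return [[FG if any(abs(r - cr) <= radius and abs(c - cc) <= radius
                       for cr, cc in clicks) else BG
             for c in range(w)] for r in range(h)]


def stage2_trimap(mask: Grid, clicks: List[Click]) -> Grid:
    """Placeholder for the second subnetwork: boundary pixels become uncertain."""
    h, w = len(mask), len(mask[0])
    trimap = [row[:] for row in mask]
    for r in range(h):
        for c in range(w):
            if (r, c) in clicks:
                continue  # keep user-marked pixels as definite foreground
            neighbours = [mask[nr][nc]
                          for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                          if 0 <= nr < h and 0 <= nc < w]
            # A pixel next to a differently labelled pixel lies on the boundary.
            if any(n != mask[r][c] for n in neighbours):
                trimap[r][c] = UNKNOWN
    return trimap


def stage3_alpha(trimap: Grid, clicks: List[Click]) -> Grid:
    """Placeholder for the third subnetwork: reuse trimap values as alpha."""
    return [row[:] for row in trimap]


def interactive_matting(image: Grid, clicks: List[Click]) -> Grid:
    """Chain the three stages; every stage also sees the user input."""
    mask = stage1_binary_mask(image, clicks)
    trimap = stage2_trimap(mask, clicks)
    return stage3_alpha(trimap, clicks)


if __name__ == "__main__":
    blank = [[0.0] * 5 for _ in range(5)]
    for row in interactive_matting(blank, [(2, 2)]):
        print(row)
```

In the patent each stage is a trained DNN subnetwork; the placeholders here only make the intermediate representations (two-level mask, three-level trimap, alpha values) concrete.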

First claim

Opening claim text (preview).

What is claimed is:

1. A method for image matting, the method comprising: generating, by a processor-based deep neural network (DNN), an alpha matte associated with an image, the generation based on a user-specified foreground region location in the image, wherein the generating includes generating a binary mask associated with the image based on the user-specified foreground region location, the image comprising image pixels, the binary mask designating the image pixels as at least one of background and foreground; generating a trimap associated with the image based on the generated binary mask and the user-specified foreground region location, the trimap designating the image pixels as at least one of background, foreground, or uncertain status; and generating the alpha matte based on the generated trimap and the user-specified foreground region location; and wherein the processor-based DNN comprises an encoder convolutional network to transform the binary mask to a set of image features and a decoder deconvolutional network to transform the set of image features to the trimap.

2. The method of claim 1, wherein the alpha matte comprises alpha matte pixels, the alpha matte pixels corresponding to the image pixels and providing an indication of foreground percentage for the corresponding image pixels.

3. The method of claim 1, wherein training of the DNN includes minimization of a loss function based on a comparison of alpha matte pixels generated from a training image and ground truth alpha matte pixels associated with the training image.

4. The method of claim 1, wherein training of the DNN includes minimization of a loss function based on a comparison of gradients of an alpha matte generated from a training image and gradients in background regions of the training image.

5. The method of claim 1, wherein the alpha matte is a first alpha matte, the method further comprising: performing down-sampling of the image from a first resolution to a second resolution prior to generating the binary mask, wherein the first resolution is higher than the second resolution; performing up-sampling of the first alpha matte from the second resolution to the first resolution, wherein the first alpha matte comprises affine parameter coefficients associated with the image pixels; and generating a second alpha matte as a linear combination of colors of the image at the first resolution, the linear combination employing the affine parameter coefficients of the first alpha matte.

6. The method of claim 1, wherein the DNN is further configured to generate a foreground color decontamination map and/or a background color decontamination map, the foreground color decontamination map providing foreground color channels associated with pixels of the alpha matte, and the background color decontamination map providing background color channels associated with pixels of the alpha matte.

7. The method of claim 1, wherein the user-specified foreground region location is specified by a mouse-based input or touchscreen-based input.

8. The method of claim 1, wherein the DNN is implemented as a ResNet neural network or a VGG16 neural network.

9. A system for image matting, the system comprising: one or more processors to control and/or execute a deep neural network (DNN) configured to generate an alpha matte associated with an image, the image comprising image pixels, the generation based on a user-specified foreground region location in the image, wherein the DNN includes a first subnetwork configured to generate a binary mask associated with the image based on the user-specified foreground region location, the binary mask designating the image pixels as at least one of background and foreground; a second subnetwork configured to generate a trimap associated with the image based on the binary mask and the user-specified foreground region location, the trimap designating the image pixels as at least one of background, foreground, or uncertain status; and a third subnetwork configured to generate the alpha matte based on the trimap and the user-specified foreground region location; wherein the DNN is further configured to generate a foreground color decontamination map and/or a background color decontamination map, the foreground color decontamination map providing foreground color channels associated with pixels of the alpha matte and the background color decontamination map providing background color channels associated with the alpha matte pixels.

10. The system of claim 9, wherein the alpha matte comprises alpha matte pixels, the alpha matte pixels corresponding to the image pixels and providing an indication of foreground percentage for the corresponding image pixels.

11. The system of claim 9, wherein training of the DNN includes minimization of a loss function based on a comparison of alpha matte pixels generated from a training image and ground truth alpha matte pixels associated with the training image.

12. The system of claim 9, wherein training of the DNN includes minimization of a loss function based on a comparison of gradients of an alpha matte generated from a training image and gradients in background regions of the training image.

13. The system of claim 9, wherein the alpha matte is a first alpha matte, and the DNN further comprises: the one or more processors further configured to control and/or execute a down-sampling module to down-sample the image from a first resolution to a second resolution prior to operation of the first subnetwork, wherein the first resolution is higher than the second resolution; the one or more processors further configured to control and/or execute a coefficient up-sampling module to up-sample the first alpha matte from the second resolution to the first resolution, wherein the first alpha matte comprises affine parameter coefficients associated with the image pixels; and the one or more processors further configured to control and/or execute a high resolution alpha matte generation module to generate a second alpha matte as a linear combination of colors of the image at the first resolution, the linear combination employing the affine parameter coefficients of the first alpha matte.

14. The system of claim 9, wherein the user-specified foreground region location is specified by mouse-based input or touchscreen-based input.

15. The system of claim 9, wherein the DNN is implemented as a ResNet neural network or a VGG16 neural network.

16. A computer program product including one or more non-transitory machine-readable mediums encoded with instructions that when executed by one or more processors cause a process to be carried out for generating a trimap of an image, the process comprising: generating, by a deep neural network (DNN), a trimap associated with an image, the image comprising image pixels, the DNN configured to generate the trimap based on a binary mask associated with the image and on a user-specified foreground region location in the image, the trimap designating the image pixels as at least one of background, foreground, or uncertain status, wherein training of the DNN includes minimization of a loss function based on a comparison of (1) trimap pixels generated from a training image and an associated training binary mask, and (2) ground truth trimap pixels associated with the training image, and wherein the DNN comprises an encoder convolutional network to transform the binary mask to a set of image features and a decoder deconvolutional network to transform the set of image features to the trimap.

17. The computer program produ
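Claims 5 and 13 describe producing a full-resolution alpha matte from a low-resolution one whose pixels hold affine parameter coefficients. A minimal sketch of that idea follows, under several assumptions the claims do not state: each low-resolution pixel stores four coefficients (one per color channel plus an offset), up-sampling is nearest-neighbor by an integer factor, and the result is clamped to [0, 1]. All names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]          # RGB values in [0, 1]
Coeffs = Tuple[float, float, float, float]  # (a_r, a_g, a_b, b), assumed layout


def upsample_nearest(coeffs: List[List[Coeffs]], factor: int) -> List[List[Coeffs]]:
    """Up-sample a low-resolution coefficient grid by an integer factor."""
    return [[coeffs[r // factor][c // factor]
             for c in range(len(coeffs[0]) * factor)]
            for r in range(len(coeffs) * factor)]


def high_res_alpha(image: List[List[Pixel]],
                   coeffs: List[List[Coeffs]]) -> List[List[float]]:
    """Per pixel: alpha = a_r*R + a_g*G + a_b*B + b, clamped to [0, 1]."""
    out = []
    for img_row, coef_row in zip(image, coeffs):
        row = []
        for (red, green, blue), (a_r, a_g, a_b, b) in zip(img_row, coef_row):
            value = a_r * red + a_g * green + a_b * blue + b
            row.append(min(1.0, max(0.0, value)))
        out.append(row)
    return out


if __name__ == "__main__":
    low_res = [[(1.0, 0.0, 0.0, 0.0)]]             # one low-resolution pixel
    full_res = upsample_nearest(low_res, factor=2)  # 2x2 coefficient grid
    image = [[(0.25, 0.0, 0.0), (1.0, 0.0, 0.0)],
             [(0.0, 1.0, 1.0), (2.0, 0.0, 0.0)]]
    print(high_res_alpha(image, full_res))
```

Because the coefficients are applied to the full-resolution colors, fine image detail can reappear in the second alpha matte even though the network itself ran at the lower resolution.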

Assignees

Adobe Inc

Inventors

Classifications

  • G06T7/11 (Primary)

    Region-based segmentation · CPC title

  • G06T7/194 (Primary)

    involving foreground-background segmentation · CPC title

  • Combinations of networks · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

  • Supervised learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11004208B2 cover?
Techniques are disclosed for deep neural network (DNN) based interactive image matting. A methodology implementing the techniques according to an embodiment includes generating, by the DNN, an alpha matte associated with an image, based on user-specified foreground region locations in the image. The method further includes applying a first DNN subnetwork to the image, the first subnetwork train…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06T7/11. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 11, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).