Similarity propagation for one-shot and few-shot image segmentation

US11367271B2 · US · B2

Patent metadata
Publication number: US-11367271-B2
Application number: US-202016906954-A
Country: US
Kind code: B2
Filing date: Jun 19, 2020
Priority date: Jun 19, 2020
Publication date: Jun 21, 2022
Grant date: Jun 21, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present invention provide systems, methods, and computer storage media for one-shot and few-shot image segmentation on classes of objects that were not represented during training. In some embodiments, a dual prediction scheme may be applied in which query and support masks are jointly predicted using a shared decoder, which aids in similarity propagation between the query and support features. Additionally or alternatively, foreground and background attentive fusion may be applied to utilize cues from foreground and background feature similarities between the query and support images. Finally, to prevent overfitting on class-conditional similarities across training classes, input channel averaging may be applied for the query image during training. Accordingly, the techniques described herein may be used to achieve state-of-the-art performance for both one-shot and few-shot segmentation tasks.
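The input channel averaging step mentioned in the abstract can be illustrated with a minimal sketch. The abstract and claim 8 only state that query images are converted to greyscale with a probability that decays as training progresses. The linear decay schedule, the `p_start` parameter, and the function names below are assumptions made for illustration and are not taken from the patent.

```python
import numpy as np


def averaging_probability(step, total_steps, p_start=0.5):
    """Hypothetical linear schedule: p_start at step 0, decaying to 0 at the end."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return p_start * (1.0 - frac)


def input_channel_average(query_image, step, total_steps, rng, p_start=0.5):
    """With a decaying probability, replace every channel by the per-pixel channel mean.

    query_image has shape (H, W, C). The image is returned unchanged otherwise.
    """
    if rng.random() < averaging_probability(step, total_steps, p_start):
        gray = query_image.mean(axis=-1, keepdims=True)
        return np.repeat(gray, query_image.shape[-1], axis=-1)
    return query_image


rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))

# Early in training (probability forced to 1 here): channels are averaged.
early = input_channel_average(image, step=0, total_steps=100, rng=rng, p_start=1.0)

# At the end of training the probability is 0, so the image passes through.
late = input_channel_average(image, step=100, total_steps=100, rng=rng, p_start=1.0)
```

Averaging the channels removes colour cues from the query image, which is one plausible reading of how the technique discourages the network from relying on class-specific colour similarities seen during training.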

First claim

Opening claim text (preview).

What is claimed is:

1. One or more computer storage media storing computer-useable instructions that, when used by one or more computing devices, cause the one or more computing devices to perform operations comprising: extracting, using a shared encoder, query features from a query image and support features from a support image; generating, based on the support features and a support mask identifying pixels of the support image in a support class, a probe representing features of the support image; and jointly predicting, based on the probe, the query features, and the support features, a query mask and a support mask representing pixels in the support class.

2. The one or more computer storage media of claim 1, wherein generating the probe comprises generating a foreground probe representing foreground features of the support image and a background probe representing background features of the support image.

3. The one or more computer storage media of claim 1, the operations further comprising downsampling the support mask to generate a downsampled support mask, wherein generating the probe comprises performing masked average pooling of the support features using the downsampled support mask.

4. The one or more computer storage media of claim 1, wherein jointly predicting the query mask and the support mask comprises: using a shared fusion network to fuse the features of the support image with the query features to generate fused query features; using the shared fusion network to fuse the features of the support image with the support features to generate fused support features; using a shared decoder to decode the fused query features into the query mask; and using the shared decoder to decode the fused support features into the support mask.

5. The one or more computer storage media of claim 1, wherein jointly predicting the query mask and the support mask comprises predicting the query mask using a query branch of a similarity propagation network and predicting the support mask using a support branch of the similarity propagation network.

6. The one or more computer storage media of claim 1, the operations further comprising performing a batch edit on a collection of target images, based on an edit to the support image involving a region identified by the support mask, by predicting a corresponding region for each of the target images based on the probe and the support features.

7. The one or more computer storage media of claim 1, the operations further comprising generating a plurality of support probes, one for each of a plurality of support images, and averaging the support probes to generate the probe.

8. The one or more computer storage media of claim 1, wherein the operations are of a similarity propagation network trained using input channel averaging by converting ground truth query images to greyscale with a probability that decays as training progresses.

9. A computerized method comprising: generating, based on a support image and a support mask identifying pixels of a support class, a foreground probe representing foreground features of the support image and a background probe representing background features of the support image; probing extracted query features of a query image with the foreground probe and the background probe to generate a foreground attention map and a background attention map for the query image; fusing the foreground attention map, the background attention map, and the extracted query features to generate fused query features; and decoding the fused query features to predict a first representation of pixels of the query image in the support class.

10. The computerized method of claim 9, the method further comprising jointly predicting the first representation of the pixels of the query image in the support class and a second representation of pixels of the support image in the support class.

11. The computerized method of claim 9, the method further comprising downsampling the support mask to generate a downsampled support mask, wherein generating the foreground probe and the background probe is based on the downsampled support mask.

12. The computerized method of claim 9, the method further comprising downsampling the support mask to generate a downsampled support mask, wherein generating the foreground probe comprises performing masked average pooling of the foreground features of the support image using the downsampled support mask, and wherein generating the background probe comprises performing masked average pooling of the background features of the support image using the downsampled support mask.

13. The computerized method of claim 9, the method further comprising: using a shared fusion network to fuse the foreground features and the background features of the support image with the extracted query features to generate the fused query features; using the shared fusion network to fuse the foreground features and the background features of the support image with support features of the support image to generate fused support features; using a shared decoder to perform the decoding of the fused query features into the first representation of the pixels of the query image; and using the shared decoder to decode the fused support features into a second representation of pixels of the support image in the support class.

14. The computerized method of claim 9, the method further comprising performing a batch edit on a collection of target images, based on an edit to the support image involving a region identified by the support mask, by predicting a corresponding region for each of the target images based on the foreground probe, the background probe, and the support features.

15. The computerized method of claim 9, the method further comprising generating a foreground support probe and a background support probe for each of a plurality of support images, averaging the foreground support probe for each of the support images to generate the foreground support probe, and averaging the background support probe for each of the support images to generate the background support probe.

16. The computerized method of claim 9, wherein the method is performed by a similarity propagation network trained using input channel averaging by converting ground truth query images to greyscale with a probability that decays as training progresses.

17. A computer system comprising: one or more hardware processors and memory configured to provide computer program instructions to the one or more hardware processors; a feature extraction module configured to use the one or more hardware processors to extract query features from a query image and support features from a support image; an attentive fusion module configured to use the one or more hardware processors to fuse foreground information and background information from the support image with (i) the query features to generate fused query features, and (ii) the support features to generate fused support features; and a dual mask prediction module configured to use the one or more hardware processors to jointly predict, based on the fused query features and the fused support features, a query mask and a support mask representing pixels in the support class.

18. The computer system of claim 17, further comprising an edit propagation tool configured to use the one or more hardware processors perform a batch edit on a collection of target images, based on an edit to the support image involving a region identified by the support mask, by triggering a predicti
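The probe generation and attention steps recited in claims 2, 3, 7, 9, and 12 can be sketched roughly as follows. This is an illustrative reading only: the claims name masked average pooling over a downsampled support mask and foreground and background attention maps, but the strided downsampling, the use of cosine similarity, the fusion by channel concatenation, and all function names below are assumptions. The learned shared encoder, fusion network, and decoder are not modelled.

```python
import numpy as np


def downsample_mask(mask, factor):
    """Nearest-neighbour downsampling by striding (an assumed choice)."""
    return mask[::factor, ::factor]


def masked_average_pool(features, mask, eps=1e-8):
    """Average the feature vectors of pixels where mask == 1.

    features: (H, W, C), mask: (H, W) of 0/1 values -> probe of shape (C,).
    """
    weights = mask.astype(features.dtype)[..., None]
    return (features * weights).sum(axis=(0, 1)) / (weights.sum() + eps)


def make_probes(support_features, support_mask, factor):
    """Foreground and background probes from one support image."""
    small = downsample_mask(support_mask, factor)
    fg_probe = masked_average_pool(support_features, small)
    bg_probe = masked_average_pool(support_features, 1 - small)
    return fg_probe, bg_probe


def average_probes(probes):
    """Few-shot case: average the per-support-image probes into one probe."""
    return np.mean(np.stack(probes), axis=0)


def attention_map(features, probe, eps=1e-8):
    """Cosine similarity between every pixel's feature vector and the probe."""
    num = features @ probe
    den = np.linalg.norm(features, axis=-1) * np.linalg.norm(probe) + eps
    return num / den


def fuse(features, fg_probe, bg_probe):
    """Assumed fusion: append the two attention maps as extra channels."""
    fg_att = attention_map(features, fg_probe)[..., None]
    bg_att = attention_map(features, bg_probe)[..., None]
    return np.concatenate([features, fg_att, bg_att], axis=-1)


# Toy 2x2 feature map: the top-left pixel differs from the other three.
support_features = np.array([[[1.0, 0.0], [0.0, 1.0]],
                             [[0.0, 1.0], [0.0, 1.0]]])
support_mask = np.zeros((4, 4), dtype=int)
support_mask[:2, :2] = 1  # full-resolution mask, 2x larger than the features

fg, bg = make_probes(support_features, support_mask, factor=2)
fused_query = fuse(support_features, fg, bg)    # reusing the features as a stand-in query
fused_support = fuse(support_features, fg, bg)  # the same fusion applied to the support branch
few_shot_fg = average_probes([fg, fg])
```

In the claimed arrangement the same fusion and decoding stages are applied to both the query and the support features, so the sketch calls `fuse` once for each branch. A real system would pass the fused features to a trained shared decoder to produce the two masks.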

Assignees

Adobe Inc

Inventors

Classifications

  • Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion · CPC title

  • G06V10/82 (Primary)

    using neural networks · CPC title

  • Proximity, similarity or dissimilarity measures · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11367271B2 cover?
Embodiments of the present invention provide systems, methods, and computer storage media for one-shot and few-shot image segmentation on classes of objects that were not represented during training. In some embodiments, a dual prediction scheme may be applied in which query and support masks are jointly predicted using a shared decoder, which aids in similarity propagation between the query an…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 21, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).