Identifying target objects using scale-diverse segmentation neural networks

US2020202533A1 · US · A1

Patent metadata
Publication number: US-2020202533-A1
Application number: US-201816231746-A
Country: US
Kind code: A1
Filing date: Dec 24, 2018
Priority date: Dec 24, 2018
Publication date: Jun 25, 2020
Grant date: not listed

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure relates to systems, non-transitory computer-readable media, and methods for training and utilizing scale-diverse segmentation neural networks to analyze digital images at different scales and identify different target objects portrayed in the digital images. For example, in one or more embodiments, the disclosed systems analyze a digital image and corresponding user indicators (e.g., foreground indicators, background indicators, edge indicators, boundary region indicators, and/or voice indicators) at different scales utilizing a scale-diverse segmentation neural network. In particular, the disclosed systems can utilize the scale-diverse segmentation neural network to generate a plurality of semantically meaningful object segmentation outputs. Furthermore, the disclosed systems can provide the plurality of object segmentation outputs for display and selection to improve the efficiency and accuracy of identifying target objects and modifying the digital image.
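The flow the abstract describes (one user indicator, several scales, one candidate output per scale, then a user selection) can be sketched roughly as follows. This is an illustration under stated assumptions, not the disclosed implementation: the rectangular region is a stand-in for the segmentation the neural network would produce, and the names `Scale`, `scale_box`, `candidate_outputs`, and `output_for_slider` are hypothetical. The assumption that a scale's size is the side of an equal-area square and that its aspect ratio is width divided by height is also ours; the text here only says a scale can comprise a size and an aspect ratio.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


@dataclass(frozen=True)
class Scale:
    size: int            # ASSUMPTION: side of a square with the same area, in pixels
    aspect_ratio: float  # ASSUMPTION: width / height


def scale_box(click: Tuple[int, int], scale: Scale, image_size: Tuple[int, int]) -> Box:
    """Box centered on the clicked pixel, with area size**2, clipped to the image."""
    x, y = click
    img_w, img_h = image_size
    width = scale.size * scale.aspect_ratio ** 0.5
    height = scale.size / scale.aspect_ratio ** 0.5
    left = max(0, round(x - width / 2))
    top = max(0, round(y - height / 2))
    right = min(img_w, round(x + width / 2))
    bottom = min(img_h, round(y + height / 2))
    return (left, top, right, bottom)


def candidate_outputs(click: Tuple[int, int], scales: Sequence[Scale],
                      image_size: Tuple[int, int]) -> List[Box]:
    """Stand-in for the network: one candidate region per scale."""
    return [scale_box(click, s, image_size) for s in scales]


def output_for_slider(outputs: Sequence[Box], position: float) -> Box:
    """Map a slider position in [0, 1] to one of the per-scale outputs."""
    if not outputs:
        raise ValueError("no outputs to choose from")
    position = min(1.0, max(0.0, position))
    index = min(len(outputs) - 1, int(position * len(outputs)))
    return outputs[index]


if __name__ == "__main__":
    scales = [Scale(32, 1.0), Scale(64, 4.0)]
    outputs = candidate_outputs((100, 100), scales, (200, 200))
    print(outputs)
    print(output_for_slider(outputs, 0.9))
```

Returning every candidate, rather than a single best guess, mirrors the abstract's point that the outputs are provided for display and selection.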

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer-readable medium storing instructions that, when executed by at least one processor, cause a computer system to: identify a user indicator comprising one or more pixels of a digital image, the digital image portraying one or more target objects; utilize a scale-diverse segmentation neural network to generate a first object segmentation output at a first scale based the digital image and the user indicator; utilize the scale-diverse segmentation neural network to generate a second object segmentation output at a second scale based the digital image and the user indicator; and provide the first object segmentation output and the second object segmentation output for display.

2. The non-transitory computer-readable medium of claim 1, wherein the scale-diverse segmentation neural network comprises a plurality of output channels corresponding to a plurality of scales and further comprising instructions that, when executed by the at least one processor, cause the computer system to: utilize a first output channel corresponding to the first scale to generate the first object segmentation output; and utilize a second output channel corresponding to the second scale to generate the second object segmentation output.

3. The non-transitory computer-readable medium of claim 1, wherein the first scale comprises a first size and a first aspect ratio and the second scale comprises a second size and a second aspect ratio.

4. The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computer system to: provide a scale slider user interface element for display; in response to identifying user input of a first position corresponding to the first scale via the scale slider user interface element, provide the first object segmentation output for display; and in response to identifying user input of a second position corresponding to the second scale via the scale slider user interface element, provide the second object segmentation output for display.

5. The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computer system to perform at least one of: analyze the digital image and the user indicator utilizing a scale proposal neural network to generate the first scale and the second scale; or determine the first scale based on an amount of time of a user interaction.

6. The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computer system to: apply an object verification model of the scale-diverse segmentation neural network to determine a first object score corresponding to the first scale and a second object score corresponding to the second scale; and provide the first object segmentation output and the second object segmentation output for display based on the first object score and the second object score.

7. The non-transitory computer-readable medium of claim 1, wherein the first object segmentation output comprises at least one of: a segmentation mask or a segmentation boundary.

8. The non-transitory computer-readable medium of claim 1, further comprising instructions that, when executed by the at least one processor, cause the computer system to: identify user input selecting the first object segmentation output; and select pixels of the digital image corresponding to the one or more target objects based on the user input selecting the first object segmentation output.

9. A system comprising: at least one processor; at least one non-transitory computer-readable storage medium comprising: a training digital image portraying a training object; one or more training indicators corresponding to the training object; a first ground truth segmentation corresponding to a first scale, the training object, and the one or more training indicators; and instructions that, when executed by the at least one processor, cause the system to train a scale-diverse segmentation neural network by: utilizing the scale-diverse segmentation neural network to generate a first predicted object segmentation output based on the training digital image and the one or more training indicators at the first scale; and modifying tunable parameters of the scale-diverse segmentation neural network based on a comparison of the first predicted object segmentation output with the first ground truth segmentation corresponding to the first scale, the training object, and the one or more training indicators.

10. The system of claim 9, wherein the scale-diverse segmentation neural network comprises a plurality of output channels corresponding to a plurality of scales and further comprising instructions that, when executed by the at least one processor, cause the system to utilize a first output channel corresponding to the first scale to generate the first predicted object segmentation output.

11. The system of claim 9, wherein the at least one non-transitory computer-readable storage medium further comprises a second ground truth segmentation corresponding to a second scale, the training object, and the one or more training indicators; and further comprising instructions that, when executed by the at least one processor, cause the system to train the scale-diverse segmentation neural network by: utilizing the scale-diverse segmentation neural network to generate a second predicted object segmentation output based on the training digital image and the one or more training indicators at the second scale; and comparing the second predicted object segmentation output with the second ground truth segmentation.

12. The system of claim 9, wherein the training object comprises a first object and a second object and the one or more training indicators comprises an ambiguous training indicator in relation to the training object and the first object.

13. The system of claim 12, further comprising instructions that, when executed by the at least one processor, cause the system to generate the ambiguous training indicator by: identifying a common foreground for the training object and the first object; and sampling the ambiguous training indicator from the common foreground for the training object and the first object.

14. The system of claim 12, wherein the one or more training indicators comprises the ambiguous training indicator and a definitive training indicator and further comprising instructions that, when executed by the at least one processor, cause the system to generate the definitive training indicator by sampling a positive definitive training indicator from a region of the digital image corresponding to the first ground truth segmentation.

15. The system of claim 9, further comprising instructions that, when executed by the at least one processor, cause the system to compare the first ground truth segmentation to a plurality of scales to determine that the first scale corresponds to the first ground truth segmentation.

16. In a digital medium environment for editing digital visual media, a computer-implemented method of identifying digital objects portrayed within the digital visual media using scale variant deep learning, the method comprising: a step for training a scale-diverse segmentation neural network to analyze training indicators corresponding to training digital images and generate object segmentation outputs correspondi
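Two of the training-side steps in the claims lend themselves to a small sketch: matching a ground truth segmentation to one of several scales (claim 15) and sampling an ambiguous training indicator from the foreground shared by two objects (claim 13). The claim text does not say how the comparison to scales is made, so the intersection-over-union test against per-scale boxes below is an assumption, as are the helper names `bounding_box`, `box_iou`, `closest_scale`, and `sample_ambiguous_indicator`. Masks are represented as plain sets of `(x, y)` pixels for brevity.

```python
import random
from typing import List, Set, Tuple

Pixel = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def bounding_box(mask: Set[Pixel]) -> Box:
    """Tightest box around a non-empty mask."""
    xs = [x for x, _ in mask]
    ys = [y for _, y in mask]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def closest_scale(mask: Set[Pixel], scale_boxes: List[Box]) -> int:
    """ASSUMPTION: a ground truth 'corresponds' to the scale box with highest IoU."""
    gt_box = bounding_box(mask)
    return max(range(len(scale_boxes)), key=lambda i: box_iou(gt_box, scale_boxes[i]))


def sample_ambiguous_indicator(mask_a: Set[Pixel], mask_b: Set[Pixel],
                               rng: random.Random) -> Pixel:
    """Pick a pixel that lies in the foreground of both objects."""
    common = sorted(mask_a & mask_b)  # sorted so a seeded rng is reproducible
    if not common:
        raise ValueError("the two masks share no foreground pixels")
    return rng.choice(common)


if __name__ == "__main__":
    part = {(x, y) for x in range(2, 6) for y in range(2, 6)}
    whole = {(x, y) for x in range(0, 8) for y in range(0, 8)}
    boxes = [(0, 0, 4, 4), (0, 0, 8, 8), (0, 0, 16, 16)]
    print(closest_scale(whole, boxes))
    print(sample_ambiguous_indicator(part, whole, random.Random(0)))
```

A click sampled this way is ambiguous by construction, because it falls inside both masks and so does not by itself say which object the user meant.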

Assignees

Adobe Inc

Inventors

Classifications

  • by interactive preprocessing or interactive shape modelling, e.g. feature points assigned by a user

  • Detecting or recognising potential candidate objects based on visual cues, e.g. shapes

  • using neural networks

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]

  • Artificial neural networks [ANN]

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020202533A1 cover?
The present disclosure relates to systems, non-transitory computer-readable media, and methods for training and utilizing scale-diverse segmentation neural networks to analyze digital images at different scales and identify different target objects portrayed in the digital images. For example, in one or more embodiments, the disclosed systems analyze a digital image and corresponding user indic…
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06T7/10. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 25, 2020 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).