Utilizing interactive deep learning to select objects in digital visual media

US10192129B2 · US · B2

Patent metadata
Publication number: US-10192129-B2
Application number: US-201514945245-A
Country: US
Kind code: B2
Filing date: Nov 18, 2015
Priority date: Nov 18, 2015
Publication date: Jan 29, 2019
Grant date: Jan 29, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are disclosed for selecting target objects within digital images. In particular, in one or more embodiments, the disclosed systems and methods generate a trained neural network based on training digital images and training indicators. Moreover, one or more embodiments of the disclosed systems and methods utilize a trained neural network and iterative user indicators to select targeted objects in digital images. Specifically, the disclosed systems and methods can transform user indicators into distance maps that can be utilized in conjunction with color channels and a trained neural network to identify pixels that reflect the target object.
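
The abstract's last sentence (user indicators become distance maps that accompany the color channels as network input) can be illustrated with a minimal sketch. Several details here are assumptions, not taken from the patent text on this page: the Euclidean metric, the truncation value `cap`, the five-channel layout, and the function names `distance_map` and `build_network_input` are all hypothetical.

```python
import numpy as np

def distance_map(shape, points, cap=255.0):
    """Distance from every pixel to the nearest indicated pixel.

    shape  -- (height, width) of the image
    points -- list of (row, col) pixels the user indicated
    cap    -- assumed truncation value (not specified on this page)
    """
    h, w = shape
    if not points:
        # No indicator of this type: treat every pixel as maximally far.
        return np.full((h, w), cap, dtype=np.float32)
    rows, cols = np.mgrid[0:h, 0:w]
    pts = np.asarray(points, dtype=np.float32)
    # Broadcast to (num_points, h, w), then keep the nearest point per pixel.
    d = np.sqrt((rows[None] - pts[:, 0, None, None]) ** 2
                + (cols[None] - pts[:, 1, None, None]) ** 2)
    return np.minimum(d.min(axis=0), cap).astype(np.float32)

def build_network_input(rgb, positive_points, negative_points):
    """Stack the color channels with positive and negative distance maps."""
    pos = distance_map(rgb.shape[:2], positive_points)
    neg = distance_map(rgb.shape[:2], negative_points)
    return np.dstack([rgb.astype(np.float32), pos, neg])

# Usage: one positive click at row 1, column 2 on a tiny blank image.
image = np.zeros((4, 5, 3), dtype=np.uint8)
x = build_network_input(image, positive_points=[(1, 2)], negative_points=[])
print(x.shape)
```

In this sketch the clicked pixel has distance 0 in the positive map, and the stacked array would be what a trained segmentation network receives in place of a plain RGB image.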

First claim

Opening claim text (preview).

We claim:

1. In a digital medium environment for editing digital visual media, a method of interactively selecting digital objects represented within digital visual media using deep learning, the method comprising: accessing a neural network trained utilizing a repository of digital training images having target objects, training ground truth masks of the target objects, training indicators, and training distance maps, each training distance map reflecting distances between a corresponding training indicator and pixels of a corresponding digital training image; identifying a user indicator with regard to a probe digital image, the user indicator comprising one or more pixels of the probe digital image identified by a user and an indication of how the one or more pixels correspond to a target object represented in the probe digital image; generating a distance map reflecting distances between the user indicator and pixels of the probe digital image; and identifying a set of pixels representing the target object in the probe digital image by providing the probe digital image, the user indicator, and the generated distance map as input to the trained neural network.

2. The method of claim 1, wherein identifying the set of pixels representing the target object in the probe digital image further comprises generating a probability map by utilizing the trained neural network based on the user indicator, the probability map comprising values that reflect a likelihood that the set of pixels correspond to the target object.

3. The method of claim 2, further comprising, comparing the values of the probability map to a threshold, wherein the method identifies the set of pixels representing the target object in the probe digital image based on the comparison.

4. The method of claim 1, wherein: the user indicator comprises one or more of: user input of a point corresponding to a pixel, user input of a stroke corresponding to a plurality of pixels, user input of a bounding box encompassing a plurality of pixels, or user input of a boundary corresponding to a plurality of pixels; and the user indicator comprises the indication of how the one or more pixels correspond to the target object in the probe digital image, the indication comprising one or more of: an indication that the target object in the probe digital image is encompassed by the one or more pixels, an indication that the one or more pixels are within the target object in the probe digital image, an indication that the one or more pixels are outside the target object in the probe digital image, or an indication that the one or more pixels are near the target object in the probe digital image.

5. The method of claim 1, wherein the training indicators comprise: a positive training indicator, the positive training indicator comprising at least one pixel of a digital training image, wherein the digital training image comprises an identified target object, and the at least one pixel is part of the identified target object; and a negative training indicator, the negative training indicator comprising at least one background pixel of the digital training image, wherein the at least one background pixel is not part of the identified target object.

6. The method of claim 1, further comprising generating the training indicators by: generating a first negative training indicator comprising a first pixel of a digital training image by randomly sampling the first pixel from a first plurality of pixels that are not part of an identified target object in the digital training image; generating a second negative training indicator comprising a second pixel of the digital training image by randomly sampling the second pixel from a second plurality of pixels that are part of an untargeted object in the digital training image; or generating a third negative training indicator comprising a third pixel of the digital training image, by sampling the third pixel from the first plurality of pixels that are not part of the identified target object based on a distance between the third pixel and another negative training indicator.

7. The method of claim 1, further comprising generating a training distance map for a digital training image from the repository of digital training images, wherein the digital training image comprises an identified target object, wherein a training indicator comprises an indicated pixel of the digital training image, and wherein the training distance map comprises distances between pixels in the digital training image and the indicated pixel.

8. The method of claim 5, further comprising providing the neural network with a positive training distance map, a negative training distance map, and a color channel, wherein the positive training distance map reflects a distance between a pixel in a digital training image and a positive training indicator, the negative training distance map reflects a distance between a pixel in the digital training image and a negative training indicator, and the color channel reflects a color of a pixel in the digital training image.

9. The method of claim 1, further comprising: identifying a second user indicator with regard to the probe digital image, the second user indicator comprising a second group of one or more pixels from the probe digital image identified by the user; generating a second distance map reflecting distances between the second user indicator and the pixels of the probe digital image; and generating a second set of pixels representing the target object in the probe digital image based on the second distance map.

10. The method of claim 1, wherein the probe digital image is part of a digital video, wherein the probe digital image is followed sequentially by a second digital image in the digital video, wherein the second digital image comprises a modified target object corresponding to the target object represented in the probe digital image, and the method further comprising: utilizing the trained neural network, identifying a second set of pixels representing the modified target object in the second digital image based on the identified set of pixels representing the target object in the probe digital image.

11. A non-transitory computer readable medium storing instructions that, when executed by at least one processor, cause a computer system to: access a neural network trained utilizing digital training images having target objects, training ground truth masks of the target objects, training indicators, and training distance maps, each training distance map reflecting distances between a corresponding training indicator and pixels of a corresponding digital training image; identify a user indicator with regard to a probe digital image, the user indicator comprising one or more pixels of the probe digital image identified by a user and an indication of how the one or more pixels correspond to a target object represented in the probe digital image; generate a distance map reflecting distances between the user indicator and pixels of the probe digital image; and identify a set of pixels representing the target object in the probe digital image by providing the probe digital image, the user indicator, and the generated distance map as input to the trained neural network.

12. The non-transitory computer readable medium of claim 11, wherein: the user indicator comprises a positive user indicator comprising at least one pixel that is part of the target object; and the distance map reflects distances between the at least one pixel that is part of the target object and pixels of the probe digital image.

13. The non-transitory computer readable me
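
Two steps in the claims lend themselves to a short illustration: comparing a probability map to a threshold (claims 2 and 3) and sampling negative training indicators from background pixels (claim 6). The sketch below is one possible reading only. The threshold of 0.5, the minimum-spacing rule used to interpret "based on a distance between the third pixel and another negative training indicator", and the names `select_pixels` and `sample_negative_indicators` are all assumptions, not details stated in the claims.

```python
import numpy as np

def select_pixels(probability_map, threshold=0.5):
    """Boolean mask of pixels whose probability exceeds the threshold.

    The default of 0.5 is an assumed value for illustration.
    """
    return probability_map > threshold

def sample_negative_indicators(target_mask, count, min_spacing, rng):
    """Randomly pick background pixels as negative indicators.

    A candidate is rejected when it lies closer than min_spacing to a
    negative indicator already chosen (an assumed spacing rule).
    """
    candidates = np.argwhere(~target_mask)  # (row, col) of background pixels
    chosen = []
    for idx in rng.permutation(len(candidates)):
        p = candidates[idx]
        if all(np.hypot(*(p - q)) >= min_spacing for q in chosen):
            chosen.append(p)
        if len(chosen) == count:
            break
    return [tuple(int(v) for v in p) for p in chosen]

# Usage: a 2x2 target object in the middle of a 6x6 ground truth mask.
mask = np.zeros((6, 6), dtype=bool)
mask[2:4, 2:4] = True
rng = np.random.default_rng(0)
negatives = sample_negative_indicators(mask, count=3, min_spacing=2.0, rng=rng)

# A stand-in probability map; a real one would come from the trained network.
probs = np.where(mask, 0.9, 0.1)
selection = select_pixels(probs)
```

Spacing the negative samples apart is a design choice made here so that the simulated clicks do not cluster; the claim language would also admit other distance-based rules.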

Assignees

Adobe Systems Inc

Inventors

Classifications

  • Detecting or recognising potential candidate objects based on visual cues, e.g. shapes · CPC title

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • G06N3/08 · Primary

    Learning methods · CPC title

  • Combinations of networks · CPC title

  • by interactive preprocessing or interactive shape modelling, e.g. feature points assigned by a user · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10192129B2 cover?
Systems and methods are disclosed for selecting target objects within digital images. In particular, in one or more embodiments, the disclosed systems and methods generate a trained neural network based on training digital images and training indicators. Moreover, one or more embodiments of the disclosed systems and methods utilize a trained neural network and iterative user indicators to selec…
Who is the assignee on this patent?
Adobe Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 29, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).