Instance-level semantic segmentation system

US10424064B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10424064-B2
Application number	US-201615296845-A
Country	US
Kind code	B2
Filing date	Oct 18, 2016
Priority date	Oct 18, 2016
Publication date	Sep 24, 2019
Grant date	Sep 24, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Certain aspects involve semantic segmentation of objects in a digital visual medium by determining a score for each pixel of the digital visual medium that is representative of a likelihood that each pixel corresponds to the objects associated with bounding boxes within the digital visual medium. An instance-level label that yields a label for each of the pixels of the digital visual medium corresponding to the objects is determined based, in part, on a collective probability map including the score for each pixel of the digital visual medium. In some aspects, the score for each pixel corresponding to each bounding box is determined by a prediction model trained by a neural network.
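The abstract describes combining per-bounding-box pixel scores into a collective probability map and deriving an instance-level label per pixel from it. The following is a minimal sketch of that idea, not the patented implementation. The function name `collective_labels`, the weighting by class score followed by a per-pixel maximum, and the `threshold` parameter are all assumptions made for illustration, and the conditional random field refinement mentioned in the claims is omitted.

```python
from typing import List, Sequence

Grid = List[List[float]]  # one probability per pixel, indexed as grid[y][x]


def collective_labels(
    prob_maps: Sequence[Grid],
    class_scores: Sequence[float],
    threshold: float = 0.5,
) -> List[List[int]]:
    """Label each pixel with the 1-based index of its best box, or 0 for background.

    Assumption: each box's probability map is weighted by that box's class
    score, and a pixel goes to the box with the highest weighted score,
    provided that score exceeds `threshold`.
    """
    if not prob_maps:
        raise ValueError("at least one probability map is required")
    if len(prob_maps) != len(class_scores):
        raise ValueError("one class score is required per probability map")

    height, width = len(prob_maps[0]), len(prob_maps[0][0])
    labels = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_label, best_score = 0, threshold
            for idx, (pmap, score) in enumerate(zip(prob_maps, class_scores), start=1):
                weighted = pmap[y][x] * score
                if weighted > best_score:
                    best_label, best_score = idx, weighted
            labels[y][x] = best_label
    return labels


# Two hypothetical boxes on a 2x3 image.
map_a = [[0.9, 0.8, 0.1], [0.9, 0.6, 0.0]]
map_b = [[0.0, 0.7, 0.9], [0.1, 0.9, 0.95]]
print(collective_labels([map_a, map_b], class_scores=[1.0, 0.8]))
```

Because each box keeps its own label index, two objects of the same class end up with different labels, which is the distinction between instance-level and class-level segmentation that the first claim draws.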

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for semantic segmentation of one or more objects in a digital visual medium, comprising: accessing, by a processing device, a set of bounding boxes potentially corresponding to a set of target objects within the digital visual medium; for each of the set of bounding boxes, determining, by the processing device, a pixel score for each pixel of the digital visual medium corresponding to the set of bounding boxes, the pixel score being representative of a likelihood that each pixel corresponds to the set of target objects associated with the set of bounding boxes; determining, by the processing device and for each pixel of the digital visual medium, an instance-level label that distinguishes a first set of pixels corresponding to a first object from a second set of pixels corresponding to a second object of a same class as the first object, each instance-level label determined based, at least in part, on a collective probability map including the pixel score for each pixel; and applying, by the processing device, at least some of the determined instance-level labels to at least some of the pixels of the digital visual medium.

2. The computer-implemented method of claim 1, wherein determining the pixel score comprises employing a prediction model trained by a neural network.

3. The computer-implemented method of claim 2, wherein the method further comprises training the neural network, said training comprising: receiving, by the processing device, a training visual medium having a first bounding box corresponding to a training target object within the training visual medium; generating, by the processing device and based on the first bounding box, a plurality of bounding boxes corresponding to the training target object within the training visual medium, the first bounding box and the plurality of bounding boxes together forming a training set of bounding boxes; generating, by the processing device, a plurality of distance maps, each distance map in the plurality of distance maps corresponding to a respective bounding box of the training set of bounding boxes; concatenating, by the processing device, the training visual medium with each distance map in the plurality of distance maps to generate a plurality of training pairs; and training, by the processing device and based on at least one training pair of the plurality of training pairs, the neural network to segment pixels of the training visual medium corresponding to the training target object.

4. The computer-implemented method of claim 3, wherein the neural network is a convolutional encoder-decoder network including: a convolutional encoder network having one or more convolutional layers for training filters to recognize one or more features of the one or more target objects, and one or more pooling layers for manipulating a spatial size of the at least one training pair; and a convolutional decoder network having one or more deconvolutional layers and one or more unpooling layers for reconstructing details of the digital visual medium, wherein training the neural network based on the at least one training pair includes inputting the at least one training pair to the convolutional encoder network and the convolutional decoder network to generate a binary instance mask corresponding to the training target object.

5. The computer-implemented method of claim 1, wherein the set of bounding boxes is received based on an object detection algorithm, wherein receiving the set of bounding boxes includes receiving class scores associated with the set of bounding boxes.

6. The computer-implemented method of claim 1, wherein the set of bounding boxes is received based on an object detection algorithm, wherein class scores corresponding to the set of bounding boxes are received based on a classification algorithm.

7. The computer-implemented method of claim 1, wherein the collective probability map is generated based on a plurality of probability maps for each bounding box of the set of bounding boxes, wherein each probability map of the plurality of probability maps is weighted based on class scores corresponding to each bounding box.

8. The computer-implemented method of claim 1, wherein determining the instance-level label includes using probabilities of the collection probability map to identify a compatibility between adjacent pixels corresponding to at least one of the set of target objects, the compatibility being identified using a conditional random field model.

9. A computing system for semantic segmentation of one or more objects in a digital visual medium, the computing system comprising: means for storing a plurality of digital media, the digital media including a digital visual medium having a bounding box set, the bounding box set including at a first bounding box potentially corresponding to a target object within the digital visual medium and a second bounding box potentially corresponding to a second target object within the digital visual medium; and means for determining, for each bounding box in the bounding box set, a pixel score for each pixel of the digital visual medium corresponding to each bounding box of the bounding box set, the pixel score being representative of a likelihood that each pixel corresponds to the target object associated with the at least one bounding box, said means being communicatively coupled to the means for storing the plurality of digital media; means for determining for each pixel of the digital visual medium, an instance-level label that distinguishes a first set of pixels corresponding to the first bounding box from a second set of pixels corresponding the second bounding box, each instance-level label determined based, at least in part, on a collective probability map including the pixel score for each pixel; and means for assigning at least some of the determined instance-level labels to at least some of the pixels in the digital visual medium.

10. The computing system of claim 9, wherein the means for determining the pixel score includes a neural network and a prediction model trained by the neural network.

11. The computing system of claim 10, further comprising a means for training the neural network by performing operations comprising: generating, based a training visual medium having a training target object and a first bounding box corresponding to the training target object, a plurality of bounding boxes corresponding to the training target object, the first bounding box and the plurality of bounding boxes together forming a training set of bounding boxes; generating a plurality of distance maps, each distance map in the plurality of distance maps corresponding to a respective bounding box of the training set of bounding boxes; concatenating the training visual medium with each distance map in the plurality of distance maps to generate a plurality of training pairs; and training, based on at least one training pair of the plurality of training pairs, the neural network to segment pixels of the training visual medium corresponding to the training target object.

12. The computing system of claim 11, wherein the neural network is a convolutional encoder-decoder network including: a convolutional encoder network having one or more convolutional layers for training filters to recognize one or more features of the target object and one or more pooling layers for manipulating a spatial size of the at least one training pair; and a convolutional decoder network having one or more deconvolutional layers and one or more unpooling layers for reconstructing details of the digital visual medium.
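Claims 3 and 11 describe a training-data step: derive extra bounding boxes from a first box, build one distance map per box, and concatenate each map with the training image to form training pairs. The sketch below illustrates that pipeline under stated assumptions. This page does not define the distance map, so the truncated Euclidean distance to the nearest box edge used here is a guess, and the helper names (`distance_map`, `jitter_box`, `training_pairs`) and the parameters `max_dist`, `n_jitter`, `max_shift`, and `seed` are hypothetical.

```python
import math
import random
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive pixels
Image = List[List[List[float]]]  # image[y][x] is a list of channel values


def distance_map(height: int, width: int, box: Box, max_dist: float = 128.0) -> List[List[float]]:
    """Assumed definition: truncated distance from each pixel to the box boundary."""
    x0, y0, x1, y1 = box
    dmap = []
    for y in range(height):
        row = []
        for x in range(width):
            dx = max(x0 - x, 0, x - x1)  # horizontal gap to the box (0 if aligned)
            dy = max(y0 - y, 0, y - y1)  # vertical gap to the box (0 if aligned)
            if dx > 0 or dy > 0:
                dist = math.hypot(dx, dy)  # pixel lies outside the box
            else:
                dist = min(x - x0, x1 - x, y - y0, y1 - y)  # inside: nearest edge
            row.append(min(float(dist), max_dist))
        dmap.append(row)
    return dmap


def jitter_box(box: Box, height: int, width: int, max_shift: int, rng: random.Random) -> Box:
    """Randomly shift each box edge by up to `max_shift` pixels, clamped to the image."""
    def shift(value: int, upper: int) -> int:
        return min(max(value + rng.randint(-max_shift, max_shift), 0), upper)

    x0, x1 = shift(box[0], width - 1), shift(box[2], width - 1)
    y0, y1 = shift(box[1], height - 1), shift(box[3], height - 1)
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


def training_pairs(image: Image, box: Box, n_jitter: int = 3, max_shift: int = 2, seed: int = 0):
    """Return the training set of boxes and one image-plus-distance-channel pair per box."""
    height, width = len(image), len(image[0])
    rng = random.Random(seed)
    boxes = [box] + [jitter_box(box, height, width, max_shift, rng) for _ in range(n_jitter)]
    pairs = []
    for b in boxes:
        dmap = distance_map(height, width, b)
        # Concatenate: append the distance value as an extra channel of every pixel.
        pairs.append([[image[y][x] + [dmap[y][x]] for x in range(width)] for y in range(height)])
    return boxes, pairs


blank_rgb = [[[0.0, 0.0, 0.0] for _ in range(4)] for _ in range(3)]  # 3x4 RGB image
boxes, pairs = training_pairs(blank_rgb, (1, 1, 2, 1))
print(len(boxes), len(pairs), len(pairs[0][0][0]))
```

Each resulting pair has one more channel than the input image, so a network trained on these pairs would take, for an RGB image, a four-channel input. Perturbing the box presumably makes the trained model tolerant of imprecise detector output, though the page itself does not state the motivation.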

Assignees

Adobe Inc

Inventors

Classifications

G06N3/045 · Combinations of networks
G06N3/08 (Primary) · Learning methods
G06T7/11 (Primary) · Region-based segmentation
G06F16/48 · Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06T7/12 · Edge-based segmentation

Patent family

Related publications grouped by family.

View patent family 59771725

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10424064B2 cover?: Certain aspects involve semantic segmentation of objects in a digital visual medium by determining a score for each pixel of the digital visual medium that is representative of a likelihood that each pixel corresponds to the objects associated with bounding boxes within the digital visual medium. An instance-level label that yields a label for each of the pixels of the digital visual medium cor…
Who is the assignee on this patent?: Adobe Inc
What technology area does this patent fall under?: Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?: Publication date Sep 24, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Method and apparatus for neural network training and construction and method and apparatus for object detection

Image processing apparatus and method based on deep learning and neural network learning

Generic object detection in images
