Who is the assignee on this patent?

Microsoft Technology Licensing LLC

What technology area does this patent fall under?

Primary CPC classification G06T5/10. Mapped technology areas include Physics.

When was this patent published?

Publication date Jan 9, 2018 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Image semantic segmentation

US9865042B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9865042-B2
Application number	US-201514801839-A
Country	US
Kind code	B2
Filing date	Jul 17, 2015
Priority date	Jun 8, 2015
Publication date	Jan 9, 2018
Grant date	Jan 9, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In implementations of the subject matter described herein, the feature maps are obtained by convoluting an input image using a plurality of layers of convolution filters. The feature maps record semantic information for respective regions on the image and only need to be computed once. Segment features of the image are extracted from the convolutional feature maps. Particularly, the binary masks may be obtained from a set of candidate segments of the image. The binary masks are used to mask the feature maps instead of the raw image. The masked feature maps define the segment features. The semantic segmentation of the image is done by determining a semantic category for each pixel in the image at least in part based on the resulting segment features.
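
The masking step in the abstract can be illustrated with a minimal sketch. It assumes the binary masks are already at feature-map resolution, that activations are non-negative (so zeroing means "ignored"), and that a global max pool stands in for whatever pooling the implementation uses. The names `mask_feature_maps` and `max_pool` are hypothetical and do not come from the patent.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]
Mask = List[List[int]]                # indexed as [row][col], values 0 or 1


def mask_feature_maps(features: FeatureMap, mask: Mask) -> FeatureMap:
    """Zero out activations that fall outside the candidate segment."""
    return [
        [[value * keep for value, keep in zip(f_row, m_row)]
         for f_row, m_row in zip(channel, mask)]
        for channel in features
    ]


def max_pool(features: FeatureMap) -> List[float]:
    """Global max pooling, one value per channel (assumed stand-in pooling)."""
    return [max(max(row) for row in channel) for channel in features]


# Two channels on a 2x3 grid, computed once for the whole image.
features = [
    [[1.0, 5.0, 2.0], [0.5, 3.0, 9.0]],
    [[4.0, 0.0, 1.0], [7.0, 2.0, 6.0]],
]
# Two candidate segments, already at feature-map resolution (assumption).
left_segment = [[1, 1, 0], [1, 0, 0]]
right_segment = [[0, 0, 1], [0, 1, 1]]

# The same feature maps are reused for every segment; only the mask changes.
left_feature = max_pool(mask_feature_maps(features, left_segment))
right_feature = max_pool(mask_feature_maps(features, right_segment))
print(left_feature, right_feature)  # [5.0, 7.0] [9.0, 6.0]
```

The point of the sketch is that the convolutional pass runs once, while each candidate segment only costs a cheap element-wise mask and a pooling step.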

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: applying a sequence of convolution filtering on an image to obtain feature maps, the feature maps including a plurality of activations, each of the activations representing semantic information for a region on the image; masking the feature maps with binary masks to generate segment features of the image, each of the binary masks representing a candidate segment of the image; and determining a semantic category for each pixel in the image at least in part based on the segment features.
2. The method of claim 1, wherein masking the feature maps comprises: generating low-resolution binary masks based on the binary masks and the feature maps; and applying the low-resolution binary masks onto the feature maps to generate the segment features.
3. The method of claim 2, wherein generating the low-resolution binary masks comprises: projecting each of the activations on the feature maps to a center of the respective region on the image; associating each pixel in the binary masks with the nearest center; and assigning each pixel in the binary masks to one of the activations on the feature maps based on the associated center.
4. The method of claim 3, wherein generating the low-resolution binary masks further comprises: averaging values of pixels assigned to each of the activations; and generating the low-resolution binary masks by comparing the averaged values and a predetermined threshold.
5. The method of claim 1, wherein masking the feature maps comprises: directly masking the feature maps to generate the segment features, and wherein determining the semantic category for each pixel in the image comprises: pooling the segment features; and connecting the pooled segment features.
6. The method of claim 5, wherein determining the semantic category for each pixel in the image further comprises: pooling regional features on the feature maps, each of the regional features being represented by a bounding box; connecting the pooled regional features; and determining the semantic category for each pixel in the image based on a concatenation of the connected segment features and the connected regional features.
7. The method of claim 5, wherein at least one of the segment features and the regional features are pooled by spatial pyramid pooling (SPP).
8. The method of claim 1, wherein masking the feature maps comprises: pooling the generated feature maps by spatial pyramid pooling (SPP) to obtain multiple levels of a pooled feature map; and masking the pooled feature map of a tiny level from the multiple levels to generate the segment features.
9. The method of claim 8, wherein determining the semantic category for each pixel in the image comprises: connecting the segment features and the pooled feature map of other levels among from the multiple levels.
10. A computer program product being tangibly stored on a non-transient machine-readable medium and comprising machine-executable instructions, the instructions, when executed on a device, causing the device to: apply a sequence of convolution filtering on an image to obtain feature maps, the feature maps including a plurality of activations, each of the activations representing semantic information for a region on the image; mask the feature maps with binary masks to generate segment features of the image, each of the binary masks representing a candidate segment of the image; and determine a semantic category for each pixel in the image at least in part based on the segment features.
11. The computer program product of claim 10, wherein the instructions, when executed on the device, cause the device to: generate low-resolution binary masks based on the binary masks and the feature maps; and apply the low-resolution binary masks onto the feature maps to generate the segment features.
12. The computer program product of claim 10, wherein the instructions, when executed on the device, cause the device to: project each of the activations on the feature maps to a center of the respective region on the image; associate each pixel in the binary masks with the nearest center; and assign each pixel in the binary masks to one of the activations on the feature maps based on the associated center.
13. The computer program product of claim 10, wherein the instructions, when executed on the device, cause the device to: average values of pixels assigned to each of the activations; and generate the low-resolution binary masks by comparing the averaged values and a predetermined threshold.
14. A computing device, comprising: at least one memory and at least one processor, wherein the at least one memory and the at least one memory are respectively configured to store and execute instructions for causing the computing device to perform operations, the operations including: applying a sequence of convolution filtering on an image to obtain feature maps, the feature maps including a plurality of activations, each of the activations representing semantic information for a region on the image; masking the feature maps with binary masks to generate segment features of the image, each of the binary masks representing a candidate segment of the image; and determining a semantic category for each pixel in the image at least in part based on the segment features.
15. The computing device of claim 14, wherein masking the feature maps comprises: generating low-resolution binary masks based on the binary masks and the feature maps; and applying the low-resolution binary masks onto the feature maps to generate the segment features.
16. The computing device of claim 15, wherein generating the low-resolution binary masks comprises: projecting each of the activations on the feature maps to a center of the respective region on the image; associating each pixel in the binary masks with the nearest center; and assigning each pixel in the binary masks to one of the activations on the feature maps based on the associated center.
17. The computing device of claim 16, wherein generating the low-resolution binary masks further comprises: averaging values of pixels assigned to each of the activations; and generating the low-resolution binary masks by comparing the averaged values and a predetermined threshold.
18. The computing device of claim 16, wherein determining the semantic category for each pixel in the image further comprises: pooling regional features on the feature maps, each of the regional features being represented by a bounding box; connecting the pooled regional features; and determining the semantic category for each pixel in the image based on a concatenation of the connected segment features and the connected regional features.
19. The computing device of claim 16, wherein at least one of the segment features and the regional features are pooled by spatial pyramid pooling (SPP).
20. The computing device of claim 14, wherein masking the feature maps comprises: directly masking the feature maps to generate the segment features, and wherein determining the semantic category for each pixel in the image comprises: pooling the segment features; and connecting the pooled segment features.
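
The low-resolution mask generation recited in claims 2 to 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: activation `k` along each axis is assumed to project to image coordinate `stride * k + offset`, the comparison against the threshold is assumed to be "greater than or equal", and `stride`, `offset`, the default `threshold` of 0.5, and the function names are all hypothetical.

```python
from typing import List

Mask = List[List[int]]  # indexed as [row][col], values 0 or 1


def nearest_center(pixel: int, count: int, stride: int, offset: float) -> int:
    """Index of the activation whose projected center is closest to `pixel`.

    Assumes activation k projects to coordinate stride * k + offset.
    Ties go to the lower index.
    """
    return min(range(count), key=lambda k: abs(pixel - (stride * k + offset)))


def low_res_mask(mask: Mask, out_rows: int, out_cols: int,
                 stride: int, offset: float, threshold: float = 0.5) -> Mask:
    """Average the image-resolution mask per activation, then threshold."""
    sums = [[0.0] * out_cols for _ in range(out_rows)]
    counts = [[0] * out_cols for _ in range(out_rows)]
    for y, row in enumerate(mask):
        i = nearest_center(y, out_rows, stride, offset)
        for x, value in enumerate(row):
            j = nearest_center(x, out_cols, stride, offset)
            sums[i][j] += value   # accumulate pixels assigned to activation (i, j)
            counts[i][j] += 1
    return [
        [1 if counts[i][j] and sums[i][j] / counts[i][j] >= threshold else 0
         for j in range(out_cols)]
        for i in range(out_rows)
    ]


# A 4x4 candidate segment reduced to a 2x2 feature-map grid.
segment = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 1],
]
small = low_res_mask(segment, out_rows=2, out_cols=2, stride=2, offset=0.5)
print(small)  # [[1, 0], [1, 1]]
```

With these assumed parameters each activation collects a 2x2 block of pixels; the block averages are 0.75, 0.0, 0.75, and 1.0, so only the top-right activation is masked out.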

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

G06T5/10 (Primary) · using non-spatial domain filtering
G06T2207/20081 · Training; Learning
G06T7/11 (Primary) · Region-based segmentation
G06T2207/20084 · Artificial neural networks [ANN]

Patent family

Related publications grouped by family.

View patent family 57452254

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9865042B2 cover?: In implementations of the subject matter described herein, the feature maps are obtained by convoluting an input image using a plurality of layers of convolution filters. The feature maps record semantic information for respective regions on the image and only need to be computed once. Segment features of the image are extracted from the convolutional feature maps. Particularly, the binary mask…