Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06N3/084 (backpropagation, e.g. using gradient descent). Mapped technology areas include Physics (CPC section G).

When was this patent published?

Publication date: Jan 2, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Object detection and classification in images

US9858496B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9858496-B2
Application number	US-201615001417-A
Country	US
Kind code	B2
Filing date	Jan 20, 2016
Priority date	Jan 20, 2016
Publication date	Jan 2, 2018
Grant date	Jan 2, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and computer-readable media for providing fast and accurate object detection and classification in images are described herein. In some examples, a computing device can receive an input image. The computing device can process the image, and generate a convolutional feature map. In some configurations, the convolutional feature map can be processed through a Region Proposal Network (RPN) to generate proposals for candidate objects in the image. In various examples, the computing device can process the convolutional feature map with the proposals through a Fast Region-Based Convolutional Neural Network (FRCN) proposal classifier to determine a class of each object in the image and a confidence score associated therewith. The computing device can then provide a requestor with an output including the object classification and/or confidence score.
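The abstract says the proposal classifier outputs a class and an associated confidence score for each object, but it does not say how the score is computed. The sketch below assumes one common approach: a softmax over raw per-class scores, with the top class as the category and its probability as the confidence. The function name `classify_proposal` and the example category names are hypothetical and do not come from the patent.

```python
import math


def classify_proposal(scores, categories):
    """Return (category, confidence) for one proposal.

    ASSUMPTION: confidence is the softmax probability of the top class.
    The abstract does not specify this; it is an illustrative choice.
    """
    if len(scores) != len(categories) or not scores:
        raise ValueError("scores and categories must be non-empty and equal length")
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pick the index of the most probable class.
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs[best]


# Hypothetical raw scores for three made-up categories.
category, confidence = classify_proposal([2.0, 1.0, 0.1], ["cat", "dog", "background"])
```

The returned pair corresponds to the "object classification and/or confidence score" that the abstract says is provided to the requestor.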

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving an input image; generating a convolutional feature map; identifying, by a first type of neural network, a candidate object in the input image; determining, by a second type of neural network, a category of the candidate object; and assigning a confidence score to the category of the candidate object, wherein the first type of neural network comprises a translation invariant component configured to: classify an anchor based on overlap with a ground-truth box; and predict a shift and a scale of the anchor.

2. A method as claim 1 recites, wherein the convolutional feature map is generated by a Zeiler and Fergus model or a Simonyan and Zisserman model deep convolutional neural network.

3. A method as claim 1 recites, further comprising training the convolutional feature map, the first type of neural network, and the second type of neural network using at least one of: stochastic gradient descent; or back-propagation.

4. A method as comprising: receiving an input image; generating a convolutional feature map; identifying, by a first type of neural network, a candidate object in the input image, wherein the identifying the candidate object in the input image comprises: generating one or more anchors at a point of the input image; determining an overlap of individual ones of the one or more anchors to a ground-truth box; assigning a label to each anchor of the one or more anchors based at least in part on the overlap; assigning a score to the label based at least in part on the overlap; and identifying the candidate object at the point based at least in part on the score; determining, by a second type of neural network, a category of the candidate object, wherein the first type of neural network and the second type of neural network share at least one algorithm; and assigning a confidence score to the category of the candidate object.

5. A method as claim 4 recites, wherein the identifying the candidate object in the input image further comprises: identifying an anchor corresponding to a highest score, the highest score corresponding to a percentage of the overlap; shifting the anchor corresponding to the highest score to better define the candidate object; and scaling the anchor corresponding to the highest score to better define the candidate object.

6. A method as claim 4 recites, wherein the generating the one or more anchors at the point of the input image comprises generating a set of anchor boxes; the set anchor boxes having three scales and three aspect ratios.

7. A method as claim 4 recites, wherein the label is positive when the overlap exceeds a threshold level.

8. A system comprising: a processor; and a computer-readable medium including instructions for an object detection and classification network, for execution by the processor, the object detection and classification network comprising: an initial processing module configured to input an image and generate a convolutional feature map; an object proposal module configured to generate a proposal corresponding to a candidate object in the image, and further comprising a translation invariant component configured to: classify an anchor based on overlap with a ground-truth box; and predict a shift and a scale of the anchor; and a proposal classifier module configured to assign a category associated with the candidate object, wherein the object proposal module and the proposal classifier module share at least one convolutional layer.

9. A system as claim 8 recites, wherein the proposal classifier module is further configured to assign a confidence score to the classification.

10. A system as claim 8 recites, wherein the object proposal module is further configured to: generate one or more anchors at a point of the image; determine an overlap of each anchor of the one or more anchors to a ground-truth box; assign a label to each anchor of the one or more anchor based at least in part on the overlap; assign a score to the label based at least in part on the overlap; select an anchor with a highest score; and generate the proposal based at least in part on the highest score.

11. A system comprising: a processor; and a computer-readable medium including instructions for an object detection and classification network, for execution by the processor, the object detection and classification network comprising: an initial processing module configured to input an image and generate a convolutional feature map; an object proposal module configured to generate a proposal corresponding to a candidate object in the image, wherein the object proposal module is further configured to: identify an anchor corresponding to a highest score, the highest score corresponding to a percentage of the overlap; shift the anchor corresponding to the highest score to better define the candidate object; or scale the anchor corresponding to the highest score to better define the candidate object; and a proposal classifier module configured to assign a category associated with the candidate object, wherein the object proposal module and the proposal classifier module share at least one convolutional layer.

12. A system comprising: a processor; a computer-readable medium including instructions for an object detection and classification network, for execution by the processor, the object detection and classification network comprising: an initial processing module configured to input an image and generate a convolutional feature map; an object proposal module configured to generate a proposal corresponding to a candidate object in the image; and a proposal classifier module configured to assign a category associated with the candidate object, wherein the object proposal module and the proposal classifier module share at least one convolutional layer; and a machine learning module configured to: train one or more parameters of the initial processing module and the object proposal module to generate one or more proposals on a training image; and train one or more parameters of the proposal classifier module to assign a category to each of the one or more proposals on the training image.

13. A system as claim 12 recites, wherein the machine learning module is further configured to train the one or more parameters of the initial processing module, the object proposal module, and the proposal classifier module using one or more of: stochastic gradient descent; or back-propagation.

14. A non-transitory computer readable storage medium having instructions stored thereon, the instructions when executed by a computing device cause the computing device to: receive an input image; generate a convolutional feature map; generate one or more anchors at a point of the input image; determine an overlap of individual ones of the one or more anchors to a ground-truth box; assign a label to each anchor of the one or more anchors based at least in part on the overlap; assign a score to the label based at least in part on the overlap; identify, by a neural network, a candidate object in the input image, the candidate object at the point based at least in part on the score; determine, by a proposal classifier sharing an algorithm with the neural network, a category of the candidate object; and assign, by the proposal classifier, a confidence score to the category of the candidate object.

15. A non-transitory computer readable storage medium as claim 14 recites, wherein the neural network is a region proposal network and the proposal classifier is a co
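
The anchor steps recited in the claims (a set of anchor boxes with three scales and three aspect ratios, overlap with a ground-truth box, a positive label above a threshold, selection of the highest-scoring anchor, then a shift and a scale) can be illustrated with a short sketch. This is not the patented implementation. It assumes that overlap is measured as intersection-over-union, and the concrete scale values, aspect ratios, threshold of 0.7, and the center/log-size form of the shift and scale are all hypothetical choices that the claim preview does not specify.

```python
import math
from itertools import product


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Build 3 x 3 = 9 anchors (x1, y1, x2, y2) centered on the point (cx, cy).

    ASSUMPTION: the scale and ratio values are illustrative; the claims
    only state that there are three of each.
    """
    anchors = []
    for scale, ratio in product(scales, ratios):
        # Keep the area near scale**2 while setting height/width to ratio.
        w = scale / math.sqrt(ratio)
        h = scale * math.sqrt(ratio)
        anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def iou(a, b):
    """Intersection-over-union of two boxes (assumed overlap measure)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_anchors(anchors, gt_box, threshold=0.7):
    """Return (overlap, is_positive) per anchor; the threshold is hypothetical."""
    overlaps = [iou(a, gt_box) for a in anchors]
    return [(o, o > threshold) for o in overlaps]


def best_anchor(anchors, gt_box):
    """Select the anchor with the highest overlap against the ground-truth box."""
    return max(anchors, key=lambda a: iou(a, gt_box))


def shift_and_scale(anchor, dx, dy, dw, dh):
    """Shift an anchor's center and scale its size.

    ASSUMPTION: offsets are relative to anchor width/height and size
    changes are log-space factors, a common box-regression convention.
    """
    x1, y1, x2, y2 = anchor
    w, h = x2 - x1, y2 - y1
    cx = x1 + w / 2 + dx * w
    cy = y1 + h / 2 + dy * h
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)
    return (cx - new_w / 2, cy - new_h / 2, cx + new_w / 2, cy + new_h / 2)


anchors = make_anchors(300, 300)
ground_truth = (172, 172, 428, 428)  # hypothetical ground-truth box
labels = label_anchors(anchors, ground_truth)
refined = shift_and_scale(best_anchor(anchors, ground_truth), 0.0, 0.0, 0.0, 0.0)
```

With zero offsets, `shift_and_scale` returns the selected anchor unchanged; in a trained network the four offsets would be predicted values rather than constants.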

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

G06N3/084 (primary): Backpropagation, e.g. using gradient descent
G06F18/24: Classification techniques
G06N3/045: Combinations of networks
G06V10/454: Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
G06V10/25: Determination of region of interest [ROI] or a volume of interest [VOI]

Patent family

Related publications grouped by family.

View patent family 59314389

External sources

Related patents

Object detection using cascaded convolutional neural networks

Rapid object detection by combining structural information from image segmentation with bio-inspired attentional mechanisms

Convolutional-neural-network-based classifier and classifying method and training methods for the same
