What technology area does this patent fall under?

Primary CPC classification G06V10/454. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 1, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Object detection using deep neural networks

Patent metadata
Field	Value
Publication number	US-9275308-B2
Application number	US-201414288194-A
Country	US
Kind code	B2
Filing date	May 27, 2014
Priority date	May 31, 2013
Publication date	Mar 1, 2016
Grant date	Mar 1, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting objects in images. One of the methods includes receiving an input image. A full object mask is generated by providing the input image to a first deep neural network object detector that produces a full object mask for an object of a particular object type depicted in the input image. A partial object mask is generated by providing the input image to a second deep neural network object detector that produces a partial object mask for a portion of the object of the particular object type depicted in the input image. A bounding box is determined for the object in the image using the full object mask and the partial object mask.
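The last step of the abstract, choosing a bounding box from a full object mask and a partial object mask, can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: masks are plain lists of rows with values in [0, 1], the partial mask covers the bottom half of the object, the overlap measure is a soft intersection-over-union, and the candidate boxes are simply every rectangle in the mask grid. The function names `overlap`, `bottom_half`, and `best_box` are hypothetical.

```python
from itertools import product


def overlap(mask, box):
    """Soft intersection-over-union between a mask and a box.

    mask: list of rows of values in [0, 1].
    box: (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    inside = sum(mask[r][c] for r in range(top, bottom) for c in range(left, right))
    total = sum(sum(row) for row in mask)
    area = (bottom - top) * (right - left)
    union = area + total - inside
    return inside / union if union else 0.0


def bottom_half(box):
    """Return the lower half of a box (assumed partial region for this sketch)."""
    top, left, bottom, right = box
    return ((top + bottom) // 2, left, bottom, right)


def best_box(full_mask, bottom_mask):
    """Exhaustively score every candidate box and keep the highest-scoring one.

    The score adds the overlap of the whole box with the full mask to the
    overlap of the box's bottom half with the bottom partial mask.
    """
    height, width = len(full_mask), len(full_mask[0])
    best, best_score = None, float("-inf")
    for top, left in product(range(height), range(width)):
        # Boxes span at least two rows so that the bottom half is never empty.
        for bottom, right in product(range(top + 2, height + 1), range(left + 1, width + 1)):
            box = (top, left, bottom, right)
            score = overlap(full_mask, box) + overlap(bottom_mask, bottom_half(box))
            if score > best_score:
                best, best_score = box, score
    return best, best_score


# Toy 6x6 masks: the object occupies rows 1-4 and columns 2-3.
full = [[1 if 1 <= r < 5 and 2 <= c < 4 else 0 for c in range(6)] for r in range(6)]
bottom = [[1 if 3 <= r < 5 and 2 <= c < 4 else 0 for c in range(6)] for r in range(6)]
print(best_box(full, bottom))
```

On this toy input the search returns the box `(1, 2, 5, 4)`, which is the only candidate that matches both masks exactly. A real detector would score far fewer candidates than this brute-force loop does.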

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer-implemented method comprising: receiving, by one or more computers, an input image; generating, by one or more computers, a full object mask by providing the input image to a first deep neural network object detector that produces a full object mask for an object of a particular object type depicted in the input image, wherein the full object mask identifies regions of the input image that correspond to the object and regions of the input image that do not correspond to the object; generating, by one or more computers, a partial object mask by providing the input image to a second deep neural network object detector that produces a partial object mask for a portion of the object of the particular object type depicted in the input image; and determining, by one or more computers, a bounding box for the object in the image using the full object mask and the partial object mask.

2. The method of claim 1, wherein the portion of the object corresponds to the bottom portion, the top portion, the left portion, or the right portion of the object.

3. The method of claim 1, wherein the bounding box has a partial bounding box corresponding to the partial object mask, and wherein determining the bounding box for the object in the image using the full object mask and the partial object mask comprises determining a bounding box that has a best fit, among a plurality of candidate bounding boxes, of the full bounding box with the full object mask and the partial bounding box with the partial object mask.

4. The method of claim 1, wherein determining a bounding box for the object in the image using the full object mask and the partial object mask comprises: computing a score for each of a plurality of candidate bounding boxes based on a first measure of overlap between the bounding box and the full object mask and a second measure of overlap between the bounding box and the partial object mask; and determining a bounding box having a highest score.

5. The method of claim 4, wherein the score for a candidate bounding box bb is given by:

$$S(bb) = \sum_{h} \left( S(bb(h), m^{h}) - S(bb(\bar{h}), m^{\bar{h}}) \right)$$

wherein $bb(h)$ is a partial bounding box for a corresponding partial object mask $m^{h}$, wherein $bb(\bar{h})$ is an opposite partial bounding box for a corresponding opposite partial object mask $m^{\bar{h}}$, and $S(bb(h), m^{h})$ is a measure of overlap between the partial bounding box and the partial object mask.

6. The method of claim 4, further comprising: generating a second full object mask by providing a portion of the image corresponding to the bounding box to the first deep neural network object detector; generating a second partial object mask by providing the portion of the image corresponding to the bounding box to the second deep neural network object detector; computing a score for each of a second plurality of candidate bounding boxes based on a third measure of overlap between each bounding box and the second full object mask and a fourth measure of overlap between each bounding box and the second partial object mask; and determining a refined bounding box of the second plurality of candidate bounding boxes having a highest score.

7. The method of claim 1, further comprising: determining full and partial object masks for subwindows of each of multiple subwindow scales; and merging object masks determined at same values of the subwindow scales.

8. The method of claim 7, wherein merging the object masks determined at the multiple values of the scale s comprises averaging the object masks.

9. The method of claim 1, wherein determining full and partial object masks for subwindows of each of multiple subwindow scales comprises determining full and partial object masks for no more than 50 subwindows.

10. The method of claim 1, further comprising: generating a predicted object type by providing the input image to a third deep neural network classifier that produces a predicted object type for a portion of the image corresponding to the bounding box; determining that the predicted object type does not match the particular object type; and removing the bounding box from consideration.

11. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: receiving an input image; generating a full object mask by providing the input image to a first deep neural network object detector that produces a full object mask for an object of a particular object type depicted in the input image, wherein the full object mask identifies regions of the input image that correspond to the object and regions of the input image that do not correspond to the object; generating a partial object mask by providing the input image to a second deep neural network object detector that produces a partial object mask for a portion of the object of the particular object type depicted in the input image; and determining a bounding box for the object in the image using the full object mask and the partial object mask.

12. The system of claim 11, wherein the portion of the object corresponds to the bottom portion, the top portion, the left portion, or the right portion of the object.

13. The system of claim 11, wherein the bounding box has a partial bounding box corresponding to the partial object mask, and wherein determining t
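Claims 7 and 8 describe merging the masks produced for several subwindows at one scale by averaging them. A minimal sketch of that idea follows, under assumptions that are not in the claim text: each subwindow prediction arrives as a `(top, left, patch)` tuple already resized to the subwindow's pixel size, overlapping pixels are averaged per pixel, and pixels covered by no subwindow are left at 0.0. The function name `merge_masks` is hypothetical.

```python
def merge_masks(image_height, image_width, predictions):
    """Average per-subwindow masks into one image-sized mask.

    predictions: iterable of (top, left, patch), where patch is a list of rows
    of mask values positioned with its upper-left corner at (top, left).
    """
    sums = [[0.0] * image_width for _ in range(image_height)]
    counts = [[0] * image_width for _ in range(image_height)]

    # Accumulate every subwindow's values and remember how many covered each pixel.
    for top, left, patch in predictions:
        for dr, row in enumerate(patch):
            for dc, value in enumerate(row):
                sums[top + dr][left + dc] += value
                counts[top + dr][left + dc] += 1

    # Divide by the coverage count; uncovered pixels stay at 0.0.
    return [
        [s / n if n else 0.0 for s, n in zip(sum_row, count_row)]
        for sum_row, count_row in zip(sums, counts)
    ]


# Two overlapping 2x2 subwindows on a 2x3 image.
merged = merge_masks(2, 3, [
    (0, 0, [[1.0, 1.0], [0.0, 0.0]]),
    (0, 1, [[0.0, 1.0], [1.0, 1.0]]),
])
print(merged)
```

In this sketch the function would be called once per subwindow scale, giving one merged full mask and one merged partial mask per scale.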

Assignees

Google Inc

Inventors

Classifications

G06V10/454 (Primary)
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title
G06V30/194
References adjustable by an adaptive method, e.g. learning · CPC title
G06K9/66 (Primary)
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 53368888

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9275308B2 cover?: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting objects in images. One of the methods includes receiving an input image. A full object mask is generated by providing the input image to a first deep neural network object detector that produces a full object mask for an object of a particular object type depicted in the input image. A …
Who is the assignee on this patent?: Google Inc