What technology area does this patent fall under?

The primary CPC classification is G06V10/82 (using neural networks). Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 10, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Augmenting layer-based object detection with deep convolutional neural networks

US9542626B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9542626-B2
Application number	US-201615048757-A
Country	US
Kind code	B2
Filing date	Feb 19, 2016
Priority date	Sep 6, 2013
Publication date	Jan 10, 2017
Grant date	Jan 10, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

By way of example, the technology disclosed by this document receives image data; extracts a depth image and a color image from the image data; creates a mask image by segmenting the depth image; determines a first likelihood score from the depth image and the mask image using a layered classifier; determines a second likelihood score from the color image and the mask image using a deep convolutional neural network; and determines a class of at least a portion of the image data based on the first likelihood score and the second likelihood score. Further, the technology can pre-filter the mask image using the layered classifier and then use the pre-filtered mask image and the color image to calculate a second likelihood score using the deep convolutional neural network to speed up processing.
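To make the masking and pre-filtering steps in the abstract concrete, here is a minimal sketch in plain Python. It is not the patented implementation. The function names `copy_masked_pixels` and `prefilter_labels`, the black background fill, the integer component labels, and the `min_score` cutoff are all assumptions made for illustration; the text on this page does not specify them.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]


def copy_masked_pixels(
    color: List[List[Pixel]],
    mask: List[List[int]],
    label: int,
    background: Pixel = (0, 0, 0),  # assumed fill value, not from the patent
) -> List[List[Pixel]]:
    """Build an object image: keep color pixels where the mask equals `label`,
    and fill every other position with `background`."""
    return [
        [px if m == label else background for px, m in zip(color_row, mask_row)]
        for color_row, mask_row in zip(color, mask)
    ]


def prefilter_labels(layered_scores: Dict[int, float], min_score: float) -> List[int]:
    """Keep only the component labels whose layered-classifier score reaches
    `min_score`, so fewer components are passed on to the slower CNN stage."""
    return sorted(lbl for lbl, score in layered_scores.items() if score >= min_score)


# Hypothetical 2x3 color image and a mask with components labelled 1 and 2.
color_image = [
    [(10, 10, 10), (20, 20, 20), (30, 30, 30)],
    [(40, 40, 40), (50, 50, 50), (60, 60, 60)],
]
mask_image = [
    [0, 1, 1],
    [0, 0, 2],
]

# Hypothetical first-stage scores per component label.
kept = prefilter_labels({1: 0.8, 2: 0.1}, min_score=0.5)
object_images = {lbl: copy_masked_pixels(color_image, mask_image, lbl) for lbl in kept}
```

In this sketch only component 1 survives the pre-filter, so only one object image would be handed to the CNN, which is the kind of saving the abstract's "speed up processing" remark points at.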

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method for performing object recognition comprising: receiving image data; extracting a depth image and a color image from the image data; creating a mask image by segmenting the image data into a plurality of components; identifying objects within the plurality of components of the mask image; determining a first likelihood score from the depth image and the mask image using a layered classifier; determining a second likelihood score from the color image and the mask image by generating an object image by copying pixels from a first image of the components in the mask image and classifying the object image using a deep convolutional neural network (CNN); and determining a class for at least a portion of the image data based on the first likelihood score and the second likelihood score.
2. A computer-implemented method for performing object recognition comprising: receiving image data; creating a mask image by segmenting the image data into a plurality of components; determining a first likelihood score from the image data and the mask image using a layered classifier; determining a second likelihood score from the image data and the mask image using a deep convolutional neural network (CNN); and determining a class for at least a portion of the image data based on the first likelihood score and the second likelihood score.
3. The computer-implemented method of claim 2, wherein the determining the second likelihood score from the image data and the mask image using the deep CNN includes: extracting a first image from the image data; generating an object image by copying pixels from the first image of the components in the mask image; classifying the object image using the deep CNN; generating classification likelihood scores indicating probabilities of the object image belonging to different classes of the deep CNN; and generating the second likelihood score based on the classification likelihood scores.
4. The computer-implemented method of claim 3, wherein the first image is one of a color image, a depth image, and a combination of a color image and a depth image.
5. The computer-implemented method of claim 2, wherein determining the class of at least the portion of the image data includes: fusing the first likelihood score and the second likelihood score into an overall likelihood score; and responsive to satisfying a predetermined threshold with the overall likelihood score, classifying the at least the portion of the image data as representing a person using the overall likelihood score.
6. The computer-implemented method of claim 2, further comprising: extracting a depth image and a color image from the image data, wherein determining the first likelihood score from the image data and the mask image using the layered classifier includes determining the first likelihood score from the depth image and the mask image using the layered classifier, and determining the second likelihood score from the image data and the mask image using the deep CNN includes determining the second likelihood score from the color image and the mask image using the deep CNN.
7. The computer-implemented method of claim 2, wherein the deep CNN has a soft max layer as a final layer to generate the second likelihood score that the at least the portion of the image data represents a person.
8. The computer-implemented method of claim 2, further comprising: converting the first likelihood score and the second likelihood score into a first log likelihood value and a second log likelihood value; and calculating a combined likelihood score by using a weighted summation of the first log likelihood value and the second log likelihood value.
9. The computer-implemented method of claim 2, wherein the class is a person.
10. The computer-implemented method of claim 2, wherein determining the second likelihood score further comprises: determining the second likelihood score using the image data and the first likelihood score from the layered classifier.
11. A system for performing object recognition comprising: a processor; and a memory storing instructions that, when executed, cause the system to: create a mask image by segmenting image data into a plurality of components; determine a first likelihood score from the image data and the mask image using a layered classifier; determine a second likelihood score from the image data and the mask image using a deep convolutional neural network (CNN); and determine a class for at least a portion of the image data based on the first likelihood score and the second likelihood score.
12. The system of claim 11, wherein the instructions that cause the system to determine the second likelihood score from the image data and the mask image using the deep CNN further cause the system to: extract a first image from the image data; generate an object image by copying pixels from the first image of the components in the mask image; classify the object image using the deep CNN; generate classification likelihood scores indicating probabilities of the object image belonging to different classes of the deep CNN; and generate the second likelihood score based on the classification likelihood scores.
13. The system of claim 12, wherein the first image is one of a color image, a depth image, and a combination of a color image and a depth image.
14. The system of claim 11, wherein the instructions that cause the system to determine the class of at least the portion of the image data further cause the system to: fuse the first likelihood score and the second likelihood score into an overall likelihood score; and responsive to satisfying a predetermined threshold with the overall likelihood score, classify the at least the portion of the image data as representing a person using the overall likelihood score.
15. The system of claim 11, wherein the memory stores further instructions that cause the system to: extract a depth image and a color image from the image data, wherein determining the first likelihood score from the image data and the mask image using the layered classifier includes determining the first likelihood score from the depth image and the mask image using the layered classifier, and determining the second likelihood score from the image data and the mask image using the deep CNN includes determining the second likelihood score from the color image and the mask image using the deep CNN.
16. The system of claim 11, wherein the deep CNN has a soft max layer as a final layer to generate the second likelihood score that the at least the portion of the image data represents a person.
17. The system of claim 11, wherein the memory stores further instructions that cause the system to: convert the first likelihood score and the second likelihood score into a first log likelihood value and a second log likelihood value; and calculate a combined likelihood score by using a weighted summation of the first log likelihood value and the second log likelihood value.
18. The system of claim 11, wherein the class is a person.
19. The system of claim 11, wherein the instructions that cause the system to determine the second likelihood score further cause the system to: pre-filter the mask image using the layered classifier; and determine the second likelihood score using the image data and the pre-filtered mask image.
20. The system of claim 11, wherein the layered classifier determines the first like…
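The score-fusion steps recited in claims 5, 7, and 8 (and their system counterparts 14, 16, and 17) can be illustrated with a short sketch. This is a hedged example, not the patented method: the equal weights, the `eps` floor, the 0.5 threshold, the "greater than or equal" comparison, and all function names are assumptions, since the claims only say "weighted summation" and "predetermined threshold" without giving values.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax, as a CNN's final layer might compute it."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_log_likelihoods(
    p_layered: float,
    p_cnn: float,
    w_layered: float = 0.5,  # assumed weight, not from the patent
    w_cnn: float = 0.5,      # assumed weight, not from the patent
    eps: float = 1e-12,      # assumed floor to avoid log(0)
) -> float:
    """Convert both likelihood scores to log likelihoods and combine them
    with a weighted sum."""
    return w_layered * math.log(max(p_layered, eps)) + w_cnn * math.log(max(p_cnn, eps))


def is_person(p_layered: float, p_cnn: float, log_threshold: float = math.log(0.5)) -> bool:
    """Classify as 'person' when the fused score reaches the assumed threshold."""
    return fuse_log_likelihoods(p_layered, p_cnn) >= log_threshold


# Hypothetical two-class CNN output (person, not-person) turned into a probability.
p_cnn_person = softmax([2.0, 0.0])[0]
decision = is_person(p_layered=0.9, p_cnn=p_cnn_person)
```

With equal weights the fused value is the log of the geometric mean of the two scores, so one very low score pulls the result down sharply even when the other classifier is confident.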

Assignees

Toyota Motor Co Ltd

Inventors

Classifications

Code	CPC title
G06V10/84	using probabilistic graphical models from image or video features, e.g. Markov models or Bayesian networks
G06V10/82 (Primary)	using neural networks
G06V10/764	using classification, e.g. of video objects
G06F18/2431	Multiple classes
G06N7/01	Probabilistic graphical models, e.g. probabilistic networks

Patent family

Related publications grouped by family.

Patent family 59683210


Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9542626B2 cover?: By way of example, the technology disclosed by this document receives image data; extracts a depth image and a color image from the image data; creates a mask image by segmenting the depth image; determines a first likelihood score from the depth image and the mask image using a layered classifier; determines a second likelihood score from the color image and the mask image using a deep convolu…
Who is the assignee on this patent?: Toyota Motor Co Ltd

Related patents

Generic object detection in images

Identifying objects in images

Detection device, detection program, detection method, vehicle equipped with detection device, parameter calculation device, parameter calculating parameters, parameter calculation program, and method of calculating parameters

Method and system for anatomical object detection using marginal space deep neural networks

Automatic Detection Of Mitosis Using Handcrafted And Convolutional Neural Network Features

Regionlets with Shift Invariant Neural Patterns for Object Detection

Sequence transcription with deep neural networks
