Combining convolution and deconvolution for object detection

US10628705B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10628705-B2
Application number  US-201815940907-A
Country             US
Kind code           B2
Filing date         Mar 29, 2018
Priority date       Mar 29, 2018
Publication date    Apr 21, 2020
Grant date          Apr 21, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided are systems, methods, and computer-readable medium for operating a neural network. In various implementations, the neural network can receive an input image that includes an object to be identified. The neural network can generate a plurality of initial feature maps using convolution layers, wherein a first initial feature map is generated using the input image. The neural network can generate an up-sampled feature map using a de-convolution layer that takes an initial feature map as input, where the up-sampled feature map has a same resolution as the previous initial feature map. The neural network can combine the up-sampled feature map and the previous initial feature map, and use the combined feature map to more accurately identify the object.
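The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the 2x2 kernels, the stride of 2, the channel counts, the random weights, and the function names `conv2d_stride2` and `deconv2d_stride2` are all assumptions chosen for illustration, and activation functions are omitted for brevity.

```python
import numpy as np

def conv2d_stride2(x, kernels):
    """Strided convolution. x: (C_in, H, W); kernels: (C_out, C_in, 2, 2).
    A 2x2 kernel with stride 2 halves the spatial resolution."""
    c_out = kernels.shape[0]
    _, h, w = x.shape
    out = np.zeros((c_out, h // 2, w // 2))
    for i in range(h // 2):
        for j in range(w // 2):
            patch = x[:, 2 * i:2 * i + 2, 2 * j:2 * j + 2]  # (C_in, 2, 2)
            out[:, i, j] = np.tensordot(kernels, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def deconv2d_stride2(x, kernels):
    """Transposed convolution. x: (C_in, H, W); kernels: (C_in, C_out, 2, 2).
    Each input pixel writes a 2x2 block, doubling the spatial resolution."""
    c_out = kernels.shape[1]
    _, h, w = x.shape
    out = np.zeros((c_out, 2 * h, 2 * w))
    for i in range(h):
        for j in range(w):
            block = np.tensordot(x[:, i, j], kernels, axes=([0], [0]))  # (C_out, 2, 2)
            out[:, 2 * i:2 * i + 2, 2 * j:2 * j + 2] += block
    return out

rng = np.random.default_rng(0)           # random weights stand in for trained ones
image = rng.standard_normal((3, 16, 16)) # hypothetical 3-channel input image

# Convolution layers: the first map comes from the image, the second from the first.
fmap1 = conv2d_stride2(image, rng.standard_normal((8, 3, 2, 2)))
fmap2 = conv2d_stride2(fmap1, rng.standard_normal((16, 8, 2, 2)))

# De-convolution layer: up-sample the later map to the resolution of the earlier one.
upsampled = deconv2d_stride2(fmap2, rng.standard_normal((16, 8, 2, 2)))

# Combine the earlier map with the up-sampled map (concatenation is one option).
combined = np.concatenate([fmap1, upsampled], axis=0)

print(fmap1.shape, fmap2.shape, upsampled.shape, combined.shape)
```

Under these assumptions the up-sampled map matches the earlier map's 8x8 resolution, so the two can be combined channel by channel before any classification step.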

First claim

Opening claim text (preview).

What is claimed is:

1. A method for object identification, comprising: receiving an input image including a representation of an object to be identified; generating a plurality of feature maps using a plurality of convolution layers of a neural network, wherein a first feature map of the plurality of feature maps is generated from the input image, and wherein one or more feature maps of the plurality of feature maps are generated from one or more previous feature maps of the plurality of feature maps; generating, using a de-convolution layer of the neural network, an up-sampled feature map from a feature map of the one or more feature maps, wherein the up-sampled feature map has a same resolution as a previous feature map of the plurality of feature maps; combining the previous feature map and the up-sampled feature map to produce a combined feature map; and identifying the object represented in the input image using the combined feature map.
2. The method of claim 1, wherein the previous feature map and the up-sampled feature map are combined using a concatenation operation.
3. The method of claim 1, wherein the previous feature map and the up-sampled feature map are combined using a maximum value operation.
4. The method of claim 1, wherein the combined feature map has a greater depth than the previous feature map.
5. The method of claim 1, wherein the combined feature map has a same depth as the previous feature map.
6. The method of claim 1, wherein the up-sampled feature map is generated without using a rectified linear unit operation.
7. The method of claim 1, wherein the plurality of convolution layers perform a convolution to produce the plurality of feature maps.
8. The method of claim 1, wherein the de-convolution layer performs a deconvolution on the feature map to produce the up-sampled feature map.
9. The method of claim 1, wherein identifying the object further includes using a highest level feature map generated by the neural network.
10. The method of claim 1, wherein identifying the object further includes categorizing the combined feature map using at least one fully-connected layer, wherein each node in the at least one fully-connected layer outputs a weighted sum that indicates a probable category.
11. The method of claim 1, further comprising: generating, using a second de-convolution layer of the neural network, a second up-sampled feature map from the combined feature map, wherein the second up-sampled feature map has a same resolution as a second previous feature map from the plurality of feature maps; and combining the second previous feature map with the second up-sampled feature map to produce a second combined feature map, wherein identifying the object is further based on the second combined feature map.
12. The method of claim 1, wherein the previous feature map is generated using a convolution layer from the plurality of convolution layers that precedes a convolution layer from the plurality of convolution layers used to generate the feature map.
13. An apparatus, comprising: a memory configured to store an input image including a representation of an object to be identified; and a processor configured to: generate a plurality of feature maps using a plurality of convolution layers of a neural network, wherein a first feature map of the plurality of feature maps is generated from the input image, and wherein one or more feature maps of the plurality of feature maps are generated from one or more previous feature maps of the plurality of feature maps; generate, using a de-convolution layer of the neural network, an up-sampled feature map from a feature map of the one or more feature maps, wherein the up-sampled feature map has a same resolution as a previous feature map of the plurality of feature maps; combine the previous feature map and the up-sampled feature map to produce a combined feature map; and identify the object represented in the input image using the combined feature map.
14. The apparatus of claim 13, wherein the previous feature map and the up-sampled feature map are combined using a concatenation operation.
15. The apparatus of claim 13, wherein the previous feature map and the up-sampled feature map are combined using a maximum value operation.
16. The apparatus of claim 13, wherein the combined feature map has a greater depth than the previous feature map.
17. The apparatus of claim 13, wherein the combined feature map has a same depth as the previous feature map.
18. The apparatus of claim 13, wherein the up-sampled feature map is generated without using a rectified linear unit operation.
19. The apparatus of claim 13, wherein the plurality of convolution layers perform a convolution to produce the plurality of feature maps.
20. The apparatus of claim 13, wherein the de-convolution layer performs a deconvolution on the feature map to produce the up-sampled feature map.
21. The apparatus of claim 13, wherein identifying the object further includes using a highest level feature map generated by the neural network.
22. The apparatus of claim 13, wherein identifying the object further includes categorizing the combined feature map using at least one fully-connected layer, wherein each node in the at least one fully-connected layer outputs a weighted sum that indicates a probable category.
23. The apparatus of claim 13, wherein the processor is further configured to: generate, using a second de-convolution layer of the neural network, a second up-sampled feature map from the combined feature map, wherein the second up-sampled feature map has a same resolution as a second previous feature map from the plurality of feature maps; and combine the second previous feature map with the second up-sampled feature map to produce a second combined feature map, wherein identifying the object is further based on the second combined feature map.
24. The apparatus of claim 13, wherein the previous feature map is generated using a convolution layer from the plurality of convolution layers that precedes a convolution layer from the plurality of convolution layers used to generate the feature map.
25. A non-transitory computer-readable medium having stored thereon instructions that, when executed by a processor, cause the processor to perform operations including: receiving an input image including a representation of an object to be identified; generating a plurality of feature maps using a plurality of convolution layers of a neural network, wherein a first feature map of the plurality of feature maps is generated from the input image, and wherein one or more feature maps of the plurality of initial feature maps are generated from one or more previous feature maps of the plurality of feature maps; generating, using a de-convolution layer of the neural network, an up-sampled feature map from a feature map of the one or more feature maps, wherein the up-sampled feature map has a same resolution as a previous feature map of the plurality of feature maps; combining the previous feature map and the up-sampled feature map to produce a combined feature map; and identifying the object represented in the input image using the combined feature map.
26. The non-transitory computer-readable medium of claim 25, wherein the previous feature map and the up-sampled feature map are combined using a concatenation operation.
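The dependent claims name two ways to combine the maps (concatenation and a maximum value operation) and two outcomes for depth (greater, or the same). The sketch below shows one plausible reading, assuming channel-first arrays, equal channel counts for the two maps, and random placeholder weights. The helper names `combine_concat`, `combine_max`, and `category_scores` are hypothetical, and using `argmax` to pick a category is an illustrative choice rather than something the claims state.

```python
import numpy as np

def combine_concat(previous, upsampled):
    """Stack both maps along the channel axis, so the result is deeper."""
    return np.concatenate([previous, upsampled], axis=0)

def combine_max(previous, upsampled):
    """Keep the element-wise maximum, so the depth is unchanged."""
    return np.maximum(previous, upsampled)

def category_scores(combined, weights):
    """One fully-connected layer: each output node is a weighted sum of all inputs."""
    return weights @ combined.ravel()

rng = np.random.default_rng(0)
previous = rng.standard_normal((4, 8, 8))   # earlier feature map (assumed shape)
upsampled = rng.standard_normal((4, 8, 8))  # de-convolution output at the same resolution

by_concat = combine_concat(previous, upsampled)  # depth 8: greater than the previous map
by_max = combine_max(previous, upsampled)        # depth 4: same as the previous map

num_categories = 3                               # hypothetical number of object classes
weights = rng.standard_normal((num_categories, by_max.size))
scores = category_scores(by_max, weights)        # one weighted sum per category
predicted = int(np.argmax(scores))               # index of the highest-scoring category

print(by_concat.shape, by_max.shape, scores.shape, predicted)
```

Concatenation keeps every value from both maps at the cost of a deeper tensor for later layers, while the maximum keeps the tensor size fixed but discards the smaller value at each position.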

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10628705B2 cover?
Provided are systems, methods, and computer-readable medium for operating a neural network. In various implementations, the neural network can receive an input image that includes an object to be identified. The neural network can generate a plurality of initial feature maps using convolution layers, wherein a first initial feature map is generated using the input image. The neural network c…
Who is the assignee on this patent?
Qualcomm Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/084. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 21, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).