Convolutional neural network for object detection

US11100352B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11100352-B2
Application number  US-201916576278-A
Country             US
Kind code           B2
Filing date         Sep 19, 2019
Priority date       Oct 16, 2018
Publication date    Aug 24, 2021
Grant date          Aug 24, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed is a computer-readable medium including a program code that, when executed by processing circuitry, causes the processing circuitry to generate a feature map from an input image, to extract a region of interest from the feature map, and to generate a predicted mask based on the region of interest. The processing circuitry may use a predicted mask and a real mask to learn a convolutional neural network system. The real mask includes first pixels corresponding to the real boundary and second pixels corresponding to a fake boundary adjacent to the real boundary.
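The real mask described in the abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the patented implementation: the object is assumed to be given as a filled binary region (a hypothetical input format), the real boundary is taken to be the object's outermost pixels, and a one-pixel fake boundary is marked on both the outer and the inner side. The function name `build_real_mask` and the pixel values 1.0 and 0.5 are assumptions chosen for illustration only.

```python
def build_real_mask(region, real_value=1.0, fake_value=0.5):
    """Build a real mask from a filled binary object region.

    region: 2D list of 0/1 values marking the object (hypothetical input format).
    Returns a 2D list where real-boundary pixels hold real_value, pixels
    touching the real boundary hold fake_value, and all other pixels hold 0.0.
    """
    h, w = len(region), len(region[0])
    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connectivity (assumption)

    def inside(y, x):
        # True only for in-image pixels that belong to the object.
        return 0 <= y < h and 0 <= x < w and region[y][x] == 1

    # Real boundary: object pixels with at least one neighbour outside the object.
    real = {
        (y, x)
        for y in range(h)
        for x in range(w)
        if region[y][x] == 1
        and any(not inside(y + dy, x + dx) for dy, dx in neighbours)
    }

    mask = [[0.0] * w for _ in range(h)]
    for y, x in real:
        mask[y][x] = real_value

    # Fake boundary: in-image pixels adjacent to the real boundary,
    # on the outer side (background) or the inner side (object interior).
    for y, x in real:
        for dy, dx in neighbours:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and (ny, nx) not in real:
                mask[ny][nx] = fake_value
    return mask


if __name__ == "__main__":
    # A 5x5 square object centred in a 7x7 image.
    square = [[1 if 1 <= y <= 5 and 1 <= x <= 5 else 0 for x in range(7)]
              for y in range(7)]
    for row in build_real_mask(square):
        print(" ".join(f"{v:.1f}" for v in row))
```

Passing the same number for `real_value` and `fake_value` gives a mask in which real and fake boundary pixels are indistinguishable, which corresponds to the variant where both pixel values are the same.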

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-readable medium including a program code that, when executed by processing circuitry, causes the processing circuitry to: perform a convolution operation on an input image to generate a feature map; extract a region of interest based on an objectness score associated with an existence of an object from the feature map; align the extracted region of interest to a region of interest having a reference size; determine a class of the object and position information of the object on the input image based on the aligned region of interest; form a boundary encompassing the object on the input image based on a result of the determination; and learn a convolutional neural network system based on a predicted mask based on the class, the position information, the boundary, and/or a real mask based on a real boundary of the object of the input image, wherein the real mask includes first pixels corresponding to the real boundary and second pixels corresponding to a fake boundary adjacent to the real boundary.

2. The computer-readable medium of claim 1, wherein a pixel value of the first pixels is greater than a pixel value of the second pixels.

3. The computer-readable medium of claim 1, wherein a pixel value of the first pixels is the same as a pixel value of the second pixels.

4. The computer-readable medium of claim 1, wherein the second pixels are adjacent to an outer side of the real boundary composed of the first pixels, wherein the fake boundary is a first fake boundary, and wherein the real mask further includes third pixels corresponding to the first fake boundary adjacent to an inner side of the real boundary.

5. The computer-readable medium of claim 1, wherein the second pixels are adjacent to an inner side of the real boundary composed of the first pixels, wherein the fake boundary is a first fake boundary, and wherein the real mask further includes third pixels corresponding to the first fake boundary adjacent to an outer side of the real boundary.

6. The computer-readable medium of claim 1, wherein the program code, when executed by the processing circuitry, causes the processing circuitry to further search the feature map in a window sliding manner by using a plurality of anchors, in the extracting of the region of interest.

7. The computer-readable medium of claim 1, wherein the program code, when executed by the processing circuitry, causes the processing circuitry to further perform a fully-connected operation on the aligned region of interest in the forming of the boundary, and wherein the class and the position information of the object are generated based on a result of the fully-connected operation.

8. The computer-readable medium of claim 1, wherein a thickness of the fake boundary is greater than a thickness of the real boundary.

9. The computer-readable medium of claim 1, wherein the program code, when executed by the processing circuitry, causes the processing circuitry to learn the convolutional neural network system through a backpropagation, based on error information that is based on the predicted mask and the real mask.

10. The computer-readable medium of claim 1, wherein the program code, when executed by the processing circuitry, causes the processing circuitry to perform the determining of the class of the object, the determining of the position information of the object, and the forming of the boundary in parallel.

11. A computer-readable medium including a program code that, when executed by processing circuitry, causes the processing circuitry to: perform a convolution operation on an input image to generate a feature map; extract a region of interest based on an objectness score associated with an existence of an object from the feature map; align the extracted region of interest to a region of interest having a reference size; determine a class of the object and position information of the object on the input image based on the aligned region of interest; form a boundary encompassing the object on the input image based on a result of the determination; and learn a convolutional neural network system based on a predicted mask based on the class, the position information, and the boundary, and a real mask including a real bounding box encompassing the object of the input image, wherein the real mask includes first pixels corresponding to the real bounding box and second pixels corresponding to a fake bounding box adjacent to the real bounding box.

12. The computer-readable medium of claim 11, wherein the second pixels are adjacent to an outer side of the real bounding box composed of the first pixels, wherein the fake bounding box is a first fake bounding box, and wherein the real mask further includes third pixels corresponding to the first fake bounding box adjacent to an inner side of the real bounding box.

13. The computer-readable medium of claim 12, wherein a pixel value of the first pixels, a pixel value of the second pixels, and a pixel value of the third pixels are the same.

14. The computer-readable medium of claim 12, wherein a pixel value of the first pixels is greater than a pixel value of the second pixels and a pixel value of the third pixels.

15. The computer-readable medium of claim 12, wherein the second pixels are adjacent to an inner side of the real bounding box composed of the first pixels, wherein the fake bounding box is a first fake bounding box, and wherein the real mask further includes third pixels corresponding to the first fake bounding box adjacent to an outer side of the real bounding box.

16. A convolutional neural network system comprising: processing circuitry configured to, perform a convolution operation on an input image to generate a feature map; extract a region of interest based on an objectness score associated with an existence of an object from the feature map; align the extracted region of interest to a region of interest having a reference size; determine a class of the object, based on the aligned region of interest; determine position information of the object on the input image, based on the aligned region of interest; and form a boundary encompassing the object on the input image, and learn based on a predicted mask based on the class, the position information, the boundary, and/or a real mask based on a real boundary of the object of the input image, wherein the real mask includes first pixels corresponding to the real boundary and second pixels corresponding to a fake boundary adjacent to the real boundary.

17. The convolutional neural network system of claim 16, wherein the processing circuitry is configured to extract the region of interest using a region proposal network (RPN).

18. The convolutional neural network system of claim 16, wherein the processing circuitry is configured to determine the class of the object and the position information of the object on the input image using a plurality of fully-connected networks.

19. The convolutional neural network system of claim 16, wherein the second pixels are adjacent to an outer side of the real boundary composed of the first pixels, wherein a fake boundary is a first fake boundary, and wherein the real mask further includes third pixels corresponding to the first fake boundary adjacent to an inner side of the real boundary.

20. The convolutional neural network system of claim 19, wherein a pixel value of the first pixels, a pixel value of the second pixels, and a pixel value of the third pi
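The bounding-box variant of the real mask, and the error between a predicted mask and a real mask that drives backpropagation, can be sketched as follows. This is an illustrative sketch only. The claims do not name an error function, so mean squared error is used here purely as an assumption, and the function names, the one-pixel box thickness, and the pixel values 1.0 and 0.5 are hypothetical choices.

```python
def draw_outline(mask, top, left, bottom, right, value):
    """Write `value` on the one-pixel outline of a rectangle, clipped to the image."""
    h, w = len(mask), len(mask[0])
    for y in range(max(top, 0), min(bottom, h - 1) + 1):
        for x in range(max(left, 0), min(right, w - 1) + 1):
            if y in (top, bottom) or x in (left, right):
                mask[y][x] = value


def build_box_mask(h, w, box, real_value=1.0, fake_value=0.5):
    """Real mask with a real bounding box plus outer and inner fake boxes.

    box: (top, left, bottom, right) in inclusive pixel coordinates (assumed format).
    """
    top, left, bottom, right = box
    mask = [[0.0] * w for _ in range(h)]
    # Outer fake box: one pixel outside the real box.
    draw_outline(mask, top - 1, left - 1, bottom + 1, right + 1, fake_value)
    # Inner fake box: one pixel inside, only if the real box has an interior.
    if bottom - top >= 2 and right - left >= 2:
        draw_outline(mask, top + 1, left + 1, bottom - 1, right - 1, fake_value)
    # Real box is drawn last so its pixels are never overwritten.
    draw_outline(mask, top, left, bottom, right, real_value)
    return mask


def mask_error(predicted, real):
    """Mean squared error over all pixels (assumed error function)."""
    n = len(real) * len(real[0])
    total = sum(
        (p - r) ** 2
        for pred_row, real_row in zip(predicted, real)
        for p, r in zip(pred_row, real_row)
    )
    return total / n


if __name__ == "__main__":
    real = build_box_mask(9, 9, (2, 2, 6, 6))
    empty_prediction = [[0.0] * 9 for _ in range(9)]
    print(mask_error(empty_prediction, real))
```

Under these assumed values, a predicted pixel of 1.0 that lands one step off the real box differs from its target by 0.5, while the same pixel on plain background differs by 1.0. In a trained system the error would be propagated back through the network; that step is outside the scope of this sketch.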

Assignees

Samsung Electronics Co Ltd, Univ Ajou Ind Academic Coop Found

Inventors

Classifications

  • G06T7/12 (Primary)

    Edge-based segmentation · CPC title

  • using neural networks · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title

  • Activation functions · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11100352B2 cover?
Disclosed is a computer-readable medium including a program code that, when executed by processing circuitry, causes the processing circuitry to generate a feature map from an input image, to extract a region of interest from the feature map, and to generate a predicted mask based on the region of interest. The processing circuitry may use a predicted mask and a real mask to learn a convolution…
Who is the assignee on this patent?
Samsung Electronics Co Ltd, Univ Ajou Ind Academic Coop Found
What technology area does this patent fall under?
The primary CPC classification is G06T7/12 (edge-based segmentation). Mapped technology areas include Physics.
When was this patent published?
It was published on Aug 24, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).