Optimizations for Dynamic Object Instance Detection, Segmentation, and Structure Mapping
US-2019172223-A1 · Jun 6, 2019 · US
US11100352B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11100352-B2 |
| Application number | US-201916576278-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 19, 2019 |
| Priority date | Oct 16, 2018 |
| Publication date | Aug 24, 2021 |
| Grant date | Aug 24, 2021 |
Official abstract text for this publication.
Disclosed is a computer-readable medium including a program code that, when executed by processing circuitry, causes the processing circuitry to generate a feature map from an input image, to extract a region of interest from the feature map, and to generate a predicted mask based on the region of interest. The processing circuitry may use a predicted mask and a real mask to learn a convolutional neural network system. The real mask includes first pixels corresponding to the real boundary and second pixels corresponding to a fake boundary adjacent to the real boundary.
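The real mask described in the abstract can be illustrated with a small, dependency-free sketch. Assuming the object is given as a binary grid, the hypothetical helper `build_real_mask` marks object pixels that touch the background as the real boundary (first pixels) and the background pixels directly adjacent to them as an outer fake boundary (second pixels). The function name, the 4-neighbour adjacency rule, and the pixel values 1.0 and 0.5 are illustrative assumptions, not taken from the patent text.

```python
def neighbours4(r, c):
    """Return the up, down, left, and right neighbours of pixel (r, c)."""
    return ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))


def build_real_mask(obj, real_value=1.0, fake_value=0.5):
    """Build a training mask from a binary object grid (list of lists).

    Hypothetical sketch: real boundary pixels get `real_value`, and
    background pixels 4-adjacent to them (an outer fake boundary) get
    `fake_value`. All other pixels stay 0.0.
    """
    h, w = len(obj), len(obj[0])

    def is_object(r, c):
        # Pixels outside the image are treated as background.
        return 0 <= r < h and 0 <= c < w and bool(obj[r][c])

    # Real boundary: object pixels with at least one background neighbour.
    real = {
        (r, c)
        for r in range(h)
        for c in range(w)
        if is_object(r, c)
        and not all(is_object(nr, nc) for nr, nc in neighbours4(r, c))
    }

    mask = [[0.0] * w for _ in range(h)]
    for r, c in real:
        mask[r][c] = real_value
        # Outer fake boundary: in-image background pixels next to the real one.
        for nr, nc in neighbours4(r, c):
            if 0 <= nr < h and 0 <= nc < w and not obj[nr][nc]:
                mask[nr][nc] = fake_value
    return mask


# Example: a 3x3 object centred in a 7x7 image.
obj = [[1 if 2 <= r <= 4 and 2 <= c <= 4 else 0 for c in range(7)] for r in range(7)]
real_mask = build_real_mask(obj)
```

Passing `fake_value=1.0` gives the variant in which the first and second pixels share the same value, which in effect thickens the boundary in the mask. An inner fake boundary could be added the same way by also marking non-boundary object pixels adjacent to the real boundary.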
Opening claim text (preview).
What is claimed is:
1. A computer-readable medium including a program code that, when executed by processing circuitry, causes the processing circuitry to: perform a convolution operation on an input image to generate a feature map; extract a region of interest based on an objectness score associated with an existence of an object from the feature map; align the extracted region of interest to a region of interest having a reference size; determine a class of the object and position information of the object on the input image based on the aligned region of interest; form a boundary encompassing the object on the input image based on a result of the determination; and learn a convolutional neural network system based on a predicted mask based on the class, the position information, the boundary, and/or a real mask based on a real boundary of the object of the input image, wherein the real mask includes first pixels corresponding to the real boundary and second pixels corresponding to a fake boundary adjacent to the real boundary.
2. The computer-readable medium of claim 1, wherein a pixel value of the first pixels is greater than a pixel value of the second pixels.
3. The computer-readable medium of claim 1, wherein a pixel value of the first pixels is the same as a pixel value of the second pixels.
4. The computer-readable medium of claim 1, wherein the second pixels are adjacent to an outer side of the real boundary composed of the first pixels, wherein the fake boundary is a first fake boundary, and wherein the real mask further includes third pixels corresponding to the first fake boundary adjacent to an inner side of the real boundary.
5. The computer-readable medium of claim 1, wherein the second pixels are adjacent to an inner side of the real boundary composed of the first pixels, wherein the fake boundary is a first fake boundary, and wherein the real mask further includes third pixels corresponding to the first fake boundary adjacent to an outer side of the real boundary.
6. The computer-readable medium of claim 1, wherein the program code, when executed by the processing circuitry, causes the processing circuitry to further search the feature map in a window sliding manner by using a plurality of anchors, in the extracting of the region of interest.
7. The computer-readable medium of claim 1, wherein the program code, when executed by the processing circuitry, causes the processing circuitry to further perform a fully-connected operation on the aligned region of interest in the forming of the boundary, and wherein the class and the position information of the object are generated based on a result of the fully-connected operation.
8. The computer-readable medium of claim 1, wherein a thickness of the fake boundary is greater than a thickness of the real boundary.
9. The computer-readable medium of claim 1, wherein the program code, when executed by the processing circuitry, causes the processing circuitry to learn the convolutional neural network system through a backpropagation, based on error information that is based on the predicted mask and the real mask.
10. The computer-readable medium of claim 1, wherein the program code, when executed by the processing circuitry, causes the processing circuitry to perform the determining of the class of the object, the determining of the position information of the object, and the forming of the boundary in parallel.
11. A computer-readable medium including a program code that, when executed by processing circuitry, causes the processing circuitry to: perform a convolution operation on an input image to generate a feature map; extract a region of interest based on an objectness score associated with an existence of an object from the feature map; align the extracted region of interest to a region of interest having a reference size; determine a class of the object and position information of the object on the input image based on the aligned region of interest; form a boundary encompassing the object on the input image based on a result of the determination; and learn a convolutional neural network system based on a predicted mask based on the class, the position information, and the boundary, and a real mask including a real bounding box encompassing the object of the input image, wherein the real mask includes first pixels corresponding to the real bounding box and second pixels corresponding to a fake bounding box adjacent to the real bounding box.
12. The computer-readable medium of claim 11, wherein the second pixels are adjacent to an outer side of the real bounding box composed of the first pixels, wherein the fake bounding box is a first fake bounding box, and wherein the real mask further includes third pixels corresponding to the first fake bounding box adjacent to an inner side of the real bounding box.
13. The computer-readable medium of claim 12, wherein a pixel value of the first pixels, a pixel value of the second pixels, and a pixel value of the third pixels are the same.
14. The computer-readable medium of claim 12, wherein a pixel value of the first pixels is greater than a pixel value of the second pixels and a pixel value of the third pixels.
15. The computer-readable medium of claim 12, wherein the second pixels are adjacent to an inner side of the real bounding box composed of the first pixels, wherein the fake bounding box is a first fake bounding box, and wherein the real mask further includes third pixels corresponding to the first fake bounding box adjacent to an outer side of the real bounding box.
16. A convolutional neural network system comprising: processing circuitry configured to, perform a convolution operation on an input image to generate a feature map; extract a region of interest based on an objectness score associated with an existence of an object from the feature map; align the extracted region of interest to a region of interest having a reference size; determine a class of the object, based on the aligned region of interest; determine position information of the object on the input image, based on the aligned region of interest; and form a boundary encompassing the object on the input image, and learn based on a predicted mask based on the class, the position information, the boundary, and/or a real mask based on a real boundary of the object of the input image, wherein the real mask includes first pixels corresponding to the real boundary and second pixels corresponding to a fake boundary adjacent to the real boundary.
17. The convolutional neural network system of claim 16, wherein the processing circuitry is configured to extract the region of interest using a region proposal network (RPN).
18. The convolutional neural network system of claim 16, wherein the processing circuitry is configured to determine the class of the object and the position information of the object on the input image using a plurality of fully-connected networks.
19. The convolutional neural network system of claim 16, wherein the second pixels are adjacent to an outer side of the real boundary composed of the first pixels, wherein a fake boundary is a first fake boundary, and wherein the real mask further includes third pixels corresponding to the first fake boundary adjacent to an inner side of the real boundary.
20. The convolutional neural network system of claim 19, wherein a pixel value of the first pixels, a pixel value of the second pixels, and a pixel value of the third pi
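Claim 6 recites searching the feature map in a sliding-window manner using a plurality of anchors. A minimal sketch of that idea, in the style commonly used with region proposal networks, places a fixed set of boxes at every feature-map cell. The helper name `generate_anchors` and the default stride, scales, and aspect ratios are hypothetical choices for illustration; the claims do not specify them.

```python
from itertools import product
from math import sqrt


def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(64, 128), ratios=(0.5, 1.0, 2.0)):
    """Return anchor boxes (x1, y1, x2, y2) in input-image coordinates.

    Hypothetical sketch: one anchor per (scale, ratio) pair is centred on
    every feature-map cell, which mimics sliding a window over the map.
    `ratio` is height / width and each anchor has area scale ** 2.
    """
    anchors = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Centre of this feature-map cell, projected back to the image.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            w = scale / sqrt(ratio)
            h = scale * sqrt(ratio)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


# Example: a 2x3 feature map yields 2 * 3 * 6 anchors.
anchors = generate_anchors(2, 3)
```

In a full system each anchor would then receive an objectness score, and the highest-scoring ones would be kept as regions of interest; that scoring step is outside this sketch.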
Edge-based segmentation · CPC title
using neural networks · CPC title
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title
Determination of region of interest [ROI] or a volume of interest [VOI] · CPC title
Activation functions · CPC title