Optimizations for Dynamic Object Instance Detection, Segmentation, and Structure Mapping

US2019171903A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2019171903-A1
Application number  US-201815971930-A
Country             US
Kind code           A1
Filing date         May 4, 2018
Priority date       Dec 3, 2017
Publication date    Jun 6, 2019
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment, a system may access an image and generate a feature map for the image using a neural network. The system may identify regions of interest in the feature map. Regional feature maps may be generated for the regions of interest, respectively. Each of the regional feature maps has a first, a second, and a third dimension. The system may generate a first combined regional feature map by combining the regional feature maps. The combined regional feature map has a first, a second, and a third dimension. The system may generate a second combined regional feature map by processing the first combined regional feature map using one or more convolutional layers. The system may generate, for each of the regions of interest, information associated with an object instance based on a portion of the second combined regional feature map associated with that region of interest.
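The data flow in the abstract can be illustrated with a small sketch. This is not the patent's implementation: it assumes each regional feature map is a channels x height x width nested list, that the maps are combined along the width axis, and that a 1x1 (pointwise) convolution stands in for the "one or more convolutional layers". The helper names (`combine_along_width`, `pointwise_conv`, `region_slice`), the sample values, and the weights are all hypothetical.

```python
def shape(t):
    """Return (channels, height, width) of a nested-list tensor."""
    return (len(t), len(t[0]), len(t[0][0]))


def combine_along_width(maps):
    """Concatenate equally sized C x H x W maps into one C x H x (N*W) tensor."""
    c, h, _ = shape(maps[0])
    return [
        [[v for m in maps for v in m[ch][row]] for row in range(h)]
        for ch in range(c)
    ]


def pointwise_conv(t, weights):
    """1x1 convolution: mix channels at each spatial position (no bias)."""
    c, h, w = shape(t)
    return [
        [[sum(wo[ci] * t[ci][y][x] for ci in range(c)) for x in range(w)]
         for y in range(h)]
        for wo in weights
    ]


def region_slice(t, index, width):
    """Take the portion of the combined tensor that belongs to one region."""
    start = index * width
    return [[row[start:start + width] for row in chan] for chan in t]


# Three made-up regional feature maps, each 2 channels x 2 rows x 3 columns.
regional_maps = [
    [[[float(n * 100 + ch * 10 + y * 3 + x) for x in range(3)]
      for y in range(2)]
     for ch in range(2)]
    for n in range(3)
]
weights = [[1.0, 0.0], [0.5, 0.5]]  # made-up 2-in, 2-out channel mix

# "First combined regional feature map": still a three-dimensional tensor.
first_combined = combine_along_width(regional_maps)
# "Second combined regional feature map": one pass over all regions at once.
second_combined = pointwise_conv(first_combined, weights)
# Per-region portions, from which per-instance outputs would be derived.
per_region = [region_slice(second_combined, i, 3)
              for i in range(len(regional_maps))]
```

Because a 1x1 convolution only looks at one spatial position at a time, each slice of the combined result matches what the same layer would produce on that region alone. Kernels wider than 1 need extra care at the seams between regions, which is what the padding in claims 5 and 6 addresses.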

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising, by a computing system: accessing an image; generating a feature map for the image using a neural network; identifying a plurality of regions of interest in the feature map; generating a plurality of regional feature maps for the plurality of regions of interest, respectively, wherein each of the plurality of regional feature maps has a first dimension, a second dimension, and a third dimension; generating a first combined regional feature map by combining the plurality of regional feature maps, wherein the combined regional feature map has a first dimension, a second dimension, and a third dimension; generating a second combined regional feature map by processing the first combined regional feature map using one or more convolutional layers; and generating, for each of the plurality of regions of interest, information associated with an object instance based on a portion of the second combined regional feature map associated with that region of interest.
2. The method of claim 1, wherein the first dimension and the second dimension of the first combined regional feature map are equal to the first dimension and the second dimension of each of the plurality of regional feature maps, respectively.
3. The method of claim 2, wherein the third dimension of the first combined regional feature map is equal to or larger than a combination of the respective third dimensions of the plurality of regional feature maps.
4. The method of claim 3, wherein the third dimension of each of the plurality of regional feature maps corresponds to height size or width size; wherein the first dimension or the second dimension corresponds to channel size.
5. The method of claim 1, wherein the first combined regional feature map includes the plurality of regional feature maps with paddings inserted between adjacent pairs of the plurality of regional feature maps.
6. The method of claim 5, wherein a size of the padding between each adjacent pair of the plurality of regional feature maps is at least as wide as a kernel size used by the one or more convolutional layers.
7. The method of claim 1, wherein the processing of the first combined regional feature map is performed using a neural processing engine configured for performing convolutional operations on three-dimensional tensors.
8. The method of claim 1, wherein the information associated with the object instance is an instance segmentation mask, a keypoint mask, or a bounding box.
9. A system comprising: one or more processors and one or more computer-readable non-transitory storage media coupled to one or more of the processors, the one or more computer-readable non-transitory storage media comprising instructions operable when executed by one or more of the processors to cause the system to perform operations comprising: accessing an image; generating a feature map for the image using a neural network; identifying a plurality of regions of interest in the feature map; generating a plurality of regional feature maps for the plurality of regions of interest, respectively, wherein each of the plurality of regional feature maps has a first dimension, a second dimension, and a third dimension; generating a first combined regional feature map by combining the plurality of regional feature maps, wherein the combined regional feature map has a first dimension, a second dimension, and a third dimension; generating a second combined regional feature map by processing the first combined regional feature map using one or more convolutional layers; and generating, for each of the plurality of regions of interest, information associated with an object instance based on a portion of the second combined regional feature map associated with that region of interest.
10. The system of claim 9, wherein the first dimension and the second dimension of the first combined regional feature map are equal to the first dimension and the second dimension of each of the plurality of regional feature maps, respectively.
11. The system of claim 10, wherein the third dimension of the first combined regional feature map is equal to or larger than a combination of the respective third dimensions of the plurality of regional feature maps.
12. The system of claim 11, wherein the third dimension of each of the plurality of regional feature maps corresponds to height size or width size; and wherein the first dimension or the second dimension corresponds to channel size.
13. The system of claim 9, wherein the first combined regional feature map includes the plurality of regional feature maps with paddings inserted between adjacent pairs of the plurality of regional feature maps.
14. The system of claim 13, wherein a size of the padding between each adjacent pair of the plurality of regional feature maps is at least as wide as a kernel size used by the one or more convolutional layers.
15. The system of claim 9, wherein the processing of the first combined regional feature map is performed using a neural processing engine configured for performing convolutional operations on three-dimensional tensors.
16. One or more computer-readable non-transitory storage media embodying software that is operable when executed to cause one or more processors to perform operations comprising: accessing an image; generating a feature map for the image using a neural network; identifying a plurality of regions of interest in the feature map; generating a plurality of regional feature maps for the plurality of regions of interest, respectively, wherein each of the plurality of regional feature maps has a first dimension, a second dimension, and a third dimension; generating a first combined regional feature map by combining the plurality of regional feature maps, wherein the combined regional feature map has a first dimension, a second dimension, and a third dimension; generating a second combined regional feature map by processing the first combined regional feature map using one or more convolutional layers; and generating, for each of the plurality of regions of interest, information associated with an object instance based on a portion of the second combined regional feature map associated with that region of interest.
17. The media of claim 16, wherein the first dimension and the second dimension of the first combined regional feature map are equal to the first dimension and the second dimension of each of the plurality of regional feature maps, respectively.
18. The media of claim 17, wherein the third dimension of the first combined regional feature map is equal to or larger than a combination of the respective third dimensions of the plurality of regional feature maps.
19. The media of claim 18, wherein the third dimension of each of the plurality of regional feature maps corresponds to height size or width size; wherein the first dimension or the second dimension corresponds to channel size.
20. The media of claim 16, wherein the processing of the first combined regional feature map is performed using a neural processing engine configured for performing convolutional operations on three-dimensional tensors.
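The padding recited in claims 5 and 6 can be motivated with a one-dimensional sketch: when regions are laid side by side and convolved in one pass, a kernel wider than one element would otherwise mix values from neighbouring regions at the seams. The code below is an illustration under simplifying assumptions (a single channel, a single zero-padded "same" convolution with no bias, an odd kernel length, zero-valued padding), not the patent's implementation; `conv1d_same`, `combine_with_padding`, and the sample data are hypothetical.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 'same' 1D correlation; kernel length is assumed odd."""
    r = len(kernel) // 2
    padded = [0.0] * r + list(signal) + [0.0] * r
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def combine_with_padding(regions, pad):
    """Concatenate regions along one axis with `pad` zeros between neighbours.

    Returns the combined row and the start offset of each region in it.
    """
    combined, offsets = [], []
    for idx, region in enumerate(regions):
        if idx > 0:
            combined.extend([0.0] * pad)
        offsets.append(len(combined))
        combined.extend(region)
    return combined, offsets


def split(row, offsets, regions):
    """Cut the per-region portions back out of a combined row."""
    return [row[o:o + len(r)] for o, r in zip(offsets, regions)]


kernel = [1.0, 2.0, 1.0]  # made-up 3-wide kernel
regions = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 1.0, 2.0, 3.0],
]

# Reference: convolve every region on its own.
separate = [conv1d_same(r, kernel) for r in regions]

# Combined pass with padding as wide as the kernel (as in claim 6).
combined, offsets = combine_with_padding(regions, pad=len(kernel))
batched = split(conv1d_same(combined, kernel), offsets, regions)

# Combined pass with no padding: neighbouring regions leak into each other.
unpadded, offsets0 = combine_with_padding(regions, pad=0)
leaky = split(conv1d_same(unpadded, kernel), offsets0, regions)

print(batched == separate)  # padded combined pass matches per-region results
print(leaky == separate)    # unpadded pass differs at region boundaries
```

In this single-layer setting the padded pass reproduces the per-region results exactly, while the unpadded pass does not. With several stacked layers the padding columns themselves pick up non-zero values after the first layer, so a real implementation would have to account for that; the sketch does not attempt to.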

Assignees

Facebook Inc

Inventors

Classifications

  • using classification, e.g. of video objects

  • involving models

  • G06T7/73 (Primary): using feature-based methods

  • Knowledge engineering; Knowledge acquisition

  • Backpropagation, e.g. using gradient descent

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019171903A1 cover?
In one embodiment, a system may access an image and generate a feature map for the image using a neural network. The system may identify regions of interest in the feature map. Regional feature maps may be generated for the regions of interest, respectively. Each of the regional feature maps has a first, a second, and a third dimension. The system may generate a first combined regional feature …
Who is the assignee on this patent?
Facebook Inc
What technology area does this patent fall under?
Primary CPC classification G06T7/73. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 6, 2019 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).