Neural network for object detection in images

US10872276B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10872276-B2
Application number  US-201916424404-A
Country             US
Kind code           B2
Filing date         May 28, 2019
Priority date       Nov 1, 2016
Publication date    Dec 22, 2020
Grant date          Dec 22, 2020

Abstract

Official abstract text for this publication.

Systems, devices, media, and methods are presented for identifying and categorically labeling objects within a set of images. The systems and methods receive an image depicting an object of interest, detect at least a portion of the object of interest within the image using a multilayer object model, determine context information, and identify the object of interest included in two or more bounding boxes.

First claim

Opening claim text (preview).

What is claimed is:

1. A device implemented method for image recognition, the method comprising:
    accessing, using one or more processors of the device coupled to a memory of the device, an image depicting an object of interest;
    generating, by the one or more processors, a set of bounding boxes comprising a first bounding box and a second bounding box within the image, the generating the set of bounding boxes comprising determining an initial resolution for the first bounding box based on training of a multilayer neural network model and at least of a pixel count or a measurement parameter, the measurement parameter based on at least one of color, saturation, brightness, grayscale or lightness from the image, the first bounding box being associated with a first layer of the multilayer neural network model, the second bounding box being associated with a second layer of the multilayer neural network model;
    determining whether the initial resolution of the first bounding box exceeds a specified box resolution;
    in response to determining that the initial resolution of the first bounding box exceeds the specified box resolution, rescaling the resolution of the first bounding box to match the specified box resolution by identifying a center point of the first bounding box and cropping a portion of the first bounding box outside of the specified box resolution with respect to the center point;
    detecting, by the one or more processors, at least a portion of the object of interest within the first bounding box;
    extracting, by the one or more processors, context information from the second bounding box, wherein a layer output of the second layer includes the extracted context information; and
    identifying, by the one or more processors, the entire object of interest based on the at least detected portion of the object of interest and the context information, using the multilayer neural network model, by passing the layer output of the second layer including the extracted context information to the first layer using a deconvolution layer of the multilayer neural network model.

2. The method of claim 1, wherein generating the set of bounding boxes comprises:
    identifying a set of coordinates within the image, the set of coordinates including an indication of one or more boundaries for the image;
    determining a set of sizes and a set of aspect ratios for the set of bounding boxes;
    determining a distribution of bounding boxes to encompass each coordinate of the set of coordinates in at least one bounding box of the set of bounding boxes; and
    generating the set of bounding boxes to distribute the set of bounding boxes uniformly over the image, wherein each bounding box of the set of bounding boxes is generated with a size included in the set of sizes and an aspect ratio included in the set of aspect ratios.

3. The method of claim 1, wherein the first bounding box has a first size and a first aspect ratio and the second bounding box has a second size and a second aspect ratio, and wherein the first size is distinct from the second size and the first aspect ratio is distinct from the second aspect ratio.

4. The method of claim 1, wherein the first layer is associated with a first scale corresponding to the first bounding box, and wherein the second layer is associated with a second scale corresponding to the second bounding box.

5. The method of claim 4, wherein the first layer generates a first confidence score and a first set of coordinates for the object of interest depicted within the first bounding box, and the second layer generates a second confidence score and a second set of coordinates for a background depicted within the second bounding box.

6. The method of claim 1, wherein training of the multilayer neural network model comprises:
    accessing a set of training images, each training image depicting a known object of interest;
    identifying a set of training bounding boxes within the set of training images, each bounding box within the set of training bounding boxes having a set of coordinates identifying a location within a training image, an initial resolution, and a label;
    determining the initial resolution for a training bounding box based on at least a pixel count or measurement parameter, the measurement parameter based on at least one of color, saturation, brightness, grayscale, or lightness from the training image;
    determining whether the initial resolution of the training bounding box exceeds a specified box resolution;
    in response to determining that the initial resolution of the training bounding box exceeds the specified box resolution, rescaling the resolution of the training bounding box to match the specified box resolution by identifying a center point of the training bounding box and by cropping portions of the training bounding box outside of the specified box resolution with respect to the center point;
    initializing one or more model parameters, wherein at least one of the model parameters is based on a value of the measurement parameter of the known object of interest detected during the training; and
    iteratively adjusting the one or more model parameters until a change in averaged loss function values resulting from iterations of the one or more model parameters falls below a change threshold.

7. The method of claim 1, wherein training of the multilayer neural network model comprises:
    accessing a set of training images, each training image depicting a known object of interest; and
    detecting the known objects of interest within the set of training images using the multilayer neural network model, the detection performed with one or more layers of the multilayer neural network model set at a first resolution and one or more layers of the multilayer neural network model set at a second resolution.

8. The method of claim 7, wherein detecting the known objects of interest further comprises: iteratively adjusting one or more model parameters until a change in averaged loss function values falls below a change threshold, the averaged loss function obtained for two or more instances of a training image of the set of training images, each of the two or more instances of the training image having distinct resolutions.

9. The method of claim 1, further comprising training a set of layers of the multilayer neural network model by:
    accessing a set of training images, each training image depicting a known object of interest and containing at least one bounding box comprising a tuple indicating coordinates of the bounding box within a training image and a classification label for the known object of interest; and
    iteratively initializing one or more layers of the set of layers of the multilayer neural network model and adjusting one or more parameters of the one or more layers until a change in averaged loss function values resulting from iterations of the one or more model parameters falls below a change threshold.

10. A system comprising:
    one or more processors; and
    a processor-readable storage device coupled to the one or more processors, the processor-readable storage device storing processor-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
    accessing an image depicting an object of interest;
    generating a set of bounding boxes comprising a first bounding box and a second bounding box within the image, the generating the set of bounding boxes comprising determining an initial resolution for the first bounding box based on training of a multilayer neural network model and at least of a pixel count or a measurement parameter, the measurement parameter based on at least one of color, saturation, bri
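
Two of the claimed steps lend themselves to a short illustration: the center-point crop that brings an oversized bounding box down to a specified box resolution (claims 1 and 6), and the stopping rule that ends training once the change in averaged loss function values falls below a change threshold (claims 6, 8, and 9). The Python sketch below is one possible reading of that claim language, not the patented implementation. It assumes that "resolution" means box width and height in pixels, and the names Box, crop_to_resolution, and has_converged are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical type).

    (x0, y0) is the inclusive top-left corner, (x1, y1) the exclusive
    bottom-right corner.
    """
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def width(self) -> int:
        return self.x1 - self.x0

    @property
    def height(self) -> int:
        return self.y1 - self.y0


def crop_to_resolution(box: Box, max_w: int, max_h: int) -> Box:
    """Crop `box` about its center point if it exceeds max_w x max_h.

    Assumption: "resolution" is read as width and height in pixels.
    A box that already fits is returned unchanged.
    """
    if box.width <= max_w and box.height <= max_h:
        return box
    # Center point of the original box (integer pixel coordinates).
    cx = (box.x0 + box.x1) // 2
    cy = (box.y0 + box.y1) // 2
    # Only shrink a dimension that is too large.
    new_w = min(box.width, max_w)
    new_h = min(box.height, max_h)
    x0 = cx - new_w // 2
    y0 = cy - new_h // 2
    return Box(x0, y0, x0 + new_w, y0 + new_h)


def has_converged(avg_losses: Sequence[float], change_threshold: float) -> bool:
    """Return True once the change between the last two averaged loss
    values falls below `change_threshold`."""
    if len(avg_losses) < 2:
        return False
    return abs(avg_losses[-1] - avg_losses[-2]) < change_threshold
```

Under these assumptions, crop_to_resolution(Box(0, 0, 100, 60), 40, 40) returns Box(30, 10, 70, 50): a 40 by 40 window centered on the original box, with everything outside it cropped away. A training loop would append one averaged loss value per iteration and stop when has_converged reports True.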

Assignees

Snap Inc

Classifications

  • G06V10/82 · Primary

    using neural networks · CPC title

  • G06V10/764 · Primary

    using classification, e.g. of video objects · CPC title

  • Classification techniques · CPC title

  • Combinations of networks · CPC title

  • based on distances to training or reference patterns · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Snap Inc
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 22, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).