Neural network for object detection in images
US-10346723-B2 · Jul 9, 2019 · US
US10872276B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10872276-B2 |
| Application number | US-201916424404-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 28, 2019 |
| Priority date | Nov 1, 2016 |
| Publication date | Dec 22, 2020 |
| Grant date | Dec 22, 2020 |
Official abstract text for this publication.
Systems, devices, media, and methods are presented for identifying and categorically labeling objects within a set of images. The systems and methods receive an image depicting an object of interest, detect at least a portion of the object of interest within the image using a multilayer object model, determine context information, and identify the object of interest included in two or more bounding boxes.
Opening claim text (preview).
What is claimed is:
1. A device implemented method for image recognition, the method comprising: accessing, using one or more processors of the device coupled to a memory of the device, an image depicting an object of interest; generating, by the one or more processors, a set of bounding boxes comprising a first bounding box and a second bounding box within the image, the generating the set of bounding boxes comprising determining an initial resolution for the first bounding box based on training of a multilayer neural network model and at least one of a pixel count or a measurement parameter, the measurement parameter based on at least one of color, saturation, brightness, grayscale or lightness from the image, the first bounding box being associated with a first layer of the multilayer neural network model, the second bounding box being associated with a second layer of the multilayer neural network model; determining whether the initial resolution of the first bounding box exceeds a specified box resolution; in response to determining that the initial resolution of the first bounding box exceeds the specified box resolution, rescaling the resolution of the first bounding box to match the specified box resolution by identifying a center point of the first bounding box and cropping a portion of the first bounding box outside of the specified box resolution with respect to the center point; detecting, by the one or more processors, at least a portion of the object of interest within the first bounding box; extracting, by the one or more processors, context information from the second bounding box, wherein a layer output of the second layer includes the extracted context information; and identifying, by the one or more processors, the entire object of interest based on the at least detected portion of the object of interest and the context information, using the multilayer neural network model, by passing the layer output of the second layer including the extracted context information to the first layer using a deconvolution layer of the multilayer neural network model.
2. The method of claim 1, wherein generating the set of bounding boxes comprises: identifying a set of coordinates within the image, the set of coordinates including an indication of one or more boundaries for the image; determining a set of sizes and a set of aspect ratios for the set of bounding boxes; determining a distribution of bounding boxes to encompass each coordinate of the set of coordinates in at least one bounding box of the set of bounding boxes; and generating the set of bounding boxes to distribute the set of bounding boxes uniformly over the image, wherein each bounding box of the set of bounding boxes is generated with a size included in the set of sizes and an aspect ratio included in the set of aspect ratios.
3. The method of claim 1, wherein the first bounding box has a first size and a first aspect ratio and the second bounding box has a second size and a second aspect ratio, and wherein the first size is distinct from the second size and the first aspect ratio is distinct from the second aspect ratio.
4. The method of claim 1, wherein the first layer is associated with a first scale corresponding to the first bounding box, and wherein the second layer is associated with a second scale corresponding to the second bounding box.
5. The method of claim 4, wherein the first layer generates a first confidence score and a first set of coordinates for the object of interest depicted within the first bounding box, and the second layer generates a second confidence score and a second set of coordinates for a background depicted within the second bounding box.
6. The method of claim 1, wherein training of the multilayer neural network model comprises: accessing a set of training images, each training image depicting a known object of interest; identifying a set of training bounding boxes within the set of training images, each bounding box within the set of training bounding boxes having a set of coordinates identifying a location within a training image, an initial resolution, and a label; determining the initial resolution for a training bounding box based on at least a pixel count or measurement parameter, the measurement parameter based on at least one of color, saturation, brightness, grayscale, or lightness from the training image; determining whether the initial resolution of the training bounding box exceeds a specified box resolution; in response to determining that the initial resolution of the training bounding box exceeds the specified box resolution, rescaling the resolution of the training bounding box to match the specified box resolution by identifying a center point of the training bounding box and by cropping portions of the training bounding box outside of the specified box resolution with respect to the center point; initializing one or more model parameters, wherein at least one of the model parameters is based on a value of the measurement parameter of the known object of interest detected during the training; and iteratively adjusting the one or more model parameters until a change in averaged loss function values resulting from iterations of the one or more model parameters falls below a change threshold.
7. The method of claim 1, wherein training of the multilayer neural network model comprises: accessing a set of training images, each training image depicting a known object of interest; and detecting the known objects of interest within the set of training images using the multilayer neural network model, the detection performed with one or more layers of the multilayer neural network model set at a first resolution and one or more layers of the multilayer neural network model set at a second resolution.
8. The method of claim 7, wherein detecting the known objects of interest further comprises: iteratively adjusting one or more model parameters until a change in averaged loss function values falls below a change threshold, the averaged loss function obtained for two or more instances of a training image of the set of training images, each of the two or more instances of the training image having distinct resolutions.
9. The method of claim 1, further comprising training a set of layers of the multilayer neural network model by: accessing a set of training images, each training image depicting a known object of interest and containing at least one bounding box comprising a tuple indicating coordinates of the bounding box within a training image and a classification label for the known object of interest; and iteratively initializing one or more layers of the set of layers of the multilayer neural network model and adjusting one or more parameters of the one or more layers until a change in averaged loss function values resulting from iterations of the one or more model parameters falls below a change threshold.
10. A system comprising: one or more processors; and a processor-readable storage device coupled to the one or more processors, the processor-readable storage device storing processor-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: accessing an image depicting an object of interest; generating a set of bounding boxes comprising a first bounding box and a second bounding box within the image, the generating the set of bounding boxes comprising determining an initial resolution for the first bounding box based on training of a multilayer neural network model and at least one of a pixel count or a measurement parameter, the measurement parameter based on at least one of color, saturation, bri
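Illustrative sketch (not part of the patent text): the center-point cropping step recited in claims 1 and 6 might look roughly like the following. The `Box` type, the `rescale_to_limit` name, and the reading of "specified box resolution" as a maximum width and height in pixels are all assumptions made for illustration; the claims do not fix a particular representation.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical axis-aligned box in pixel coordinates: [x0, x1) by [y0, y1)."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def width(self) -> int:
        return self.x1 - self.x0

    @property
    def height(self) -> int:
        return self.y1 - self.y0


def rescale_to_limit(box: Box, max_w: int, max_h: int) -> Box:
    """Crop `box` about its center point when it exceeds max_w x max_h.

    Assumption: a box "exceeds" the specified resolution when either
    dimension is larger than the limit. Boxes within the limit are
    returned unchanged.
    """
    if box.width <= max_w and box.height <= max_h:
        return box
    # Identify the center point of the box.
    cx = (box.x0 + box.x1) // 2
    cy = (box.y0 + box.y1) // 2
    # Keep only the part of the box that fits in a window around the center.
    new_w = min(box.width, max_w)
    new_h = min(box.height, max_h)
    x0 = cx - new_w // 2
    y0 = cy - new_h // 2
    return Box(x0, y0, x0 + new_w, y0 + new_h)
```

With a 300 x 300 limit, a 400 x 200 box at the origin is trimmed by 50 pixels on each side horizontally and left untouched vertically, so the crop stays centered on the original box.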
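Illustrative sketch (not part of the patent text): claim 2 describes distributing bounding boxes uniformly over the image, each with a size and aspect ratio drawn from predetermined sets. One common way to do this is a regular grid of box centers, as below. The `generate_boxes` name, the `stride` parameter, and the convention that a box of size `s` and ratio `r` has area `s * s` and width-to-height ratio `r` are assumptions; the patent may use a different scheme.

```python
from itertools import product


def generate_boxes(img_w, img_h, sizes, aspect_ratios, stride):
    """Return (cx, cy, w, h) boxes centered on a regular grid over the image.

    Every grid position receives one box for each (size, aspect ratio)
    pair, so the boxes are spread evenly across the image.
    """
    boxes = []
    # Grid centers are `stride` pixels apart, offset by half a stride.
    xs = [x + stride / 2 for x in range(0, img_w, stride)]
    ys = [y + stride / 2 for y in range(0, img_h, stride)]
    for cy, cx in product(ys, xs):
        for size, ratio in product(sizes, aspect_ratios):
            # Assumed convention: area stays size**2, width / height == ratio.
            w = size * ratio ** 0.5
            h = size / ratio ** 0.5
            boxes.append((cx, cy, w, h))
    return boxes
```

For a 64 x 32 image with one size, two aspect ratios, and a stride of 16, this yields 4 x 2 grid positions and therefore 16 boxes.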
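Illustrative sketch (not part of the patent text): claims 6, 8, and 9 stop training once the change in averaged loss values falls below a change threshold. The loop below shows one possible reading, comparing the mean loss of consecutive fixed-size windows of iterations. The `train_until_converged` name, the `window` parameter, and the toy quadratic loss are hypothetical stand-ins; the patent does not specify how the average is formed.

```python
def train_until_converged(step, threshold, window=5, max_iters=10_000):
    """Call step() until the windowed mean loss changes by less than threshold.

    `step` is a hypothetical callable that adjusts the model parameters
    once and returns the loss for that iteration.
    """
    if max_iters < 1:
        raise ValueError("max_iters must be at least 1")
    losses = []
    prev_avg = None
    for i in range(1, max_iters + 1):
        losses.append(step())
        if i % window == 0:
            avg = sum(losses[-window:]) / window
            # Stop when consecutive averaged losses barely differ.
            if prev_avg is not None and abs(prev_avg - avg) < threshold:
                return i, avg
            prev_avg = avg
    tail = losses[-window:]
    return max_iters, sum(tail) / len(tail)


# Toy stand-in for a model: gradient descent on loss(w) = (w - 3)^2.
state = {"w": 0.0}


def toy_step():
    w = state["w"]
    loss = (w - 3.0) ** 2
    state["w"] = w - 0.1 * 2.0 * (w - 3.0)  # learning rate 0.1
    return loss


iters, final_avg = train_until_converged(toy_step, threshold=1e-6)
```

In the toy run the loop exits well before `max_iters`, with `state["w"]` close to the minimizer at 3.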
using neural networks · CPC title
using classification, e.g. of video objects · CPC title
Classification techniques · CPC title
Combinations of networks · CPC title
based on distances to training or reference patterns · CPC title