Neural network for object detection in images

US11645834B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11645834-B2
Application number  US-202016949856-A
Country             US
Kind code           B2
Filing date         Nov 17, 2020
Priority date       Nov 1, 2016
Publication date    May 9, 2023
Grant date          May 9, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, devices, media, and methods are presented for identifying and categorically labeling objects within a set of images. The systems and methods receive an image depicting an object of interest, detect at least a portion of the object of interest within the image using a multilayer object model, determine context information, and identify the object of interest included in two or more bounding boxes.

First claim

Opening claim text (preview).

What is claimed is:

1. A method for training a multilayer object model, the method comprising: accessing, by one or more hardware processors, a set of training images, each training image depicting a known object of interest; identifying, by the one or more hardware processors, a set of bounding boxes within the set of training images, each individual bounding box in the set of bounding boxes having a resolution; for each given bounding box in the set of bounding boxes: determining, by the one or more hardware processors, whether the resolution of the given bounding box exceeds a specified box resolution; and in response to determining that the resolution of the given bounding box exceeds the specified box resolution, rescaling, by the one or more hardware processors, the resolution of the given bounding box to match the specified box resolution by: identifying a center point of the given bounding box; and modifying the given bounding box by cropping at least one portion of the given bounding box, outside of the specified box resolution, with respect to the center point, the specified box resolution being determined based on at least one of a type of the known object of interest or information within a label of the given bounding box; initializing, by the one or more hardware processors, one or more model parameters of the multilayer object model; and iteratively adjusting, by the one or more hardware processors, the one or more model parameters while using the multilayer object model to detect the known object of interest in the set of bounding boxes, the iteratively adjusting being performed until a change in averaged loss function values resulting from iterations of the one or more model parameters falls below a change threshold.

2. The method of claim 1, wherein the averaged loss function value is obtained for two or more instances of a training image of the set of training images, each of the two or more instances of the training image having distinct resolutions.

3. The method of claim 1, wherein each individual bounding box in the set of bounding boxes has a label.

4. The method of claim 1, wherein each individual bounding box in the set of bounding boxes has a set of coordinates that identify a location within a training image of the set of training images.

5. The method of claim 1, wherein the iteratively adjusting the one or more model parameters comprises iteratively adjusting the one or more model parameters using a gradient descent algorithm.

6. The method of claim 5, wherein the gradient descent algorithm is calculated using back propagation.

7. The method of claim 1, wherein the initializing the one or more model parameters of the multilayer object model comprises initializing the one or more model parameters using a Gaussian distribution.

8. The method of claim 1, wherein at least one training image of the set of training images comprises data or be associated with data that indicates at least one of an identity, a class, the type, or another identifying information for the known object of interest in the at least one training image.

9. The method of claim 1, wherein at least one training image of the set of training images comprises data or be associated that identifies a location of at least a portion of the known object of interest within the at least one training image.

10. A system comprising: one or more processors; and a processor-readable storage device coupled to the one or more processors, the processor-readable storage device storing processor-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations to train a multilayer object model, the operations comprising: accessing a set of training images, each training image depicting a known object of interest; identifying a set of bounding boxes within the set of training images, each individual bounding box in the set of bounding boxes having a resolution; for each given bounding box in the set of bounding boxes: determining whether the resolution of the given bounding box exceeds a specified box resolution; and in response to determining that the resolution of the given bounding box exceeds the specified box resolution, rescaling the resolution of the given bounding box to match the specified box resolution by: identifying a center point of the given bounding box; and modifying the given bounding box by cropping at least one portion of the given bounding box, outside of the specified box resolution, with respect to the center point, the specified box resolution being determined based on at least one of a type of the known object of interest or information within a label of the given bounding box; initializing one or more model parameters of the multilayer object model; and iteratively adjusting the one or more model parameters while using the multilayer object model to detect the known object of interest in the set of bounding boxes, the iteratively adjusting being performed until a change in averaged loss function values resulting from iterations of the one or more model parameters falls below a change threshold.

11. The system of claim 10, wherein the averaged loss function value is obtained for two or more instances of a training image of the set of training images, each of the two or more instances of the training image having distinct resolutions.

12. The system of claim 10, wherein each individual bounding box in the set of bounding boxes has a label.

13. The system of claim 10, wherein each individual bounding box in the set of bounding boxes has a set of coordinates that identify a location within a training image of the set of training images.

14. The system of claim 10, wherein the iteratively adjusting the one or more model parameters comprises iteratively adjusting the one or more model parameters using a gradient descent algorithm.

15. The system of claim 14, wherein the gradient descent algorithm is calculated using back propagation.

16. The system of claim 10, wherein the initializing the one or more model parameters of the multilayer object model comprises initializing the one or more model parameters using a Gaussian distribution.

17. The system of claim 10, wherein at least one training image of the set of training images comprises data or be associated with data that indicates at least one of an identity, a class, the type, or another identifying information for the known object of interest in the at least one training image.

18. The system of claim 10, wherein at least one training image of the set of training images comprises data or be associated that identifies a location of at least a portion of the known object of interest within the at least one training image.

19. A processor-readable storage device storing processor-executable instructions that, when executed by one or more processors of a machine, cause the machine to perform operations to train a multilayer object model, the operations comprising: accessing a set of training images, each training image depicting a known object of interest; identifying a set of bounding boxes within the set of training images, each individual bounding box in the set of bounding boxes having a resolution; for each given bounding box in the set of bounding boxes: determining whether the resolution of the given bounding box exceeds a specified box resolution; and in response to determining that the resolution of the given bounding box exceeds the specified box resolution, rescaling the resoluti
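
Illustrative sketch (not part of the patent text). The two core steps recited in claim 1, center-cropping an oversized bounding box and iterating until the change in averaged loss falls below a threshold, might look roughly like the Python below. Everything here is an assumption made for illustration: "resolution" is read as pixel width and height, boxes are (x0, y0, x1, y1) tuples, and the function names, learning rate, Gaussian standard deviation, and toy quadratic loss are hypothetical stand-ins for the multilayer object model, which the claims do not specify at this level.

```python
import random


def crop_box_to_resolution(box, max_w, max_h):
    """Center-crop an (x0, y0, x1, y1) box so it is at most max_w x max_h.

    Assumption: "resolution" means the box's width and height in pixels.
    """
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    if w <= max_w and h <= max_h:
        return box  # box does not exceed the specified resolution
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2  # center point of the box
    new_w, new_h = min(w, max_w), min(h, max_h)
    # Keep only the region within the specified resolution around the center.
    return (cx - new_w / 2, cy - new_h / 2, cx + new_w / 2, cy + new_h / 2)


def train_until_converged(loss_and_grad, n_params, lr=0.1,
                          change_threshold=1e-8, max_iters=10_000, seed=0):
    """Plain gradient descent with a hypothetical stopping rule.

    loss_and_grad(params) must return (averaged_loss, gradients).
    Stops when the averaged loss changes by less than change_threshold
    between consecutive iterations, or after max_iters iterations.
    """
    rng = random.Random(seed)
    # Gaussian initialization (cf. claim 7); the 0.01 scale is an assumption.
    params = [rng.gauss(0.0, 0.01) for _ in range(n_params)]
    prev_loss = None
    for step in range(1, max_iters + 1):
        avg_loss, grads = loss_and_grad(params)
        if prev_loss is not None and abs(prev_loss - avg_loss) < change_threshold:
            return params, step
        params = [p - lr * g for p, g in zip(params, grads)]
        prev_loss = avg_loss
    return params, max_iters


def toy_loss_and_grad(params, targets=(1.0, 3.0)):
    """Toy stand-in for a model: squared error averaged over two targets."""
    p = params[0]
    loss = sum((p - t) ** 2 for t in targets) / len(targets)
    grad = sum(2 * (p - t) for t in targets) / len(targets)
    return loss, [grad]


if __name__ == "__main__":
    # A 100 x 40 box cropped to at most 64 x 64 keeps its height.
    print(crop_box_to_resolution((0, 0, 100, 40), 64, 64))  # (18.0, 0.0, 82.0, 40.0)
    params, steps = train_until_converged(toy_loss_and_grad, n_params=1)
    print(params, steps)
```

In this sketch the averaging happens inside the loss callback, loosely mirroring claim 2, where the averaged loss is taken over two or more instances of a training image. A real detector would replace the toy loss with the model's own loss and would typically obtain gradients by back propagation, as in claims 5 and 6.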

Assignees

Snap Inc

Inventors

Classifications

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • G06V10/82 · Primary

    using neural networks · CPC title

  • G06V10/764 · Primary

    using classification, e.g. of video objects · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11645834B2 cover?
Systems, devices, media, and methods are presented for identifying and categorically labeling objects within a set of images. The systems and methods receive an image depicting an object of interest, detect at least a portion of the object of interest within the image using a multilayer object model, determine context information, and identify the object of interest included in two or more boun…
Who is the assignee on this patent?
Snap Inc
What technology area does this patent fall under?
Primary CPC classification G06V10/82. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 9, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).