Neural network for object detection in images
US-10872276-B2 · Dec 22, 2020 · US
US11645834B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11645834-B2 |
| Application number | US-202016949856-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 17, 2020 |
| Priority date | Nov 1, 2016 |
| Publication date | May 9, 2023 |
| Grant date | May 9, 2023 |
Official abstract text for this publication.
Systems, devices, media, and methods are presented for identifying and categorically labeling objects within a set of images. The systems and methods receive an image depicting an object of interest, detect at least a portion of the object of interest within the image using a multilayer object model, determine context information, and identify the object of interest included in two or more bounding boxes.
Opening claim text (preview).
What is claimed is:

1. A method for training a multilayer object model, the method comprising: accessing, by one or more hardware processors, a set of training images, each training image depicting a known object of interest; identifying, by the one or more hardware processors, a set of bounding boxes within the set of training images, each individual bounding box in the set of bounding boxes having a resolution; for each given bounding box in the set of bounding boxes: determining, by the one or more hardware processors, whether the resolution of the given bounding box exceeds a specified box resolution; and in response to determining that the resolution of the given bounding box exceeds the specified box resolution, rescaling, by the one or more hardware processors, the resolution of the given bounding box to match the specified box resolution by: identifying a center point of the given bounding box; and modifying the given bounding box by cropping at least one portion of the given bounding box, outside of the specified box resolution, with respect to the center point, the specified box resolution being determined based on at least one of a type of the known object of interest or information within a label of the given bounding box; initializing, by the one or more hardware processors, one or more model parameters of the multilayer object model; and iteratively adjusting, by the one or more hardware processors, the one or more model parameters while using the multilayer object model to detect the known object of interest in the set of bounding boxes, the iteratively adjusting being performed until a change in averaged loss function values resulting from iterations of the one or more model parameters falls below a change threshold.

2. The method of claim 1, wherein the averaged loss function value is obtained for two or more instances of a training image of the set of training images, each of the two or more instances of the training image having distinct resolutions.

3. The method of claim 1, wherein each individual bounding box in the set of bounding boxes has a label.

4. The method of claim 1, wherein each individual bounding box in the set of bounding boxes has a set of coordinates that identify a location within a training image of the set of training images.

5. The method of claim 1, wherein the iteratively adjusting the one or more model parameters comprises iteratively adjusting the one or more model parameters using a gradient descent algorithm.

6. The method of claim 5, wherein the gradient descent algorithm is calculated using back propagation.

7. The method of claim 1, wherein the initializing the one or more model parameters of the multilayer object model comprises initializing the one or more model parameters using a Gaussian distribution.

8. The method of claim 1, wherein at least one training image of the set of training images comprises data or be associated with data that indicates at least one of an identity, a class, the type, or another identifying information for the known object of interest in the at least one training image.

9. The method of claim 1, wherein at least one training image of the set of training images comprises data or be associated that identifies a location of at least a portion of the known object of interest within the at least one training image.

10. A system comprising: one or more processors; and a processor-readable storage device coupled to the one or more processors, the processor-readable storage device storing processor-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations to train a multilayer object model, the operations comprising: accessing a set of training images, each training image depicting a known object of interest; identifying a set of bounding boxes within the set of training images, each individual bounding box in the set of bounding boxes having a resolution; for each given bounding box in the set of bounding boxes: determining whether the resolution of the given bounding box exceeds a specified box resolution; and in response to determining that the resolution of the given bounding box exceeds the specified box resolution, rescaling the resolution of the given bounding box to match the specified box resolution by: identifying a center point of the given bounding box; and modifying the given bounding box by cropping at least one portion of the given bounding box, outside of the specified box resolution, with respect to the center point, the specified box resolution being determined based on at least one of a type of the known object of interest or information within a label of the given bounding box; initializing one or more model parameters of the multilayer object model; and iteratively adjusting the one or more model parameters while using the multilayer object model to detect the known object of interest in the set of bounding boxes, the iteratively adjusting being performed until a change in averaged loss function values resulting from iterations of the one or more model parameters falls below a change threshold.

11. The system of claim 10, wherein the averaged loss function value is obtained for two or more instances of a training image of the set of training images, each of the two or more instances of the training image having distinct resolutions.

12. The system of claim 10, wherein each individual bounding box in the set of bounding boxes has a label.

13. The system of claim 10, wherein each individual bounding box in the set of bounding boxes has a set of coordinates that identify a location within a training image of the set of training images.

14. The system of claim 10, wherein the iteratively adjusting the one or more model parameters comprises iteratively adjusting the one or more model parameters using a gradient descent algorithm.

15. The system of claim 14, wherein the gradient descent algorithm is calculated using back propagation.

16. The system of claim 10, wherein the initializing the one or more model parameters of the multilayer object model comprises initializing the one or more model parameters using a Gaussian distribution.

17. The system of claim 10, wherein at least one training image of the set of training images comprises data or be associated with data that indicates at least one of an identity, a class, the type, or another identifying information for the known object of interest in the at least one training image.

18. The system of claim 10, wherein at least one training image of the set of training images comprises data or be associated that identifies a location of at least a portion of the known object of interest within the at least one training image.

19. A processor-readable storage device storing processor-executable instructions that, when executed by one or more processors of a machine, cause the machine to perform operations to train a multilayer object model, the operations comprising: accessing a set of training images, each training image depicting a known object of interest; identifying a set of bounding boxes within the set of training images, each individual bounding box in the set of bounding boxes having a resolution; for each given bounding box in the set of bounding boxes: determining whether the resolution of the given bounding box exceeds a specified box resolution; and in response to determining that the resolution of the given bounding box exceeds the specified box resolution, rescaling the resoluti
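The two mechanical steps recited in claim 1, center-anchored cropping of oversized bounding boxes and training until the change in averaged loss falls below a threshold, can be illustrated with a minimal Python sketch. This is not the patented implementation. It makes several assumptions that do not appear in the claims: "resolution" is read as a box's width and height in pixels, "exceeds" is read as either dimension being larger than the specified size, the `SPEC_BY_LABEL` table and all function names are hypothetical, and a two-parameter linear model with mean squared error stands in for the multilayer object model (so the gradient is written out analytically rather than obtained by back propagation).

```python
import random
from dataclasses import dataclass


@dataclass
class Box:
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str


# Hypothetical lookup: specified box size (width, height) chosen from the label.
SPEC_BY_LABEL = {"dog": (40.0, 40.0)}


def center_crop(box: Box) -> Box:
    """Crop a box about its center point if it exceeds the specified size."""
    spec_w, spec_h = SPEC_BY_LABEL[box.label]
    width = box.x_max - box.x_min
    height = box.y_max - box.y_min
    if width <= spec_w and height <= spec_h:
        return box  # Does not exceed the specified size; leave unchanged.
    cx = (box.x_min + box.x_max) / 2.0
    cy = (box.y_min + box.y_max) / 2.0
    new_w = min(width, spec_w)
    new_h = min(height, spec_h)
    return Box(cx - new_w / 2, cy - new_h / 2,
               cx + new_w / 2, cy + new_h / 2, box.label)


def train(samples, lr=0.05, change_threshold=1e-9, max_iters=10_000, seed=0):
    """Gradient descent on a toy linear model y = w*x + b.

    Stops once the change in averaged loss between iterations drops
    below change_threshold (or after max_iters as a safety bound).
    """
    rng = random.Random(seed)
    # Gaussian initialization of the model parameters (cf. claim 7).
    w, b = rng.gauss(0.0, 0.01), rng.gauss(0.0, 0.01)
    n = len(samples)
    prev_loss = None
    loss = float("inf")
    for _ in range(max_iters):
        errors = [(w * x + b - y, x) for x, y in samples]
        loss = sum(e * e for e, _ in errors) / n  # averaged loss
        if prev_loss is not None and abs(prev_loss - loss) < change_threshold:
            break
        grad_w = sum(2 * e * x for e, x in errors) / n
        grad_b = sum(2 * e for e, _ in errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
        prev_loss = loss
    return w, b, loss


cropped = center_crop(Box(0, 0, 100, 50, "dog"))
w, b, final_loss = train([(0, 1), (1, 3), (2, 5), (3, 7)])
```

In this sketch a 100 x 50 box labeled `dog` is trimmed to 40 x 40 around its center, and the toy model settles near `w = 2`, `b = 1` for the sample data. A real detector would replace the linear model with a multilayer network and average the loss over batches of cropped boxes, but the stopping rule would keep the same shape.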
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
using neural networks · CPC title
using classification, e.g. of video objects · CPC title