Learning method and learning device for object detector with hardware optimization based on CNN for detection at distance or military purpose using image concatenation, and testing method and testing device using the same

US10387752B1 · US · B1

Patent metadata
Field: Value
Publication number: US-10387752-B1
Application number: US-201916254279-A
Country: US
Kind code: B1
Filing date: Jan 22, 2019
Priority date: Jan 22, 2019
Publication date: Aug 20, 2019
Grant date: Aug 20, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for learning parameters of an object detector with hardware optimization based on a CNN for detection at distance or military purpose using an image concatenation is provided. The CNN can be redesigned when scales of objects change as a focal length or a resolution changes depending on the KPI. The method includes steps of: (a) concatenating n manipulated images which correspond to n target regions; (b) instructing an RPN to generate first to n-th object proposals in the n manipulated images by using an integrated feature map, and instructing a pooling layer to apply pooling operations to regions, corresponding to the first to the n-th object proposals, on the integrated feature map; and (c) instructing an FC loss layer to generate first to n-th FC losses by referring to the object detection information, outputted from an FC layer.
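Step (b) of the abstract has the RPN produce first to n-th object proposals "in the n manipulated images" from a single integrated feature map. One way to picture the bookkeeping this implies is the minimal sketch below. It is an illustration only, not the patent's procedure: the function name `group_proposals`, the rule of assigning a box by its centre, and the `offsets`/`widths` inputs (where each manipulated image starts in the integrated image and how wide it is) are all assumptions introduced here.

```python
from typing import Dict, List, Tuple

# (x1, y1, x2, y2) in integrated-image pixels; illustrative type, not from the patent.
Box = Tuple[float, float, float, float]


def group_proposals(
    proposals: List[Box], offsets: List[int], widths: List[int]
) -> Dict[int, List[Box]]:
    """Assign each proposal to the manipulated image whose x-span holds its centre.

    Assumption: images were concatenated along the width, so only x needs shifting.
    Returned boxes are expressed in the coordinate frame of their own image.
    Proposals whose centre falls in a gap between images are dropped.
    """
    groups: Dict[int, List[Box]] = {i: [] for i in range(len(offsets))}
    for x1, y1, x2, y2 in proposals:
        cx = (x1 + x2) / 2
        for i, (off, w) in enumerate(zip(offsets, widths)):
            if off <= cx < off + w:
                groups[i].append((x1 - off, y1, x2 - off, y2))
                break
    return groups


# Hypothetical layout: image 0 spans x in [0, 20), image 1 spans x in [36, 41).
example = group_proposals(
    proposals=[(2, 1, 8, 6), (37, 0, 40, 4), (25, 0, 30, 5)],
    offsets=[0, 36],
    widths=[20, 5],
)
print(example)
```

The third box in the example has its centre in the gap between the two images, so under the assumed rule it belongs to neither group.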

First claim

Opening claim text (preview).

What is claimed is: 1. A method for learning parameters of an object detector based on a CNN using an image concatenation, comprising steps of: (a) a learning device, if at least one training image is acquired, (i) instructing a target region estimating network to estimate a first target region to an n-th target region on the training image or its one or more resized images, wherein one or more corresponding target objects are estimated as located on each of the first target region to the n-th target region, (ii) instructing an image-manipulating network to generate a first manipulated image to an n-th manipulated image, from the training image or its resized images, each of which corresponds to each of the first target region to the n-th target region, and (iii) generating an integrated training image by concatenating the first manipulated image to the n-th manipulated image; (b) the learning device (i) instructing one or more convolutional layers to generate at least one integrated feature map by applying one or more convolution operations to the integrated training image, (ii) instructing an RPN to generate each of first object proposals to n-th object proposals, corresponding to one or more objects, in each of the first manipulated image to the n-th manipulated image by using the integrated feature map, (iii) instructing a pooling layer to apply one or more pooling operations to each region, corresponding to each of the first object proposals to the n-th object proposals, on the integrated feature map, to thereby generate at least one pooled integrated feature map, and (iv) instructing an FC layer to apply at least one fully connected operation to the pooled integrated feature map, to thereby generate first object detection information to n-th object detection information corresponding to the objects; and (c) the learning device instructing at least one FC loss layer to generate one or more first FC losses to one or more n-th FC losses by referring to the first 
object detection information to the n-th object detection information and their corresponding GTs, to thereby adjust at least part of parameters of the FC layer and the convolutional layers by backpropagating the first FC losses to the n-th FC losses. 2. The method of claim 1, wherein, after the step of (b), the learning device instructs at least one RPN loss layer to generate one or more first RPN losses to one or more n-th RPN losses by referring to the first object proposals to the n-th object proposals and their corresponding GTs, to thereby adjust at least part of parameters of the RPN by backpropagating the first RPN losses to the n-th RPN losses. 3. The method of claim 1, wherein, at the step of (a), the learning device instructs the image-manipulating network to adjust at least one of widths and lengths of the first manipulated image to the n-th manipulated image to be identical, and concatenates the first adjusted manipulated image to the n-th adjusted manipulated image in a direction of the widths or the lengths which are adjusted to be identical. 4. The method of claim 3, wherein the learning device instructs the image-manipulating network to add at least one zero padding region in-between each pair comprised of two neighboring adjusted manipulated images which are concatenated, among the first adjusted manipulated image to the n-th adjusted manipulated image. 5. The method of claim 4, wherein the integrated training image is reduced by a ratio of 1/S by multiple convolution operations of the convolutional layers, and wherein, if a maximum size of each kernel of each of the convolutional layers is K×K, a distance in-between said each pair comprised of the two neighboring adjusted manipulated images is determined as S × (K − 1) / 2. 6.
The method of claim 1, wherein, at the step of (a), the learning device instructs the target region estimating network to calculate each scale histogram for each of the training image or its resized images and estimate the first target region to the n-th target region corresponding to scale proposals where the corresponding target objects are estimated as located, by referring to the scale histogram. 7. The method of claim 1, wherein, at the step of (a), the learning device instructs the image-manipulating network to generate the first manipulated image to the n-th manipulated image by cropping one or more regions corresponding to the first target region to the n-th target region on the training image or its resized images, or instructs the image-manipulating network to generate the first manipulated image to the n-th manipulated image by cropping and resizing one or more regions corresponding to the first target region to the n-th target region on the training image or its resized images. 8. The method of claim 1, wherein the first target region to the n-th target region correspond to multiple different target objects among the target objects in the training image, or correspond to at least one identical target object in the training image and its resized images. 9.
A method for testing an object detector based on a CNN using an image concatenation, comprising steps of: (a) on condition that a learning device (1) (i) has instructed a target region estimating network to estimate a first target region for training to an n-th target region for training on at least one training image or its one or more resized images for training, wherein one or more corresponding target objects for training are estimated as located on each of the first target region for training to the n-th target region for training, (ii) has instructed an image-manipulating network to generate a first manipulated image for training to an n-th manipulated image for training, from the training image or its resized images for training, each of which corresponds to each of the first target region for training to the n-th target region for training, and (iii) has generated an integrated training image by concatenating the first manipulated image for training to the n-th manipulated image for training, (2) (i) has instructed one or more convolutional layers to generate at least one integrated feature map for training by applying one or more convolution operations to the integrated training image, (ii) has instructed an RPN to generate each of first object proposals for training to n-th object proposals for training, corresponding to one or more objects for training, in each of the first manipulated image for training to the n-th manipulated image for training by using the integrated feature map for training, (iii) has instructed a pooling layer to apply one or more pooling operations to each region, corresponding to each of the first object proposals for training to the n-th object proposals for training, on the integrated feature map for training, to thereby generate at least one pooled integrated feature map for training, and (iv) has instructed an FC layer to apply at least one fully connected operation to the pooled integrated feature map for training, to thereby 
generate first object detection information for training to n-th object detection information for training corresponding to the objects for training, and (3) has instructed at least one FC loss layer to generate one or more first FC losses to one or more n-th FC losses by referring to the first object detection
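Claims 3 to 5 above describe adjusting the manipulated images to a common size, concatenating them along that dimension, and separating neighbours with a zero padding region whose width is S × (K − 1) / 2. The following is a minimal sketch of that layout step under stated assumptions, not the patented implementation: images are grayscale nested lists so the example needs no dependencies, resizing is nearest-neighbour, K is assumed odd, and the names `padding_gap`, `resize_to_height`, and `concatenate_with_padding` are hypothetical.

```python
from typing import List, Tuple

# Grayscale image as rows of pixel values; illustrative type, not from the patent.
Image = List[List[int]]


def padding_gap(s: int, k: int) -> int:
    """Gap between neighbouring images as stated in claim 5: S x (K - 1) / 2."""
    # Assumption: K is odd, so (K - 1) / 2 is a whole number of pixels.
    return s * (k - 1) // 2


def resize_to_height(img: Image, height: int) -> Image:
    """Nearest-neighbour resize to a common height, keeping the aspect ratio."""
    h, w = len(img), len(img[0])
    width = max(1, round(w * height / h))
    return [
        [img[r * h // height][c * w // width] for c in range(width)]
        for r in range(height)
    ]


def concatenate_with_padding(
    images: List[Image], height: int, s: int, k: int
) -> Tuple[Image, List[int]]:
    """Concatenate images along the width with zero-padding gaps in between.

    Returns the integrated image and the x-offset at which each image starts.
    """
    gap = padding_gap(s, k)
    integrated: Image = [[] for _ in range(height)]
    offsets: List[int] = []
    for i, img in enumerate(images):
        resized = resize_to_height(img, height)
        pad = [0] * gap if i > 0 else []  # no padding before the first image
        offsets.append(len(integrated[0]) + len(pad))
        for row_out, row_in in zip(integrated, resized):
            row_out.extend(pad)
            row_out.extend(row_in)
    return integrated, offsets


# Hypothetical inputs: a 10x20 crop and a 20x10 crop, total stride S=16, kernel K=3.
crop_a = [[1] * 20 for _ in range(10)]
crop_b = [[1] * 10 for _ in range(20)]
integrated, offsets = concatenate_with_padding([crop_a, crop_b], height=10, s=16, k=3)
print(len(integrated), len(integrated[0]), offsets)  # prints: 10 41 [0, 36]
```

With S = 16 and K = 3 the gap is 16 × (3 − 1) / 2 = 16 pixels. A plausible reading of the formula is that a gap of this size keeps a K×K kernel on the reduced feature map from mixing responses of two neighbouring images, although the claim preview shown here does not state the rationale.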

Assignees

Stradvision Inc

Inventors

Classifications

  • G06N3/084 (Primary)

    Backpropagation, e.g. using gradient descent · CPC title

  • Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads · CPC title

  • using neural networks · CPC title

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10387752B1 cover?
A method for learning parameters of an object detector with hardware optimization based on a CNN for detection at distance or military purpose using an image concatenation is provided. The CNN can be redesigned when scales of objects change as a focal length or a resolution changes depending on the KPI. The method includes steps of: (a) concatenating n manipulated images which correspond to n t…
Who is the assignee on this patent?
Stradvision Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/084. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 20, 2019 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).