What technology area does this patent fall under?

Primary CPC classification: G06N3/084 (Backpropagation, e.g. using gradient descent). Mapped technology areas include Physics.

When was this patent published?

Published Apr 17, 2018 (kind code B1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method for acquiring bounding box corresponding to an object in an image by using convolutional neural network including tracking network and computing device using the same

US9946960B1 · US · B1

Patent metadata
Field	Value
Publication number	US-9946960-B1
Application number	US-201715783442-A
Country	US
Kind code	B1
Filing date	Oct 13, 2017
Priority date	Oct 13, 2017
Publication date	Apr 17, 2018
Grant date	Apr 17, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for acquiring a bounding box corresponding to an object is provided. The method includes steps of: (a) acquiring proposal boxes; (b) selecting specific proposal box among the proposal boxes by referring to (i) a result of comparing distance between a reference bounding box and the proposal boxes and/or (ii) a result of comparing score which indicates whether the proposal boxes includes the object, and then setting the specific proposal box as a starting area of a tracking box; (c) determining a specific area of the current frame as a target area of the tracking box by using the mean shift tracking algorithm; and (d) allowing a pooling layer to generate a pooled feature map by applying pooling operation to an area corresponding to the specific area and then allowing a FC layer to acquire a bounding box by applying regression operation to the pooled feature map.
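Step (b) of the abstract can be illustrated with a minimal sketch. It assumes boxes are `(x1, y1, x2, y2)` tuples, measures distance as the L2 distance between box centers, and picks the nearest proposal whose score passes a threshold. The function names, the `min_score` and `max_distance` parameters, and the way the two criteria are combined are illustrative assumptions, not details taken from the patent.

```python
import math


def box_center(box):
    """Return the (x, y) center of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def select_starting_box(reference_box, proposals, scores,
                        min_score=0.0, max_distance=None):
    """Pick the index of the proposal to use as the tracking start area.

    Hypothetical rule: among proposals whose objectness score is at least
    `min_score` (and, optionally, whose center lies within `max_distance`
    of the reference center), return the one with the smallest L2
    center-to-center distance. Returns None if no proposal qualifies.
    """
    ref_x, ref_y = box_center(reference_box)
    best_index, best_distance = None, math.inf
    for index, (box, score) in enumerate(zip(proposals, scores)):
        if score < min_score:
            continue
        cx, cy = box_center(box)
        distance = math.hypot(cx - ref_x, cy - ref_y)
        if max_distance is not None and distance > max_distance:
            continue
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


# Reference box from the previous frame and three proposals in the current one.
reference = (10, 10, 20, 20)
proposals = [(0, 0, 10, 10), (12, 12, 22, 22), (14, 14, 16, 16)]
scores = [0.9, 0.8, 0.1]
chosen = select_starting_box(reference, proposals, scores, min_score=0.5)
```

Here the third proposal is centered exactly on the reference box but is rejected by its low score, so the second proposal is chosen. Lowering `min_score` to 0 would select the third one instead.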

First claim

Opening claim text (preview).

What is claimed is:

1. A method for acquiring at least one bounding box corresponding to at least one object in a test image by using a CNN including a tracking network, comprising steps of: (a) a testing device acquiring or supporting another device to acquire multiple proposal boxes, if feature map is generated as a result of applying convolution operation to the test image as a current frame, and then information on the multiple proposal boxes by applying certain operation to the feature map by a Region Proposal Network (RPN) is outputted; (b) the testing device selecting or supporting another device to select at least one specific proposal box among the multiple proposal boxes by referring to at least either of (i) a result of comparing each distance between a reference bounding box of the object in a previous frame and each of the multiple proposal boxes and (ii) a result of comparing each of scores as probability values which indicates whether each of the proposal boxes includes the object, and then setting or another device to set the specific proposal box as a starting area of a tracking box, wherein the starting area is used for a mean shift tracking algorithm; (c) the testing device determining or supporting another device to determine a specific area of the current frame having information on probability similar to that corresponding to pixel data of the object in the previous frame as a target area of the tracking box by using the mean shift tracking algorithm; and (d) the testing device allowing a pooling layer to generate a pooled feature map by applying pooling operation to an area in the feature map corresponding to the specific area and then allowing a FC layer to acquire a bounding box by applying regression operation to the pooled feature map.

2. The method of claim 1, wherein, at the step of (c), the information on probability corresponding to pixel data of the object in the previous frame is a histogram corresponding to pixel data of the bounding box in the previous frame.

3. The method of claim 1, further comprising a step of: (e) the testing device determining the bounding box as a reference bounding box to be used for a tracking box of the object to be located in a next frame.

4. The method of claim 1, wherein, at the step of (b), if the number of the object is plural, the testing device selects or supports another device to select the specific proposal boxes among the multiple proposal boxes by referring to at least either of (i) the result of comparing each of distances between the reference bounding box of the object in the previous frame and each of the multiple proposal boxes and (ii) the result of comparing each of the scores as probability values which indicates whether each of the proposal boxes includes the object, and then setting or another device to set each of the specific proposal boxes as each starting area of each of the tracking boxes.

5. The method of claim 1, wherein, at the step of (b), a distance between the reference bounding box of the object located in the previous frame and each of the multiple proposal boxes is L2 distance between the center coordinate of the reference bounding box and that of each of the multiple proposal boxes.

6. The method of claim 1, wherein, on conditions that a learning device has completed processes of (i) allowing convolutional layers to acquire a feature map for training from a training image including an object for training, (ii) allowing the RPN to acquire one or more proposal boxes for training corresponding to the object for training in the training image, (iii) allowing the pooling layer to generate a pooled feature map for training corresponding to the proposal boxes for training by applying pooling operation, (iv) allowing the FC layer to acquire information on pixel data of a bounding box for training by applying the regression operation to the pooled feature map for training, and (v) allowing a loss layer to acquire comparative data by comparing between the information on pixel data of the bounding box in the training image and that in a GT image, thereby adjusting at least one parameter of the CNN by using the comparative data during backpropagation process, the testing device performs the steps of (a) to (d).

7. The method of claim 1, wherein, at the step of (d), the testing device acquires or supports another device to acquire the bounding box with a size being adjusted to correspond to the object in the test image by processes of generating the pooled feature map and then applying the regression operation through the FC layer.

8. A method for acquiring bounding boxes corresponding to objects in a test image by using a CNN including a tracking network and a detection network, comprising steps of: (a) a testing device acquiring or supporting another device to acquire multiple proposal boxes, if feature map is generated as a result of applying convolution operation to the test image as a current frame, and then information on the multiple proposal boxes by applying certain operation to the feature map by a Region Proposal Network (RPN) is outputted; (b) the testing device (b-1) selecting or supporting another device to select at least one specific proposal box among the multiple proposal boxes by referring to at least either of (i) a result of comparing each distance between a reference bounding box of the object in a previous frame and each of the multiple proposal boxes and (ii) a result of comparing each of scores as probability values which indicates whether each of the proposal boxes includes the object, and then setting or another device to set the specific proposal box as a starting area of a tracking box, wherein the starting area is used for a mean shift tracking algorithm; and (b-2) setting or supporting another device to set at least some of the proposal boxes which have not been set as the tracking box among the multiple proposal boxes as multiple untracked boxes; and (c) (c-1) the testing device, after the step of (b-1), determining or supporting another device to determine a specific area of the current frame having information on probability similar to that corresponding to pixel data of the object in the previous frame as a target area of the tracking box by using the mean shift tracking algorithm; and allowing a first pooling layer to generate a first pooled feature map by applying pooling operation to an area in the feature map corresponding to the specific area and then allowing a FC layer to acquire a first bounding box by applying regression operation to the first pooled feature map; and (c-2) the testing device, after the step of (b-2), allowing a second pooling layer to generate a second pooled feature map by applying pooling operation to an area on the feature map corresponding to at least one of the multiple untracked boxes; and, if the FC layer detects a new object by applying classification operation to the second pooled feature map, allowing the FC layer to acquire a second bounding box by applying regression operation to the second pooled feature map.

9. The method of claim 8, wherein, at the step of (c-2), the testing device determines the second bounding box corresponding to the new object as a reference bounding box to be used for a tracking box of the new object included in a next frame.

10. The method of claim 8, wherein, at the step of (b-2), at least one specific untracked box is selected among the multiple untracked boxes by referring to at least either of (i) each of L2 distances between the reference bounding box acquired from the previous frame and each of the multiple untracked boxes and (ii) each of scores as probability values which indicates whether each of the multiple untracked boxes includes
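The mean shift step in claims 1 and 2 can be sketched roughly as follows. The sketch assumes single-channel 8-bit frames, a 16-bin intensity histogram as the "information on probability", histogram back-projection to weight each pixel, and integer window shifts toward the weighted centroid. All function names and parameters are hypothetical, and the patent does not specify these particular choices.

```python
import numpy as np


def intensity_histogram(patch, bins=16):
    """Normalized intensity histogram of an 8-bit grayscale patch."""
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)


def back_project(frame, hist):
    """Replace each pixel with the histogram probability of its intensity bin."""
    bins = len(hist)
    bin_index = (frame.astype(np.int64) * bins) // 256
    return hist[bin_index]


def mean_shift(weights, window, max_iter=20):
    """Shift an (x, y, w, h) window to the local weighted centroid.

    Repeats until the window stops moving or `max_iter` is reached.
    """
    x, y, w, h = window
    frame_h, frame_w = weights.shape
    for _ in range(max_iter):
        roi = weights[y:y + h, x:x + w]
        total = roi.sum()
        if total <= 0:
            break  # nothing resembling the object under the window
        ys, xs = np.mgrid[0:roi.shape[0], 0:roi.shape[1]]
        centroid_x = (xs * roi).sum() / total
        centroid_y = (ys * roi).sum() / total
        # Offset of the centroid from the window center, rounded half up.
        dx = int(np.floor(centroid_x - (w - 1) / 2.0 + 0.5))
        dy = int(np.floor(centroid_y - (h - 1) / 2.0 + 0.5))
        new_x = min(max(x + dx, 0), frame_w - w)
        new_y = min(max(y + dy, 0), frame_h - h)
        if new_x == x and new_y == y:
            break  # converged
        x, y = new_x, new_y
    return (x, y, w, h)


# Previous frame: a bright 10x10 object whose top-left corner is at (10, 10).
previous = np.zeros((40, 40), dtype=np.uint8)
previous[10:20, 10:20] = 200
model = intensity_histogram(previous[10:20, 10:20])

# Current frame: the object has moved so its top-left corner is at (16, 14).
current = np.zeros((40, 40), dtype=np.uint8)
current[14:24, 16:26] = 200

# Start from a nearby proposal box and let mean shift find the target area.
target = mean_shift(back_project(current, model), window=(12, 12, 10, 10))
```

In the claimed method the resulting target area is not the final output: the corresponding region of the feature map is pooled and passed to the FC layer, which regresses the bounding box. That part needs trained network weights and is not sketched here.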

Assignees

Stradvision Inc

Inventors

Classifications

G06V10/82
using neural networks · CPC title
G06V10/764
using classification, e.g. of video objects · CPC title
G06F18/214
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06N3/084 · Primary
Backpropagation, e.g. using gradient descent · CPC title
G06T7/20
Analysis of motion (motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/43, H04N19/51) · CPC title

Patent family

Related publications grouped by family.

View patent family 61872587

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9946960B1 cover?: A method for acquiring a bounding box corresponding to an object is provided. The method includes steps of: (a) acquiring proposal boxes; (b) selecting specific proposal box among the proposal boxes by referring to (i) a result of comparing distance between a reference bounding box and the proposal boxes and/or (ii) a result of comparing score which indicates whether the proposal boxes includes…
Who is the assignee on this patent?: Stradvision Inc