Fast template-based tracking

US9773192B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9773192-B2
Application number  US-201514732738-A
Country             US
Kind code           B2
Filing date         Jun 7, 2015
Priority date       Jun 7, 2015
Publication date    Sep 26, 2017
Grant date          Sep 26, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques to identify and track a pre-identified region-of-interest (ROI) through a temporal sequence of frames/images are described. In general, a down-sampled color gradient (edge map) of an arbitrary sized ROI from a prior frame may be used to generate a small template. This initial template may be used to identify a region of a new or current frame that may be overscan and used to create a current frame's edge map. By comparing the prior frame's template to the current frame's edge map, a cost value or image may be found and used to identify the current frame's ROI center. The size of the current frame's ROI may be found by varying the size of putative new ROIs and testing for their congruence with the prior frame's template. Subsequent ROI's for subsequent frames may be identified to, effectively, track an arbitrarily sized ROI through a sequence of video frames.
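
The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: frames are grayscale lists of lists (the abstract describes a color gradient), down-sampling is nearest-neighbour, congruence is measured as a sum of absolute differences, and every name here (make_template, locate, select_size, pad, sides) is hypothetical.

```python
def crop(img, top, left, h, w):
    """Return an h-by-w window; coordinates outside the image are clamped."""
    rows, cols = len(img), len(img[0])
    return [[img[min(max(top + r, 0), rows - 1)][min(max(left + c, 0), cols - 1)]
             for c in range(w)] for r in range(h)]


def downsample(img, size):
    """Nearest-neighbour resample to a size-by-size grid (assumed method)."""
    rows, cols = len(img), len(img[0])
    return [[img[r * rows // size][c * cols // size] for c in range(size)]
            for r in range(size)]


def edge_map(img):
    """Gradient magnitude |dx| + |dy| from central differences (grayscale)."""
    n_r, n_c = len(img), len(img[0])
    return [[abs(img[r][min(c + 1, n_c - 1)] - img[r][max(c - 1, 0)])
             + abs(img[min(r + 1, n_r - 1)][c] - img[max(r - 1, 0)][c])
             for c in range(n_c)] for r in range(n_r)]


def sad(a, b):
    """Sum of absolute differences; 0 means the two maps agree exactly."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def make_template(frame, roi, size):
    """Down-sample the prior frame's ROI, then take its edge map."""
    top, left, h, w = roi
    return edge_map(downsample(crop(frame, top, left, h, w), size))


def locate(frame, roi, template, size, pad):
    """Slide the template over an overscanned edge map; return (cost, center)."""
    top, left, h, w = roi
    cell_h, cell_w = h / size, w / size  # frame pixels per template cell
    region = crop(frame, round(top - pad * cell_h), round(left - pad * cell_w),
                  round(h + 2 * pad * cell_h), round(w + 2 * pad * cell_w))
    edges = edge_map(downsample(region, size + 2 * pad))
    n = 2 * pad + 1  # number of candidate offsets per axis
    cost = [[sad([row[dc:dc + size] for row in edges[dr:dr + size]], template)
             for dc in range(n)] for dr in range(n)]
    dr, dc = min(((r, c) for r in range(n) for c in range(n)),
                 key=lambda p: cost[p[0]][p[1]])
    center = (top + h / 2 + (dr - pad) * cell_h,
              left + w / 2 + (dc - pad) * cell_w)
    return cost, center


def select_size(frame, center, template, size, sides):
    """Test square putative ROIs of different sides, all centered at center."""
    def cost(side):
        top, left = round(center[0] - side / 2), round(center[1] - side / 2)
        window = crop(frame, top, left, side, side)
        return sad(edge_map(downsample(window, size)), template)
    return min(sides, key=cost)


def frame_with_block(top, left, side=8, dims=40):
    """Synthetic test frame: a bright square on a dark background."""
    return [[100.0 if top <= r < top + side and left <= c < left + side else 0.0
             for c in range(dims)] for r in range(dims)]


prior = frame_with_block(12, 12)
current = frame_with_block(14, 16)   # object moved 2 px down, 4 px right
roi = (8, 8, 16, 16)                 # (top, left, height, width)
template = make_template(prior, roi, size=8)
cost, center = locate(current, roi, template, size=8, pad=2)
side = select_size(current, center, template, size=8, sides=[12, 16, 20])
print(center, side)                  # prints (18.0, 20.0) 16
```

In this toy run the cost map has its minimum where the template lines up with the shifted square, and the 16-pixel candidate wins the size test because the object did not change scale. The template stays small (8 by 8) regardless of the ROI's size in the frame, which is what keeps the search cheap.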

First claim

Opening claim text (preview).

The invention claimed is:

1. An object tracking method, comprising: receiving an initial frame from a temporal sequence of frames, the initial frame having an initial region-of-interest (ROI), every ROI having a size and location; determining an initial template of the initial frame based on the initial ROI and a specified size; receiving a first frame from the temporal sequence of frames, the first frame arriving later in the temporal sequence of frames than the initial frame; identifying a first region of the first frame based on the initial ROI; finding a first plurality of first metric values based on the first region and a cost function; determining a first location of a first ROI of the first frame based on the plurality of first metric values; determining a second plurality of putative ROIs for the first frame, each putative ROI having a different size and centered at the first location; determining a second metric value for each of the putative ROIs; and selecting one of the putative ROIs as the first frame's first ROI based on the second metric values.

2. The method of claim 1, wherein the initial ROI comprises less than the entire initial frame.

3. The method of claim 2, wherein determining an initial template comprises: down-sampling the initial ROI to the specified size; and determining a color gradient of the down-sampled initial ROI.

4. The method of claim 3, further comprising determining a color descriptor of the down-sampled initial ROI.

5. The method of claim 4, wherein identifying a first region of the first frame comprises: identifying a temporary region of the first frame corresponding to the initial ROI of the initial frame; and overscanning the temporary region.

6. The method of claim 5, wherein the cost function is based on a congruence between the initial template and each n-by-n sub-region of the first frame's first region, wherein ‘n’ indicates the amount of overscan of the temporary region.

7. The method of claim 6, wherein determining a second metric value for each of the putative ROIs comprises, for each putative ROI: selecting a region centered about the first location, the region having a size; converting the region to the specified size; finding an edge map of the converted region; and finding a value indicative of the congruence between the initial template and the edge map of the converted region.

8. The method of 1, further comprising: determining a first edge map of the first frame based on the first ROI; combining the first edge map with a plurality of other edge maps to generate an updated template, wherein each of the plurality of other edge maps corresponds to an ROI of a different frame from the temporal sequence of frames, each of the plurality of other frames arriving earlier in the temporal sequence of frames than the first frame; and using the updated template as the initial template when evaluating a next frame from the temporal sequence of frames, the next frame arriving later in the temporal sequence of frames than the first frame.

9. An object tracking digital image capture unit, comprising: an image sensor; a lens system configured to focus light from a scene onto the image sensor; a memory communicatively coupled to the image sensor and configured to store multiple images from the image sensor; and one or more processors coupled to the lens system and the memory, the one or more processors configured for: receiving an initial frame from a temporal sequence of frames, the initial frame having an initial region-of-interest (ROI), every ROI having a size and location; determining an initial template of the initial frame based on the initial ROI and a specified size; receiving a first frame from the temporal sequence of frames, the first frame arriving later in the temporal sequence of frames than the initial frame; identifying a first region of the first frame based on the initial ROI; finding a first plurality of first metric values based on the first region and a cost function; determining a first location of a first ROI of the first frame based on the plurality of first metric values; determining a second plurality of putative ROIs for the first frame, each putative ROI having a different size and centered at the first location; determining a second metric value for each of the putative ROIs; and selecting one of the putative ROIs as the first frame's first ROI based on the second metric values.

10. The digital image capture unit of claim 9, wherein the initial ROI comprises less than the entire initial frame.

11. The digital image capture unit of claim 10, wherein determining an initial template comprises: down-sampling the initial ROI to the specified size; and determining a color gradient of the down-sampled initial ROI.

12. The digital image capture unit of claim 11, wherein the one or more processors are further configured for determining a color descriptor of the down-sampled initial ROI.

13. The digital image capture unit of claim 12, wherein identifying a first region of the first frame comprises: identifying a temporary region of the first frame corresponding to the initial ROI of the initial frame; and overscanning the temporary region.

14. The digital image capture unit of claim 13, wherein the cost function is based on a congruence between the initial template and each n-by-n sub-region of the first frame's first region, wherein ‘n’ indicates the amount of overscan of the temporary region.

15. The digital image capture unit of claim 14, wherein determining a second metric value for each of the putative ROIs comprises, for each putative ROI: selecting a region centered about the first location, the region having a size; converting the region to the specified size; finding an edge map of the converted region; and finding a value indicative of the congruence between the initial template and the edge map of the converted region.

16. The digital image capture unit of 9, wherein the one or more processors are further configured for: determining a first edge map of the first frame based on the first ROI; combining the first edge map with a plurality of other edge maps to generate an updated template, wherein each of the plurality of other edge maps corresponds to an ROI of a different frame from the temporal sequence of frames, each of the plurality of other frames arriving earlier in the temporal sequence of frames than the first frame; and using the updated template as the initial template when evaluating a next frame from the temporal sequence of frames, the next frame arriving later in the temporal sequence of frames than the first frame.

17. A non-transitory program storage device comprising instructions stored thereon to cause one or more processors to: receive an initial frame from a temporal sequence of frames, the initial frame having an initial region-of-interest (ROI), every ROI having a size and location; determine an initial template of the initial frame based on the initial ROI and a specified size; receive a first frame from the temporal sequence of frames, the first frame arriving later in the temporal sequence of frames than the initial frame; identify a first region of the first frame based on the initial ROI; find a first plurality of first metric values based on the first region and a cost function; determine a first location of a first ROI of the first frame based on the plurality of first metric values; determine a second plurality of putative ROIs for the first frame, each putative ROI having a different
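
Claims 8 and 16 recite combining the current frame's edge map with edge maps from earlier frames to produce an updated template, without fixing how the maps are combined. One possible reading, sketched here, is an element-wise mean over a short history. The TemplateHistory class, the max_frames parameter, and the averaging rule are all assumptions made for illustration.

```python
from collections import deque


class TemplateHistory:
    """Keeps the most recent edge maps and blends them into one template."""

    def __init__(self, max_frames=4):
        # Oldest maps fall off automatically once max_frames is reached.
        self.maps = deque(maxlen=max_frames)

    def update(self, edge_map):
        """Add the newest edge map and return the element-wise mean template."""
        self.maps.append([row[:] for row in edge_map])
        n = len(self.maps)
        rows, cols = len(edge_map), len(edge_map[0])
        return [[sum(m[r][c] for m in self.maps) / n for c in range(cols)]
                for r in range(rows)]


history = TemplateHistory(max_frames=2)
t1 = history.update([[0.0, 4.0], [8.0, 0.0]])  # only one map so far
t2 = history.update([[2.0, 4.0], [0.0, 0.0]])  # mean of maps 1 and 2
t3 = history.update([[4.0, 0.0], [0.0, 4.0]])  # map 1 dropped; mean of 2 and 3
print(t3)
```

The template returned by each update would then stand in for the initial template when the next frame is evaluated, so gradual changes in the object's appearance are absorbed while a single noisy frame has limited influence.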

Assignees

Apple Inc

Inventors

Classifications

  • G06T7/248 · Primary

    involving reference images or patches · CPC title

  • Determination of colour characteristics · CPC title

  • Scaling of whole images or parts thereof, e.g. expanding or contracting · CPC title

  • Color image · CPC title

  • Edge detection · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9773192B2 cover?
Techniques to identify and track a pre-identified region-of-interest (ROI) through a temporal sequence of frames/images are described. In general, a down-sampled color gradient (edge map) of an arbitrary sized ROI from a prior frame may be used to generate a small template.
Who is the assignee on this patent?
Apple Inc
What technology area does this patent fall under?
Primary CPC classification G06T7/248. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 26, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).