What technology area does this patent fall under?

Primary CPC classification G11B27/11. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jun 11, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Robust tracking of objects in videos

US10319412B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10319412-B2
Application number	US-201615353186-A
Country	US
Kind code	B2
Filing date	Nov 16, 2016
Priority date	Nov 16, 2016
Publication date	Jun 11, 2019
Grant date	Jun 11, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure is directed toward systems and methods for tracking objects in videos. For example, one or more embodiments described herein utilize various tracking methods in combination with an image search index made up of still video frames indexed from a video. One or more embodiments described herein utilize a backward and forward tracking method that is anchored by one or more key frames in order to accurately track an object through the frames of a video, even when the video is long and may include challenging conditions.
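
The backward and forward tracking described in the abstract can be pictured as a frame-visiting order that starts at a key frame and moves outward in both directions. The following is a minimal Python sketch under stated assumptions, not the patented implementation: the function names `tracking_order` and `nearest_key_frame`, and the rule of anchoring each frame to its closest key frame, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def tracking_order(num_frames: int, key_frame: int) -> Tuple[List[int], List[int]]:
    """Return the frame indices visited backward and forward from a key frame.

    Assumption: frames are indexed 0..num_frames-1 in time order.
    """
    if not 0 <= key_frame < num_frames:
        raise ValueError("key_frame must be a valid frame index")
    backward = list(range(key_frame - 1, -1, -1))     # key_frame-1 down to 0
    forward = list(range(key_frame + 1, num_frames))  # key_frame+1 up to the end
    return backward, forward


def nearest_key_frame(frame: int, key_frames: List[int]) -> int:
    """Pick the key frame closest in time to `frame` (hypothetical anchoring rule)."""
    if not key_frames:
        raise ValueError("at least one key frame is required")
    return min(key_frames, key=lambda k: abs(k - frame))


backward, forward = tracking_order(num_frames=6, key_frame=2)
print(backward)  # [1, 0]
print(forward)   # [3, 4, 5]
print(nearest_key_frame(7, [2, 10]))  # 10
```

Visiting frames outward from the key frame means each step starts from a nearby frame in which the object has already been located, which is one plausible reading of "anchored by one or more key frames".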

First claim

Opening claim text (preview).

What is claimed is:

1. In a digital environment for tracking objects in videos, a method of identifying objects in videos comprising: receiving a video; extracting a plurality of video frames from the video; generating an image search index from the plurality of video frames; receiving an indication of a query object within one or more key frames of the plurality of video frames; and for each of the plurality of video frames in the image search index: determining a similarity score between a key frame and the video frame based on a search area in the key frame, wherein a size of the search area is determined based on a distance between the key frame and the video frame, and generating a voting map that utilizes the determined similarity score to localize the query object in the video frame.

2. The method as recited in claim 1, further comprising identifying one or more auxiliary key frames.

3. The method as recited in claim 2, wherein identifying one or more auxiliary key frames comprises: selecting a candidate video frame from the image search index; determining, based on a spatially-constrained area within the key frame, a similarity between the candidate video frame and each of the one or more key frames; and determining that the similarity between the candidate video frame and a key frame of the one or more key frames is greater than a predetermined threshold; re-categorizing, based on the similarity being greater than the predetermined threshold, the candidate video frame as an auxiliary key frame.

4. The method as recited in claim 3, further comprising: determining a first candidate query object for the video frame based on the key frame; determining a second candidate query object for the video frame based on the auxiliary key frame; weighting a similarity score for the first candidate query object using a time decay function; weighting a similarity score for the second candidate query object using the time decay function; and selecting as the query object one of the first candidate query object or the second candidate query object that has the maximum weighted similarity score.

5. The method as recited in claim 1, wherein generating the image search index comprises: identifying one or more video frames in the received video; extracting one or more features from each of the one or more video frames.

6. The method as recited in claim 1, further comprising redacting the query object from the video frames in which the query object is identified.

7. The method as recited in claim 6, wherein redacting the query object from the video frames in which the query object is identified comprises: identifying, within each of the video frames in which the query object is identified, an area around the localized query object; changing a color of pixels within the area around the localized query object.

8. The method as recited in claim 1, further comprising determining a location of the search area based on a location of the query object in a keyframe.

9. The method as recited in claim 1, further comprising: sequentially determining similarity scores working backward and forward from the key frame, adjusting the determined similarity scores using penalty variables, wherein a penalty variable for a given similarity score is based on the given similarity score, a penalty variable for a previous similarity score, and a lower threshold.

10. A system for tracking objects in videos comprising: a memory comprising a video; a computing device, storing instructions thereon that, when executed by the computing device, cause the system to: extract a plurality of video frames from the video; generate an image search index from the plurality of video frames by extracting one or more features from each of the video frames; receive an indication of a query object within one or more key frames of the plurality of video frames and a location of the query object with the one or more key frames; and for each of the plurality of video frames: determine a similarity score between a key frame and a video frame based on a search area in the key frame by comparing features of the query object in the key frame to features of the video frame within the search area, wherein a size of the search area is determined based on a distance between the key frame and the video frame, and generate a voting map that utilizes the determined similarity score to localize the query object in the video frame.

11. The system as recited in claim 10, wherein the instructions, when executed by the computing device, further cause the system to: track backward from the key frame to identify the query object by performing acts comprising: identifying a bounding box around the query object in the key frame, identifying a preceding candidate video frame that has an earlier time stamp than the key frame, identifying a search area for the preceding candidate video frame based on a location of the bounding box and a distance from the key frame to the preceding candidate video frame, and track forward from the key frame to identify the query object by performing acts comprising: identifying a subsequent candidate video frame that has a later time stamp than the key frame, identifying a search area for the subsequent candidate video frame based on a location of the bounding box and a distance from the key frame to the subsequent candidate video frame.

12. The system as recited in claim 11, wherein the instructions, when executed by the computing device, further cause the system to redact the query object from the video frames in which the query object is identified; and generate a redacted video by merging the video frames in which the query object has been redacted with a remainder of the plurality of video frames based on time stamps associated with each video frame.

13. A non-transitory computer-readable medium storing instructions thereon that, when executed by at least one processor, cause a computer system to: extract a plurality of video frames from the video; generate an image search index from the plurality of video frames by extracting one or more features from each of the video frames; receive an indication of a query object within one or more key frames of the plurality of video frames and a location of the query object with the one or more key frames; and for each of the plurality of video frames: determine a similarity score between a key frame and a video frame based on a search area in the key frame, wherein a size of the search area is determined based on a distance between the key frame and the video frame, and generate a voting map that utilizes the determined similarity score to localize the query object in the video frame.

14. The non-transitory computer-readable medium as recited in claim 13, further storing instructions thereon that, when executed by the at least one processor, cause the system to identify one or more auxiliary key frames.

15. The non-transitory computer-readable medium as recited in claim 14, wherein identifying one or more auxiliary key frames comprises: selecting a candidate video frame from the image search index; determining, based on a spatially-constrained area within the key frame, a similarity between the candidate video frame and each of the one or more key frames; and determining that the similarity between the candidate video frame and a key frame of the one or more key frames is greater than a predetermined threshold; and re-categorizing, based on the similarity being greater than the predetermined threshold, the candidate video frame as an au
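
Two ideas in the claims lend themselves to a short illustration: a search area whose size depends on the distance between the key frame and the video frame (claims 1, 10, 13), and choosing between candidates by weighting their similarity scores with a time decay function (claim 4). The claims do not state the actual formulas, so the Python sketch below rests on assumptions: linear growth of the search area by `growth_px` pixels per frame, an exponential decay with a `half_life` in frames, and the names `Box`, `search_area`, `time_decay`, and `select_candidate` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Box:
    x0: int
    y0: int
    x1: int
    y1: int


def search_area(box: Box, frame_distance: int, frame_w: int, frame_h: int,
                growth_px: int = 4) -> Box:
    """Expand the key-frame bounding box by a margin that grows with frame distance.

    Assumption: the margin is linear in |frame_distance| and clamped to the frame.
    """
    margin = growth_px * abs(frame_distance)
    return Box(max(0, box.x0 - margin), max(0, box.y0 - margin),
               min(frame_w, box.x1 + margin), min(frame_h, box.y1 + margin))


def time_decay(frame_distance: int, half_life: float = 30.0) -> float:
    """Assumed decay: the weight halves every `half_life` frames from the anchor."""
    return 0.5 ** (abs(frame_distance) / half_life)


def select_candidate(candidates: List[Tuple[str, float, int]]) -> str:
    """Return the label of the candidate with the maximum decay-weighted score.

    Each candidate is (label, raw similarity, distance to its anchoring key frame).
    """
    return max(candidates, key=lambda c: c[1] * time_decay(c[2]))[0]


box = Box(100, 100, 150, 150)
print(search_area(box, 5, 640, 480))   # Box(x0=80, y0=80, x1=170, y1=170)
print(search_area(box, 30, 640, 480))  # Box(x0=0, y0=0, x1=270, y1=270)
print(select_candidate([("key", 0.9, 60), ("auxiliary", 0.7, 10)]))  # auxiliary
```

In the last call the candidate from the distant key frame has the higher raw score, but after weighting (0.9 × 0.25 versus roughly 0.7 × 0.79) the candidate anchored to the nearby auxiliary key frame is selected.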

Assignees

Adobe Inc

Inventors

Classifications

G06V10/255
Detecting or recognising potential candidate objects based on visual cues, e.g. shapes · CPC title
G06K9/00744
Physics · mapped topic
G11B27/11 · Primary
by using information not detectable on the record carrier · CPC title
G06T2207/10016
Video; Image sequence · CPC title
G06K9/3241
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 62108672

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10319412B2 cover?: The present disclosure is directed toward systems and methods for tracking objects in videos. For example, one or more embodiments described herein utilize various tracking methods in combination with an image search index made up of still video frames indexed from a video. One or more embodiments described herein utilize a backward and forward tracking method that is anchored by one or more ke…
Who is the assignee on this patent?: Adobe Inc