Method and terminal device for retargeting images
US-2015371367-A1 · Dec 24, 2015 · US
US2016358036A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016358036-A1 |
| Application number | US-201615240838-A |
| Country | US |
| Kind code | A1 |
| Filing date | Aug 18, 2016 |
| Priority date | May 18, 2011 |
| Publication date | Dec 8, 2016 |
| Grant date | — |
Official abstract text for this publication.
Techniques describe submitting a video clip as a query by a user. A process retrieves images and information associated with the images in response to the query. The process decomposes the video clip into a sequence of frames to extract the features in a frame and to quantize the extracted features into descriptive words. The process further tracks the extracted features as points in the frame, a first set of points to correspond to a second set of points in consecutive frames to construct a sequence of points. Then the process identifies the points that satisfy criteria of being stable points and being centrally located in the frame to represent the video clip as a bag of descriptive words for searching for images and information related to the video clip.
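The selection step in the abstract (keep only tracked points that are stable and centrally located, then represent the clip as a bag of descriptive words) can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Track` structure, the `min_frames` and `central_fraction` thresholds, and the specific stability and centrality tests are all assumptions, since the abstract does not define them.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Track:
    """One feature followed across consecutive frames (hypothetical structure)."""
    word_id: int   # index of the quantized descriptor, i.e. the "descriptive word"
    points: list   # [(x, y), ...], one position per consecutive frame


def is_stable(track, min_frames):
    """Assumed stability test: the feature survives at least min_frames frames."""
    return len(track.points) >= min_frames


def is_central(track, width, height, central_fraction):
    """Assumed centrality test: the mean position lies inside a centered box
    covering central_fraction of the frame in each dimension."""
    mean_x = sum(x for x, _ in track.points) / len(track.points)
    mean_y = sum(y for _, y in track.points) / len(track.points)
    return (abs(mean_x - width / 2) <= width * central_fraction / 2
            and abs(mean_y - height / 2) <= height * central_fraction / 2)


def clip_bag_of_words(tracks, width, height, min_frames=3, central_fraction=0.5):
    """Count descriptive words over tracks that are both stable and central.

    min_frames is assumed to be at least 1, so empty tracks are rejected by
    the stability test before the centrality test divides by the track length.
    """
    return Counter(
        t.word_id
        for t in tracks
        if is_stable(t, min_frames)
        and is_central(t, width, height, central_fraction)
    )
```

Short-lived tracks and tracks near the frame border are dropped before counting, so the resulting histogram is dominated by features that plausibly belong to the object of interest rather than to the background.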
Opening claim text (preview).
1.-20. (canceled)
21. A system comprising: one or more processors; memory communicatively coupled to the one or more processors; an image-retrieval model module, stored in the memory and communicatively coupled to the one or more processors, configured to: generate a representation of a video clip comprising an object of interest; and retrieve images from a database based at least in part on the representation of the video clip; and a vector space model module, stored in the memory and communicatively coupled to the one or more processors, configured to: analyze the images from the database and the video clip to determine one or more similarities between the images and the video clip; and calculate a similarity score for each image of the images based on the one or more similarities between the images from the database and the video clip to identify candidate search images from the images.
22. The system of claim 21, wherein the representation of the video clip comprises one or more descriptive words and each image of the images is associated with at least one descriptive word of the one or more descriptive words.
23. The system of claim 21, wherein the similarity scores are based at least in part on the representation of the video clip and element-wise multiplication of vectors, each vector of the vectors representing an inverted image frequency of a descriptive word used to describe the images and the video clip.
24. The system of claim 21, further comprising an image-retrieval application module, stored in the memory and communicatively coupled to the one or more processors, configured to: receive the video clip submitted as a query; and extract features from one or more frames of the video clip.
25. The system of claim 24, the image-retrieval application module further configured to: construct points of the features in consecutive frames of the one or more frames; identify center points that are located in a center of a frame of the consecutive frames; determine that an amount of the center points is less than a threshold number; and in response to determining that the amount of the center points is less than the threshold number, filtering out the center points.
26. The system of claim 24, the image-retrieval application module further configured to: construct points of the features in consecutive frames of the one or more frames; identify center points that are located in a center of a frame of the consecutive frames; determine that an amount of the center points is greater than a threshold number; and in response to determining that the amount of the center points is greater than the threshold number, creating the representation of the object of interest based at least in part on the center points.
27. The system of claim 26, the image-retrieval application module further configured to construct the vector space module based at least in part on the representation of the object of interest.
28. A method comprising: generating, by an image-retrieval model, a representation of a first frame of a video clip comprising an object of interest; retrieving, by the image-retrieval model, images from a database based at least in part on the representation of the first frame of the video clip; comparing, by a vector space model, the images from the database and the representation of the first frame to identify similarities between the images and the representation of the first frame; calculating, by the vector space model, a first similarity score for each image of the images based on the similarities between the images from the database and the representation of the first frame to identify candidate search images from the images.
29. The method of claim 28, further comprising: ranking, by an image-retrieval application module, the candidate search images from the images based at least in part on the respective first similarity scores.
30. The method of claim 29, further comprising: generating, by the image-retrieval model, a representation of a second frame of the video clip comprising the object of interest; comparing, by the vector space model, the candidate search images and the representation of the second frame to identify second similarities between the candidate search images and the representation of the second frame; calculating, by the vector space model, a second similarity score for each of the candidate search images based on the second similarities; and re-ranking, by the image-retrieval application module, at least one of the candidate search images based at least in part on the second similarity scores.
31. The method of claim 28, wherein: the representation of the first frame of the video clip comprises one or more descriptive words; retrieving the images from the database is based at least in part on the one or more descriptive words; and further comprising: calculating gradients of functions of the candidate search images and the first frame of the video clip; combining the first similarity scores with an average of the gradients of the functions; and ranking the candidate search images based at least in part on the combining the first similarity scores with the average of the gradients of the functions.
32. The method of claim 28, further comprising: receiving, by an image-retrieval application module, the video clip submitted as a query; and extracting features from consecutive frames of the video clip.
33. The method of claim 32, further comprising: constructing points representing the features in the consecutive frames; identifying center points of the points that are located in a center of a frame of the consecutive frames; determining that an amount of the center points is less than a threshold number; and in response to determining that the amount of the center points is less than the threshold number, filtering out the center points.
34. The method of claim 32, further comprising: constructing points representing the features in the consecutive frames; identifying center points of the points that are located in a center of a frame of the consecutive frames; determining that an amount of the center points is greater than a threshold number; and in response to determining that the amount of the center points is greater than the threshold number, creating the representation of the object of interest based at least in part on the center points.
35. The method of claim 28, wherein the first similarity scores are based at least in part on the representation of the first frame of the video clip and element-wise multiplication of vectors, each vector of the vectors representing an inverted image frequency of a descriptive word used to describe the images and the video clip.
36. One or more computer-readable storage media storing instructions that, when executed by a processor, perform acts comprising: generating, by an image-retrieval model, a representation of a first frame of a video clip comprising an object of interest; retrieving, by the image-retrieval model, images from a database based at least in part on the representation of the first frame of the video clip; comparing, by a vector space model, the images from the database and the representation of the first frame to identify similarities between the images and the representation of the first frame; calculating, by the vector space model, a first similarity score for each image of the images based on the similarities between the images from the database and the representation of th
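The claims describe similarity scores built from descriptive words weighted by an inverted image frequency and combined by element-wise multiplication, followed by ranking of candidate images. The claim preview does not give the exact formula, so the sketch below substitutes a conventional vector space model: a logarithmic inverse image frequency and a cosine-normalized dot product. The function names and the choice of logarithm and normalization are assumptions made for illustration.

```python
import math
from collections import Counter


def inverse_image_frequency(image_bags):
    """Assumed weighting: log(N / n_w), where N is the number of database
    images and n_w is the number of images containing descriptive word w."""
    n_images = len(image_bags)
    image_count = Counter()
    for bag in image_bags:
        image_count.update(set(bag))  # count each word once per image
    return {w: math.log(n_images / n) for w, n in image_count.items()}


def weighted(bag, iif):
    """Element-wise product of word counts and inverse image frequencies."""
    return {w: count * iif.get(w, 0.0) for w, count in bag.items()}


def similarity(query_bag, image_bag, iif):
    """Cosine similarity between the weighted query and image vectors."""
    q = weighted(query_bag, iif)
    d = weighted(image_bag, iif)
    dot = sum(q[w] * d[w] for w in q.keys() & d.keys())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0


def rank(query_bag, image_bags, iif):
    """Return (image_index, score) pairs, highest score first."""
    scored = [(i, similarity(query_bag, bag, iif))
              for i, bag in enumerate(image_bags)]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

Under these assumptions, the re-ranking recited in claim 30 would amount to calling `rank` again on the candidate images with the bag of words from a second frame; how the first and second scores are combined is not stated in the claim preview.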
Salient features, e.g. scale invariant feature transforms [SIFT] · CPC title
by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation · CPC title
involving foreground-background segmentation · CPC title
using low-level visual features of the video content · CPC title
using original textual content or text extracted from visual content or transcript of audio data · CPC title