US10614366B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10614366-B1 |
| Application number | US-201615061641-A |
| Country | US |
| Kind code | B1 |
| Filing date | Mar 4, 2016 |
| Priority date | Jan 31, 2006 |
| Publication date | Apr 7, 2020 |
| Grant date | Apr 7, 2020 |
Official abstract text for this publication.
Systems and Methods for multi-modal or multimedia image retrieval are provided. Automatic image annotation is achieved based on a probabilistic semantic model in which visual features and textual words are connected via a hidden layer comprising the semantic concepts to be discovered, to explicitly exploit the synergy between the two modalities. The association of visual features and textual words is determined in a Bayesian framework to provide confidence of the association. A hidden concept layer which connects the visual feature(s) and the words is discovered by fitting a generative model to the training image and annotation words. An Expectation-Maximization (EM) based iterative learning procedure determines the conditional probabilities of the visual features and the textual words given a hidden concept class. Based on the discovered hidden concept layer and the corresponding conditional probabilities, the image annotation and the text-to-image retrieval are performed using the Bayesian framework.
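The annotation and retrieval steps described in the abstract can be sketched as two applications of Bayes' rule over the hidden concepts. The abstract does not spell out the formulas, so the following is a minimal illustration rather than the patented procedure: it assumes annotation scores of the form P(w | f) = sum_k P(w | z_k) P(z_k | f) and retrieval scores of the form sum_k p(f | z_k) P(z_k | w). All function names and the toy parameters are hypothetical, and the Gaussians use diagonal covariance as a simplification. In a real system the parameters would come from the EM procedure, not be set by hand.

```python
import math

def gaussian_pdf(f, mean, var):
    """Density of a Gaussian with diagonal covariance (a simplifying assumption)."""
    log_p = 0.0
    for x, m, v in zip(f, mean, var):
        log_p += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
    return math.exp(log_p)

def concept_posterior(f, priors, means, variances):
    """P(z_k | f) by Bayes' rule: proportional to p(f | z_k) * P(z_k)."""
    joint = [p * gaussian_pdf(f, m, v) for p, m, v in zip(priors, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]

def annotate(f, priors, means, variances, word_given_concept):
    """P(w_j | f) = sum_k P(w_j | z_k) * P(z_k | f); returns one score per word."""
    post = concept_posterior(f, priors, means, variances)
    n_words = len(word_given_concept[0])
    return [sum(post[k] * word_given_concept[k][j] for k in range(len(post)))
            for j in range(n_words)]

def retrieval_scores(word_idx, images, priors, means, variances, word_given_concept):
    """Score each image f_i for a query word w by sum_k p(f_i | z_k) * P(z_k | w)."""
    joint = [word_given_concept[k][word_idx] * priors[k] for k in range(len(priors))]
    total = sum(joint)
    concept_given_word = [j / total for j in joint]
    return [sum(gaussian_pdf(f, m, v) * c
                for m, v, c in zip(means, variances, concept_given_word))
            for f in images]

# Toy, hand-picked parameters (hypothetical; not taken from the patent).
vocab = ["sky", "grass"]
priors = [0.5, 0.5]                      # P(z_k)
means = [[0.0, 0.0], [4.0, 4.0]]         # Gaussian mean per concept
variances = [[1.0, 1.0], [1.0, 1.0]]     # diagonal covariance per concept
word_given_concept = [[0.9, 0.1],        # P(w | z_1)
                      [0.2, 0.8]]        # P(w | z_2)

# Image annotation: pick the most probable word for one feature vector.
word_scores = annotate([0.2, -0.1], priors, means, variances, word_given_concept)
best_word = vocab[word_scores.index(max(word_scores))]

# Text-to-image retrieval: rank two images for the query word "grass".
images = [[0.1, 0.0], [3.9, 4.2]]
image_scores = retrieval_scores(vocab.index("grass"), images, priors, means,
                                variances, word_given_concept)
best_image = image_scores.index(max(image_scores))
```

Because each row of `word_given_concept` and the concept posterior both sum to one, the annotation scores form a proper distribution over the vocabulary, which is one way to read the abstract's "confidence of the association".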
Opening claim text (preview).
The invention claimed is:

1. A method of extracting implicit concepts within a set of multimedia works, comprising:
(a) receiving a plurality of portions of the set of multimedia works, each portion comprising semantic features and non-semantic features;
(b) probabilistically determining, with at least one automated data processor, a set of semantic concepts inherent in the respective non-semantic features of the received portions, based on at least a Bayesian model, comprising a hidden concept layer formulated based on at least one joint probability distribution which models a probability that a respective semantic concept annotates a respective non-semantic feature that connects a semantic feature layer and a non-semantic feature layer, wherein the hidden concept layer is discovered by fitting a generative model to a training set comprising non-semantic features and annotation semantic features, the conditional probabilities of the non-semantic features and the annotation semantic features given a hidden concept class being determined based on an Expectation-Maximization (EM) based iterative learning procedure, the non-semantic features being generated from a plurality of respective Gaussian distributions, respectively corresponding to a semantic concept, each non-semantic feature having a conditional probability density function selectively dependent on a covariance matrix of non-semantic features belonging to the respective semantic concept;
(c) determining, with the at least one automated data processor, a semantic concept vector for a respective multimedia work, dependent on at least the determined semantic concepts inherent in the respective non-semantic features of the received portions; and
(d) at least one of storing and communicating information representing the determined semantic concept vector.

2. The method according to claim 1, further comprising receiving a word as an input, and outputting at least one image or an identifier of at least one image corresponding to the word.

3. The method according to claim 1, further comprising receiving an image as an input, and outputting at least one word or an identifier of at least one word corresponding to the image.

4. The method according to claim 1, wherein said probabilistically determining comprises employing at least one conditional probability represented in the Bayesian model for associating words with an image, comprising a set of parameters stored in a memory representing the hidden concept layer which connects a non-semantic feature layer comprising a visual feature layer and a semantic feature layer comprising a word layer.

5. The method according to claim 4, further comprising discovering the hidden concept layer by fitting the generative model to a training set comprising image and annotation words, wherein the conditional probabilities of the visual features and the annotation words given the hidden concept class are determined based on the Expectation-Maximization (EM) based iterative learning procedure.

6. The method according to claim 5, wherein the Bayesian model comprises a semantic Bayesian framework representing an association of visual content with a plurality of semantic concepts, comprising at least one hidden layer formulated based on at least one joint probability distribution which models a probability that a word belonging to a respective semantic concept is an annotation word of respective visual content; wherein a set of visual content is mapped to the semantic Bayesian framework dependent on semantic concepts represented in respective visual content, using at least one automated processor which automatically determines a set of annotation words associated with the respective visual content; at least one implicit semantic concept is automatically extracted from a received query seeking elements of the set of visual content corresponding to at least one implicit semantic concept, using at least one automated processor; elements of the mapped set of visual content corresponding to the at least one extracted implicit semantic concept are automatically determined, using at least one automated processor; and the corresponding visual content is ranked in accordance with at least a correspondence to the at least one extracted implicit semantic concept.

7. The method according to claim 5, wherein the hidden concept layer, which connects a visual feature layer and a word layer, is discovered by fitting a generative model to a training set comprising images and annotation words.

8. The method according to claim 7, wherein $f_i$, $i \in [1, N]$ denotes a visual feature vector of images in a training database, where $N$ is the size of the database, $w_j$, $j \in [1, M]$ denotes the distinct textual words in a training annotation word set, where $M$ is the size of annotation vocabulary in the training database, the visual features of images in the database, $f_i = [f_i^1, f_i^2, \ldots, f_i^L]$, $i \in [1, N]$ are known i.i.d. samples from an unknown distribution, having a visual feature dimension $L$, the specific visual feature annotation word pairs $(f_i, w_j)$, $i \in [1, N]$, $j \in [1, M]$ are known i.i.d. samples from an unknown distribution, associated with an unobserved semantic concept variable $z \in Z = \{z_1, \ldots, z_k\}$, in which each observation of one visual feature $f \in F = \{f_1, f_2, \ldots, f_N\}$ belongs to one or more concept classes $z_k$ and each observation of one word $w \in V = \{w_1, w_2, \ldots, w_M\}$ in one image $f_i$ belongs to one concept class, in which the observation pairs or random variables $(f_i, w_j)$ are assumed to be generated independently and to be conditionally independent given the respective hidden concept $z_k$, such that $P(f_i, w_j \mid z_k) = p_{\Im}(f_i \mid z_k)\,P_V(w_j \mid z_k)$; the visual feature and word distribution is treated as a randomized data generation process, wherein a probability of a concept is represented as $P_z(z_k)$; a visual feature is selected $f_i \in F$ with probability $P_{\Im}(f_i \mid z_k)$; and a textual word is selected $w_j \in V$ with probability $P_V(w_j \mid z_k)$, from which an observed pair $(f_i, w_j)$ is obtained, such that a joint probability model is expressed as follows:

$$P(f_i, w_j) = P(w_j)\,P(f_i \mid w_j)$$
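The randomized data generation process in claim 8 (choose a concept, then a visual feature, then a word) can be simulated directly. The sketch below is illustrative only: the names `sample_pair` and `joint_density` and all parameter values are hypothetical, and the Gaussians use diagonal covariance for brevity, whereas claim 1 refers to a covariance matrix. The `joint_density` function marginalizes the hidden concept out of the stated factorization, P(f, w) = sum_k P_z(z_k) p(f | z_k) P_V(w | z_k). That identity follows from the conditional independence assumption; it is not a form shown in the claim text above.

```python
import math
import random

def gaussian_pdf(f, mean, var):
    """Density of a Gaussian with diagonal covariance (a simplifying assumption)."""
    log_p = 0.0
    for x, m, v in zip(f, mean, var):
        log_p += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
    return math.exp(log_p)

def sample_pair(rng, priors, means, variances, word_given_concept):
    """Draw one (feature vector, word index, concept index) triple."""
    # Step 1: choose a hidden concept z_k with probability P_z(z_k).
    k = rng.choices(range(len(priors)), weights=priors)[0]
    # Step 2: draw a visual feature from that concept's Gaussian.
    f = [rng.gauss(m, math.sqrt(v)) for m, v in zip(means[k], variances[k])]
    # Step 3: draw a word with probability P_V(w_j | z_k).
    words = range(len(word_given_concept[k]))
    j = rng.choices(words, weights=word_given_concept[k])[0]
    return f, j, k

def joint_density(f, j, priors, means, variances, word_given_concept):
    """P(f, w_j) = sum_k P_z(z_k) * p(f | z_k) * P_V(w_j | z_k)."""
    return sum(p * gaussian_pdf(f, m, v) * wk[j]
               for p, m, v, wk in zip(priors, means, variances, word_given_concept))

# Toy, hand-picked parameters (hypothetical; not taken from the patent).
priors = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
word_given_concept = [[0.9, 0.1],
                      [0.2, 0.8]]

rng = random.Random(0)  # fixed seed so repeated runs draw the same pairs
feature, word_idx, concept_idx = sample_pair(rng, priors, means, variances,
                                             word_given_concept)
density = joint_density(feature, word_idx, priors, means, variances,
                        word_given_concept)
```

In training, only the pair (feature, word) would be observed; the concept index returned here is exactly the hidden variable that the EM procedure has to infer.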
Knowledge representation; Symbolic representation · CPC title
using colour · CPC title
Physics · mapped topic
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Bayesian classification · CPC title