Text to image translation

US9678992B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9678992-B2
Application number: US-201113110282-A
Country: US
Kind code: B2
Filing date: May 18, 2011
Priority date: May 18, 2011
Publication date: Jun 13, 2017
Grant date: Jun 13, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are described for online real time text to image translation suitable for virtually any submitted query. Semantic classes and associated analogous items for each of the semantic classes are determined for the submitted query. One or more requests are formulated that are associated with analogous items. The requests are used to obtain web based images and associated surrounding text. The web based images are used to obtain associated near-duplicate images. The surrounding text of images is analyzed to create high-quality text associated with each semantic class of the submitted query. One or more query dependent classifiers are trained online in real time to remove noisy images. A scoring function is used to score the images. The images with the highest score are returned as a query response.
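The final scoring step in the abstract can be illustrated with a minimal Python sketch. The abstract states only that a scoring function is used and that the highest-scoring images are returned; claim 8 adds that ranking uses a relevance score and a confidence score proportional to the number of nearest neighbors. The linear blend, the `alpha` weight, the normalization by the largest neighbor count, and every name below (`Candidate`, `score`, `top_images`) are assumptions made for illustration, not the patent's formula.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One image in a denoised set (all field names are hypothetical)."""
    url: str
    relevance: float      # assumed similarity to the semantic class, in [0, 1]
    neighbor_count: int   # number of near-duplicate (nearest-neighbor) images


def score(candidate, max_neighbors, alpha=0.5):
    """Blend relevance with a confidence term proportional to neighbor count.

    The linear blend and the alpha weight are assumptions of this sketch.
    """
    confidence = candidate.neighbor_count / max_neighbors if max_neighbors else 0.0
    return alpha * candidate.relevance + (1 - alpha) * confidence


def top_images(candidates, k=1, alpha=0.5):
    """Return the k highest-scoring candidates, best first."""
    max_neighbors = max((c.neighbor_count for c in candidates), default=0)
    ranked = sorted(candidates,
                    key=lambda c: score(c, max_neighbors, alpha),
                    reverse=True)
    return ranked[:k]


candidates = [
    Candidate("a", relevance=0.9, neighbor_count=2),
    Candidate("b", relevance=0.6, neighbor_count=10),
    Candidate("c", relevance=0.2, neighbor_count=1),
]
best = top_images(candidates, k=1)
```

With equal weights, image `b` ranks first here: its many near-duplicates outweigh the higher relevance of `a`. Raising `alpha` shifts the balance toward relevance.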

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: under control of one or more processors configured with executable instructions: receiving a search query; determining one or more semantic classes applicable to the search query; identifying multiple terms that are (i) analogous to one another and (ii) within one or more of the semantic classes; obtaining web images corresponding to at least a subset of the multiple terms; obtaining near-duplicate images of a subset of the web images associated with the one or more semantic classes; removing noisy images from the near-duplicate images to create a denoised image set; determining representative text associated with the denoised image set; based at least in part on the representative text associated with the denoised image set, selecting at least one representative image for the one or more semantic classes; training at least one query-dependent classifier based at least in part on the representative text; employing the at least one query-dependent classifier to create the denoised image set for individual ones of the semantic classes; and selecting the at least one representative image from the denoised image set as an image answer for the search query.

2. The method of claim 1, the method further comprising determining the subset of the multiple terms associated with the one or more semantic classes based at least in part on: ranking the multiple terms based at least in part on a relevance to an associated one of the semantic classes; and selecting the subset of the multiple terms based at least in part on the ranking the multiple terms.

3. The method of claim 2, wherein obtaining the web images comprises: formulating requests, individual ones of the formulated requests having a unique analogous term of the subset of the multiple terms; and submitting the formulated requests to one or more search engines to obtain the web images.

4. The method of claim 1, the method further comprising determining the subset of the web images associated with the one or more semantic classes by: ranking the web images associated with a corresponding one of the one or more semantic classes; and selecting the subset of the web images for the one or more semantic classes based on the ranking the web images.

5. The method of claim 4, the method further comprising determining a number of the near-duplicate images available for individual ones of the web images, wherein ranking the web images is based at least in part on: the number of the near-duplicate images associated with individual ones of the web images; and an analysis of text associated with individual ones of the web images.

6. The method of claim 1, the method further comprising: compiling text of the near-duplicate images associated with the one or more semantic classes into a corresponding training document; removing stop words from the training document; and removing terms having a frequency of occurrence that is less than a predetermined threshold from the training document.

7. The method of claim 6, further comprising: analyzing the text using at least one of a chi-squared scheme, a term frequency-inverse document frequency (td-idf) scheme or a term frequency scheme to weight textual features in the training document; and selecting the representative text based at least in part on the weight associated with the textual features in the training document.

8. The method of claim 1, wherein selecting the at least one representative image comprises: ranking images of the denoised image set based at least in part on a relevance score that measures a similarity of individual ones of the images with an associated semantic class and a confidence score proportional to a number of nearest neighbors associated with individual ones of the images; and selecting the at least one representative image from the denoised image set based at least in part on the ranking.

9. A system comprising: memory; one or more processors communicatively coupled to the memory; and instructions stored on the memory and when executed by the one or more processors, the instructions configure the system to: obtain semantic classes for a search query; identifying multiple terms that are (i) analogous to one another and (ii) within one or more of the semantic classes; collect web based images associated with a subset of the multiple terms; obtain near-duplicate images for a subset of the web based images; remove noisy images from the near-duplicate images to create a denoised image set; based at least in part on representative text associated with the denoised image set, select at least one representative image for individual ones of the semantic classes; train at least one query-dependent classifier or multiple one-on-one query-dependent classifiers based at least in part on representative text associated with the near-duplicate images; employ the at least one query-dependent classifier or multiple one-on-one query-dependent classifiers to create the denoised image set for individual ones of the one or more semantic classes; use a scoring function to rank images in the denoised image set based at least in part on a relevance score that measures a similarity of individual images of an associated denoised image set with its corresponding semantic class, and a confidence score proportional to a number of nearest neighbors associated with individual ones of the images of an associated denoised image set; and determine at least one representative image from the denoised image set as an image answer for the search query based at least in part on the rank of the individual images in the denoised image set.

10. The system of claim 9, wherein the instructions further configure the system to: determine the subset of the multiple terms associated with a corresponding one of the semantic classes by ranking the multiple terms based at least in part on a relevance to the corresponding one of the semantic classes; and select the subset of the multiple terms based at least in part on the ranking the multiple terms.

11. The system of claim 9, wherein the instructions further configure the system to determine a number of available near-duplicate images associated with individual ones of the web based images.

12. The system of claim 11, wherein the instructions further configure the system to rank the web based images based at least in part on the number of the available near-duplicate images associated with individual ones of the web images.

13. The system of claim 12, wherein the instructions further configure the system to determine the subset of the web based images based of the rank of the web based images.

14. The system of claim 9, wherein the instructions further configure the system to format the web based images based at least in part on one or more image features of the web based images.

15. The system of claim 14, wherein the instructions further configure the system to obtain the near-duplicate images for the subset of the web based images having the format.

16. The system of claim 9, wherein the instructions further configure the system to: determine the representative text based at least in part on an analysis of surrounding text associated with the near-duplicate images; analyze terms in the requests to identify terms that result in a semantic gap between text and image domains; and replace, modify or remove identified terms in the requests to reduce the semantic gap.

17. The system of claim 9, wherein the relevance score measures the similarit
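
Claims 6 and 7 describe compiling surrounding text into a training document, removing stop words and rare terms, and weighting the remaining terms. The following is a minimal Python sketch of those steps using the plain term frequency scheme, one of the three weighting options the claim names. The stop-word list, the tokenizer, the `min_count` threshold, and all function names are illustrative assumptions, not details from the patent.

```python
import re
from collections import Counter

# Illustrative stop-word list; the patent does not specify one.
STOP_WORDS = frozenset(
    {"a", "an", "the", "of", "and", "in", "on", "is", "to", "for", "with"}
)


def build_training_document(texts):
    """Compile the surrounding text of all images into one token list."""
    tokens = []
    for text in texts:
        tokens.extend(re.findall(r"[a-z0-9]+", text.lower()))
    return tokens


def representative_terms(texts, min_count=2, top_k=5):
    """Drop stop words and rare terms, then weight the rest by term frequency.

    min_count stands in for the claim's "predetermined threshold".
    Returns (term, weight) pairs, highest weight first.
    """
    tokens = [t for t in build_training_document(texts) if t not in STOP_WORDS]
    counts = Counter(tokens)
    kept = {term: n for term, n in counts.items() if n >= min_count}
    total = sum(kept.values())
    weights = {term: n / total for term, n in kept.items()}
    # Sort by descending weight, breaking ties alphabetically for stable output.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


surrounding_texts = [
    "A red apple on the table",
    "Apple pie with red apple slices",
    "The apple tree",
]
terms = representative_terms(surrounding_texts)
```

In this toy input only `apple` and `red` occur at least twice, so they are the only terms that survive the threshold. A chi-squared or tf-idf scheme would need term statistics across several semantic classes, which this single-document sketch does not model.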

Assignees

Inventors

Classifications

  • Physics · mapped topic

  • using metadata automatically derived from the content · CPC title

  • G06F16/58 · Primary

    Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9678992B2 cover?
Techniques are described for online real time text to image translation suitable for virtually any submitted query. Semantic classes and associated analogous items for each of the semantic classes are determined for the submitted query. One or more requests are formulated that are associated with analogous items. The requests are used to obtain web based images and associated surrounding text. …
Who is the assignee on this patent?
Wang Xin-Jing, Zhang Lei, Ma Wei-Ying, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F17/30265. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 13, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).