Using Alternate Words As an Indication of Word Sense

US2016239490A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2016239490-A1
Application number  US-201313763198-A
Country             US
Kind code           A1
Filing date         Feb 8, 2013
Priority date       Feb 8, 2013
Publication date    Aug 18, 2016
Grant date          (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for using alternate words as an indication of word sense. In one aspect, a method includes identifying a particular term. The method further includes identifying a first alternate term and a second alternate term for the particular term, and identifying a first sequence of terms that occurs in a text corpus, and includes the particular term among its terms. The method further includes determining a number of occurrences of a second sequence of terms in the text corpus. The second sequence of terms differs from the first sequence of terms only in that the first alternate term is substituted for the particular term and determining a number of occurrences of a third sequence of terms in the text corpus. The third sequence of terms differs from the first sequence of terms.
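The counting step described in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the toy query log, the function names `substitute` and `count_substituted`, and the choice to replace every occurrence of the particular term are all assumptions made for illustration.

```python
from collections import Counter


def substitute(sequence, term, alternate):
    """Return a copy of `sequence` with `term` replaced by `alternate`.

    Assumption: every occurrence of `term` is replaced.
    """
    return tuple(alternate if t == term else t for t in sequence)


def count_substituted(corpus_counts, first_sequence, term, alt_a, alt_b):
    """Count the second and third sequences (one per alternate term) in the corpus."""
    second = substitute(first_sequence, term, alt_a)
    third = substitute(first_sequence, term, alt_b)
    # A Counter returns 0 for sequences that never occur.
    return corpus_counts[second], corpus_counts[third]


# Hypothetical text corpus: a tiny query log, one query per entry.
query_log = [
    "river bank erosion",
    "river shore erosion",
    "river shore erosion",
    "bank loan rates",
    "lender loan rates",
]
corpus_counts = Counter(tuple(q.split()) for q in query_log)

first_sequence = ("river", "bank", "erosion")
counts = count_substituted(corpus_counts, first_sequence, "bank", "shore", "lender")
print(counts)  # (2, 0)
```

In this toy log, "shore" fits the context of the first sequence while "lender" never appears in it, which hints that the two alternates point at different senses of "bank".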

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: obtaining a search query; identifying a particular term from the search query; determining that the particular term is, or is potentially, a polyseme or a homograph; in response to determining that the particular term is, or is potentially, a polyseme or a homograph, identifying a first alternate term and a second alternate term for the particular term; identifying a first sequence of terms that (i) occurs in a text corpus, (ii) includes the particular term among its terms, and (iii) is different than the search query; determining a number of occurrences of a second sequence of terms in the text corpus, wherein the second sequence of terms differs from the first sequence of terms only in that the first alternate term is substituted for the particular term; determining a number of occurrences of a third sequence of terms in the text corpus, wherein the third sequence of terms differs from the first sequence of terms only in that the second alternate term is substituted for the particular term; and determining, based at least on the number of occurrences of the second sequence of terms in the text corpus and the number of occurrences of the third sequence of terms in the text corpus, whether the first alternate term and the second alternate term indicate a same word sense of the particular term.

2-20. (canceled)

21. The method of claim 1, wherein the first alternate term and the second alternate term comprises query term substitutions for the particular term.

22. The method of claim 1, wherein: the text corpus comprises a query log, and each sequence of terms comprises a search query that is stored in the query log, and that is different from the search query that includes the particular term.

23. The method of claim 1, wherein the first alternate term and the second alternate term are identified from a query log after the search query is received.

24. The method of claim 1, comprising: determining whether to expand the search query to include the first alternate term or the second alternate term based on determining that the first alternate term and the second alternate term indicate a same word sense of the particular term.

25. The method of claim 1, wherein determining whether the first alternate term and the second alternate term indicate a same sense of the particular term comprises determining whether second sequence of terms and the third sequence of terms both occur in the text corpus.

26. The method of claim 1, wherein determining whether the first alternate term and the second alternate term indicate a same sense of the particular term comprises determining whether second sequence of terms and the third sequence of terms both occur in the text corpus more than a predetermined number of times.

27. The method of claim 1, comprising: identifying a fourth sequence of terms that (i) occurs in the text corpus, (ii) includes the particular term among its terms, and (iii) is different than the search query and the sequence of terms; determining a number of occurrences of a fifth sequence of terms in the text corpus, wherein the fifth sequence of terms differs from the fourth sequence of terms only in that the first alternate term is substituted for the particular term; determining a number of occurrences of a sixth sequence of terms in the text corpus, wherein the sixth sequence of terms differs from the fourth sequence of terms only in that the second alternate term is substituted for the particular term; and wherein determining whether the first alternate term and the second alternate term indicate the same word sense of the particular term is further based on the number of occurrences of the fifth sequence of terms in the text corpus and the number of occurrences of the sixth sequence of terms in the text corpus.

28. The method of claim 1, comprising: comparing the number of occurrences of the second sequence of terms in the text corpus with the number of occurrences of the third sequence of terms in the text corpus; and generating a score for a substitution of the particular term by the first alternate term or the second alternate term based on comparing the number of occurrences of the second sequence of terms in the text corpus with the number of occurrences of the third sequence of terms in the text corpus.

29. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: obtaining a search query; identifying a particular term from the search query; determining that the particular term is, or is potentially, a polyseme or a homograph; in response to determining that the particular term is, or is potentially, a polyseme or a homograph, identifying a first alternate term and a second alternate term for the particular term; identifying a first sequence of terms that (i) occurs in a text corpus, (ii) includes the particular term among its terms, and (iii) is different than the search query; determining a number of occurrences of a second sequence of terms in the text corpus, wherein the second sequence of terms differs from the first sequence of terms only in that the first alternate term is substituted for the particular term; determining a number of occurrences of a third sequence of terms in the text corpus, wherein the third sequence of terms differs from the first sequence of terms only in that the second alternate term is substituted for the particular term; and determining, based at least on the number of occurrences of the second sequence of terms in the text corpus and the number of occurrences of the third sequence of terms in the text corpus, whether the first alternate term and the second alternate term indicate a same word sense of the particular term.

30. The system of claim 29, wherein the first alternate term and the second alternate term comprises query term substitutions for the particular term.

31. The system of claim 29, wherein: the text corpus comprises a query log, and each sequence of terms comprises a search query that is stored in the query log, and that is different from the search query that includes the particular term.

32. The system of claim 29, wherein the first alternate term and the second alternate term are identified from a query log after the search query is received.

33. The system of claim 29, wherein the operations comprise: determining whether to expand the search query to include the first alternate term or the second alternate term based on determining that the first alternate term and the second alternate term indicate a same word sense of the particular term.

34. The system of claim 29, wherein determining whether the first alternate term and the second alternate term indicate a same sense of the particular term comprises determining whether second sequence of terms and the third sequence of terms both occur in the text corpus.

35. The system of claim 29, wherein determining whether the first alternate term and the second alternate term indicate a same sense of the particular term comprises determining whether second sequence of terms and the third sequence of terms both occur in the text corpus more than a predetermined number of times.

36. The system of claim 29, wherein the operations comprise: identifying a fourth sequence of terms that (i) occurs in the text corpus, (ii)
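The dependent claims mention a predetermined count threshold, additional context sequences, and a score derived from comparing counts, but this preview does not say how these are combined. The sketch below is one possible reading under stated assumptions: `same_sense` treats a single qualifying context as sufficient, and `overlap_score` uses a smaller-to-larger ratio. Both rules, all names, and the toy counts are hypothetical.

```python
from collections import Counter


def substitute(sequence, term, alternate):
    """Replace `term` with `alternate` in a tuple of terms."""
    return tuple(alternate if t == term else t for t in sequence)


def substituted_counts(corpus_counts, contexts, term, alternate):
    """Corpus count of each context sequence after substituting `alternate`."""
    return [corpus_counts[substitute(c, term, alternate)] for c in contexts]


def same_sense(corpus_counts, contexts, term, alt_a, alt_b, min_count=0):
    """Assumed rule: the alternates share a sense if, in at least one context,
    both substituted sequences occur more than `min_count` times."""
    counts_a = substituted_counts(corpus_counts, contexts, term, alt_a)
    counts_b = substituted_counts(corpus_counts, contexts, term, alt_b)
    return any(a > min_count and b > min_count for a, b in zip(counts_a, counts_b))


def overlap_score(count_a, count_b):
    """Assumed score in [0, 1]: ratio of the smaller count to the larger one."""
    larger = max(count_a, count_b)
    return min(count_a, count_b) / larger if larger else 0.0


# Hypothetical occurrence counts from a query log.
corpus_counts = Counter({
    ("river", "shore", "erosion"): 3,
    ("river", "riverbank", "erosion"): 2,
    ("shore", "fishing", "spots"): 4,
    ("riverbank", "fishing", "spots"): 1,
    ("lender", "loan", "rates"): 5,
})
# Two context sequences that contain the particular term "bank".
contexts = [("river", "bank", "erosion"), ("bank", "fishing", "spots")]

print(same_sense(corpus_counts, contexts, "bank", "shore", "riverbank", min_count=1))  # True
print(same_sense(corpus_counts, contexts, "bank", "shore", "lender", min_count=1))     # False
```

A stricter design could require agreement across most or all contexts rather than any one; the claims as previewed leave that choice open.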

Assignees

Inventors

Classifications

  • Physics · mapped topic

  • using system suggestions (G06F16/3325 takes precedence) · CPC title

  • Query expansion · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016239490A1 cover?
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for using alternate words as an indication of word sense. In one aspect, a method includes identifying a particular term. The method further includes identifying a first alternate term and a second alternate term for the particular term, and identifying a first sequence of terms that occurs in a t…
Who is the assignee on this patent?
Google Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/3053. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 18, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).