Part-of-speech tagging for ranking search results

US9514221B2 · US · B2

Patent metadata
Field | Value
Publication number | US-9514221-B2
Application number | US-201313828316-A
Country | US
Kind code | B2
Filing date | Mar 14, 2013
Priority date | Mar 14, 2013
Publication date | Dec 6, 2016
Grant date | Dec 6, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and computer-readable storage media are provided for utilizing part-of-speech (POS) tagging of both the words included in a search query and the words included in potential search result documents to improve query alteration accuracy and search result ranking. Upon receiving a search query, POS tags are assigned to the words comprising the query to create query word-tag pairs. The query word-tag pairs are utilized to reformulate the query and are compared with document word-tag pairs included in a plurality of potential search result documents to determine a degree of similarity. The degree of similarity is utilized as an input in scoring and/or ranking the relevance of the potential search result documents with respect to one another.
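
The flow described in the abstract (tag the query, compare query word-tag pairs with document word-tag pairs, use the resulting degree of similarity for ranking) can be sketched as follows. This is a minimal illustration, not the patented implementation: the inputs are already tagged by hand, the coarse tag labels (`NOUN`, `VERB`, `DET`, `PREP`, `ADJ`) are illustrative, and the similarity measure (the fraction of query word-tag pairs found in a document) is an assumption, since the abstract does not define how the degree of similarity is computed. The names `degree_of_similarity` and `rank_documents` are hypothetical.

```python
from typing import Dict, List, Tuple

WordTag = Tuple[str, str]  # (word, part-of-speech tag)


def degree_of_similarity(query_pairs: List[WordTag],
                         doc_pairs: List[WordTag]) -> float:
    """Assumed measure: fraction of query word-tag pairs present in the document."""
    if not query_pairs:
        return 0.0
    doc_set = set(doc_pairs)
    matches = sum(1 for pair in query_pairs if pair in doc_set)
    return matches / len(query_pairs)


def rank_documents(query_pairs: List[WordTag],
                   docs: Dict[str, List[WordTag]]) -> List[Tuple[str, float]]:
    """Score every document and sort by descending similarity."""
    scored = [(doc_id, degree_of_similarity(query_pairs, pairs))
              for doc_id, pairs in docs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hand-tagged toy data; a real system would run a POS tagger here.
query = [("book", "VERB"), ("flight", "NOUN")]
docs = {
    # "book" appears as a noun, so only ("flight", "NOUN") matches.
    "doc_a": [("read", "VERB"), ("a", "DET"), ("book", "NOUN"),
              ("on", "PREP"), ("a", "DET"), ("flight", "NOUN")],
    # Both query word-tag pairs match.
    "doc_b": [("book", "VERB"), ("a", "DET"), ("cheap", "ADJ"),
              ("flight", "NOUN")],
}

ranking = rank_documents(query, docs)
print(ranking)
```

Both documents contain the word "book", but only `doc_b` uses it as a verb, so under this assumed measure it scores higher than `doc_a`. In the abstract's terms, this score would be one input among others to relevance scoring and ranking.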

First claim

Opening claim text (preview).

What is claimed is:

1. One or more computer-readable storage media storing computer-useable instructions that, when used by one or more computing devices, cause the one or more computing devices to perform a method for utilizing part-of-speech tagging in ranking potential search result documents, the method comprising: receiving a search query including one or more words; reformulating the received query by assigning part-of-speech tags to at least a portion of the one or more words included in the search query to create one or more query word-tag pairs; identifying document word-tag-pairs in a document word-tag-pairs searchable index using the reformulate query to compare the one or more query word-tag pairs to the one or more document word-tag pairs included in a plurality of potential search result documents to determine a degree of similarity; and using the degree of similarity to score the relevance of each of the plurality of potential search result documents to rank the plurality of potential search result documents based on the comparison of the query word-tag pairs from the search query and the document word-tag pairs from the plurality of potential search result documents; wherein upon the part-of-speech tags assigned to the one or more words identifying at least one of the one or more words as a noun, the respective part-of-speech tag further identifies the noun as plural, possessive, or singular; and wherein upon the part-of-speech tags assigned to the one or more words identifying at least one of the one or more words as a verb, the respective part-of-speech tag further identifies the verb based upon tense or aspect.

2. The one or more computer-readable storage media of claim 1, wherein the part-of-speech tags assigned to the one or more words identifies the one or more words as one of a noun, a verb, an adjective, an adverb, a determiner, an article, a preposition, a pronoun, a conjunction, and an interjection.

3. The one or more computer-readable storage media of claim 1, wherein using the degree of similarity to score the relevance of each of the plurality potential search result documents comprises using the degree of similarity for each potential search result document to rank the potential search result documents of the plurality of potential search result documents.

4. The one or more computer-readable storage media of claim 1, wherein the method further comprises using the part-of-speech tags assigned to the portion of the one or more words included in the search query to reformulate the search query.

5. The one or more computer-readable storage media of claim 1, wherein at least one potential search result document of the plurality of potential search result documents includes one or more words, wherein the method further comprises assigning part-of-speech tags to at least a portion of the one or more words included in the at least one potential search result document to create the one or more document word-tag pairs.

6. The one or more computer-readable storage media of claim 5, wherein the method further comprises indexing the document word-tag pairs in a searchable index.

7. The one or more computer-readable storage media of claim 5, wherein each part-of-speech tag is one of a grammatical part-of-speech tag and a syntactical part-of-speech tag.

8. A method being performed by one or more computing devices including at least one processor, the method for utilizing part-of-speech tagging in ranking potential search result documents, the method comprising: reformulating a received search query by assigning part-of-speech tags to words in the search query to create query word-tag pairs; assigning part-of-speech tags to words in a plurality of potential search result documents to create document word-tag pairs searchable index; identifying the document word-tag-pairs in the searchable index using the reformulate query; utilizing a degree of similarity score between matching query word-tag pairs and document word-tag pairs to rank the plurality of potential search result documents relative to one another, wherein a first search result document having a matching query word-tag pair and document word-tag pair is ranked higher than a second search result having a word match with the search query where the word match has a different part-of-speech tag in the second search result than in the search query, wherein upon the part-of-speech tags assigned to the one or more words identifying at least one of the one or more words as a noun, the respective part-of-speech tag further identifies the noun as plural, possessive, or singular; and wherein upon the part-of-speech tags assigned to the one or more words identifying at least one of the one or more words as a verb, the respective part-of-speech tag further identifies the verb based upon tense or aspect.

9. The method of claim 8, wherein utilizing matching query word-tag pairs and document word-tag pairs to rank the plurality of potential result documents relative to one another comprises: comparing the query word-tag pairs with the document word-tag pairs to determine a degree of similarity for each document of the plurality of potential search result documents; and utilizing the respective degrees of similarity as an input to rank the plurality of potential search result documents relative to one another.

10. The method of claim 8, wherein the part-of-speech tags assigned to the words identifies the words as nouns, verbs, adjectives, adverbs, determiners, articles, prepositions, pronouns, conjunctions, and interjections.

11. The method of claim 8, wherein the method further comprises using the part-of-speech tags assigned to the words in the search query to reformulate the search query.

12. The method of claim 8, wherein the method further comprises indexing the document word-tag pairs in a searchable index.

13. A system comprising: an information retrieval engine having one or more processors and one or more computer-readable storage media; a data store coupled with the information retrieval engine, wherein the information retrieval engine: receives a query including one or more words; assigns part-of-speech tags to at least a portion of the one or more words included in the query to create query word-tag pairs, wherein a first word of the search query is associated with a first query word-tag pair; reformulates the query using the assigned part-of-speech tags; and utilizing the reformulated query, determines potential documents for retrieval by matching the query word-tag pairs of the search query with document word-tag pairs included in the potential documents and determine a degree of similarity to rank the search query result, wherein the matching includes identifying the first word in a first document and a second document, wherein the first document includes a query word-tag pair and document word-tag pair match for the first word, and wherein the second document has a document word-tag pair for the first word that is different from the query word-tag pair for the first word, wherein upon the part-of-speech tags assigned to the one or more words identifying at least one of the one or more words as a noun, the respective part-of-speech tag further identifies the noun as plural, possessive, or singular; and wherein upon the part-of-speech tags assigned to the one or more words identifying at least one of the one or more words as a verb, the respective part-of-speech tag further identifies the verb based upon tense or aspect.

14. The system of claim 13, wherein the information retrieval engine further determines a degree of similarity betw
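
The claims above describe a searchable index of document word-tag pairs, and claim 8 requires that a document matching both the word and its tag rank above a document matching the word alone. The sketch below is one way that could look, under stated assumptions: the fine-grained tags follow Penn Treebank conventions (`NN` singular noun, `NNS` plural noun, `VBG` gerund or present participle), which the claims do not name; the weights `PAIR_MATCH` and `WORD_ONLY_MATCH` are invented for illustration; and `build_indexes` and `rank` are hypothetical helper names.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

WordTag = Tuple[str, str]  # (word, part-of-speech tag)

# Assumed weights: an exact word-tag match counts more than a word-only match.
PAIR_MATCH = 1.0
WORD_ONLY_MATCH = 0.5


def build_indexes(docs: Dict[str, List[WordTag]]):
    """Build an inverted index keyed on word-tag pairs, plus a word-only index."""
    pair_index: Dict[WordTag, Set[str]] = defaultdict(set)
    word_index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, pairs in docs.items():
        for word, tag in pairs:
            pair_index[(word, tag)].add(doc_id)
            word_index[word].add(doc_id)
    return pair_index, word_index


def rank(query_pairs: List[WordTag], pair_index, word_index):
    """Score documents; same-word, different-tag matches earn less credit."""
    scores: Dict[str, float] = defaultdict(float)
    for word, tag in query_pairs:
        exact_docs = pair_index.get((word, tag), set())
        for doc_id in word_index.get(word, set()):
            scores[doc_id] += PAIR_MATCH if doc_id in exact_docs else WORD_ONLY_MATCH
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hand-tagged toy documents (a real system would tag them automatically).
docs = {
    "doc_1": [("dogs", "NNS"), ("running", "VBG")],   # "running" as a verb form
    "doc_2": [("running", "NN"), ("shoes", "NNS")],   # "running" as a noun
}
pair_index, word_index = build_indexes(docs)

results = rank([("running", "VBG")], pair_index, word_index)
print(results)
```

Keeping a word-only index next to the pair index lets a document with the right word but a different tag still be retrieved, only at a lower score. That is one reading of the relative ordering in claim 8, not the only possible one.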

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9514221B2 cover?
Systems, methods, and computer-readable storage media are provided for utilizing part-of-speech (POS) tagging of both the words included in a search query and the words included in potential search result documents to improve query alteration accuracy and search result ranking. Upon receiving a search query, POS tags are assigned to the words comprising the query to create query word-tag pairs.…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/3344. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 6, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).