Spell correcting long queries

US9317606B1 · US · B1

Patent metadata
FieldValue
Publication numberUS-9317606-B1
Application numberUS-201313757271-A
CountryUS
Kind codeB1
Filing dateFeb 1, 2013
Priority dateFeb 3, 2012
Publication dateApr 19, 2016
Grant dateApr 19, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer implemented method and system for spell correcting terms within a string of terms that a computer system receives from a computer readable data string representative of a user search query.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer implemented method, comprising: receiving a search query; identifying a plurality of search terms of the search query; determining a model term score for each of the identified search terms, the model term score for a given search term of the identified search terms being indicative of a probability of misspelling of the given search term, wherein determining the model term score for the given search term includes determining whether the given search term is in a dictionary of a language model and wherein the model term score for the given search term is determined independent of any alternative suggestions that are alternative spellings of the given search term; selecting candidate search terms from the identified search terms of the search query based on the model term scores, the candidate search terms being a subset of the identified search terms of the search query, wherein selecting a candidate search term of the candidate search terms comprises selecting the candidate search term based on the model term score for the candidate search term being more indicative of misspelling than the model term scores of other of the identified search terms not selected as one of the candidate search terms; and providing the candidate search terms to a spell correction system. 2. The computer implemented method as recited in claim 1 , wherein determining the model term score for each of the identified search terms further includes: determining whether each of the identified search terms is in a dictionary of a language model; and determining a language model term score for each of the identified search terms found in the dictionary of the language model. 3. The computer implemented method as recited in claim 2 , wherein determining the model term score for each of the identified search terms further includes: determining an n-gram model term score for each of the identified search terms not found in the dictionary of the language model, where each n-gram model term score has a value more indicative of misspelling than any of the language model term scores. 4. The computer implemented method as recited in claim 3 , wherein the language model term score is based on the probability of misspelling estimated by the language model using term frequency, and where determining the n-gram model term score is based on the probability of a character sequence estimated by the character n-gram model. 5. The computer implemented method as recited in claim 4 , wherein selecting the candidate search terms comprises selecting a predetermined number of the identified search terms having model term scores that satisfy a threshold, and where providing the candidate search terms to the spell correction system comprises providing the predetermined number of the identified search terms to the spell correction system and receiving alternative versions of the candidate search terms from the spell correction system. 6. The computer implemented method as recited in claim 5 , further comprising: determining a total number of terms M to be spell corrected; determining a context number of terms that is desired on either side of each of N terms having model term scores that satisfy the threshold; multiplying the context number of terms by 2 resulting in a product (P); adding one to the product resulting in (P+1); dividing total number of terms M by the result (P+1) thereby providing the number N; and providing the context number of terms on either side of each of the N terms having model term scores that satisfy the threshold to the spell correction system. 7. The computer implemented method as recited in claim 6 , further comprising: providing an improved score to one of two or more terms having the same model term scores based on proximity to a term with a model term score that satisfies the threshold. 8. The computer implemented method as recited in claim 1 , further comprising: receiving a suggestion from the spell correction system, the suggestion representative of suggesting an alternative spelling to one or more of the identified search terms. 9. The computer implemented method in claim 1 , wherein determining the model term score for each of the identified search terms further includes: determining if each of the identified search terms is in a language model and determining a language model term score for each of the identified search terms determined to be in the language model, wherein the language model term score is based on the probability of misspelling estimated by the language model using term frequency; and determining an n-gram model term score for each of the identified search terms received and determined to not be in the language model, where determining the n-gram model term score is based on the probability of a character sequence estimated by the character n-gram model. 10. The computer implemented method of claim 9 , wherein each n-gram model term score has a value more indicative of misspelling than any of the language model term scores. 11. The computer implemented method as recited in claim 10 , further comprising: determining a total number of search terms to be spell corrected; determining a context number of terms that is desired on either side of each of N terms having model term scores that satisfy the threshold; multiplying the context number of terms by 2 resulting in a product (P); adding one to the product resulting in (P+1); dividing total number terms by the result (P+1) thereby providing the number N; and providing the context number of terms on either side of each of the N terms having model term scores that satisfy the threshold to the spell correction system. 12. The computer implemented method as recited in claim 11 , further comprising: providing a higher score to one of two or more terms having the same model term score based on order of appearance. 13. The computer implemented method as recited in claim 11 , further comprising: providing a higher score to one of two or more terms having the same model term score based on proximity to a term with a model term score that satisfies the threshold. 14. The computer implemented method as recited in claim 9 , further comprising: receiving a suggestion from the correction system, the suggestion representative of suggesting an alternative spelling to one or more of the search terms. 15. A system comprising one or more processors configured to perform operations comprising: receiving a search query; identifying a plurality of search terms of the search query; determining a model term score for each of the identified search terms, the model term score for a given search term of the identified search terms being indicative of a probability of misspelling of the given search term, wherein determining the model term score for the given search term includes determining whether the given search term is in a dictionary of a language model and wherein the model term score for the given search term is determined independent of any alternative suggestions that are alternative spellings of the given search term; selecting candidate search terms from the identified search terms of the search query based on the model term scores, the candidate search terms being a subset of the identified search terms of the search query, wherein selecting a candidate search term of the candidate search terms comprises selecting the candidate search term based on the model term score for the candidate search term being more indicative of misspelling than the model term scores of other of the identified search terms not selecte

Assignees

Inventors

Classifications

  • Orthographic correction, e.g. spell checking or vowelisation · CPC title

  • Syntactic pre-processing, e.g. stopword elimination, stemming · CPC title

  • G06F16/951Primary

    Indexing; Web crawling techniques · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9317606B1 cover?
A computer implemented method and system for spell correcting terms within a string of terms that a computer system receives from a computer readable data string representative of a user search query.
Who is the assignee on this patent?
Google Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/3335. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Apr 19 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).