Using parts-of-speech tagging and named entity recognition for spelling correction

US10762293B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10762293-B2
Application number: US-97684910-A
Country: US
Kind code: B2
Filing date: Dec 22, 2010
Priority date: Dec 22, 2010
Publication date: Sep 1, 2020
Grant date: Sep 1, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques to automatically correct or complete text are disclosed. An entered text and a context data indicating a context in which the entered text is used are received. Examples of context data include additional words and/or a phrase or sentence in which the entered text occurs. A replacement candidate to replace the entered text is determined based on the entered text and the context data.
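
The idea in the abstract, choosing a replacement from both the entered text and its context, can be illustrated with a minimal sketch. This is not the patented implementation: the bigram counts, the vocabulary, and the `suggest` function are hypothetical, and candidate generation uses Python's standard `difflib` string similarity purely for illustration.

```python
from difflib import get_close_matches

# Hypothetical toy data: how often each (previous word, word) pair was seen.
BIGRAM_COUNTS = {
    ("the", "weather"): 8,
    ("the", "whether"): 1,
    ("know", "whether"): 6,
    ("know", "weather"): 1,
}
VOCABULARY = sorted({word for _, word in BIGRAM_COUNTS})


def suggest(entered, previous_word):
    """Return the similar vocabulary word that best fits the context, or None."""
    # Step 1: candidates that look like the entered text.
    candidates = get_close_matches(entered, VOCABULARY, n=5, cutoff=0.6)
    if not candidates:
        return None
    # Step 2: rank candidates by how well they fit the preceding word.
    return max(candidates, key=lambda c: BIGRAM_COUNTS.get((previous_word, c), 0))


print(suggest("wether", "the"))
print(suggest("wether", "know"))
```

The same misspelling yields a different suggestion depending on the preceding word, which is the role the abstract assigns to context data.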

First claim

Opening claim text (preview).

What is claimed is:

1. A method of automatically correcting or completing text, comprising: receiving entered text from a user and context data indicating a context in which the entered text is used, wherein the entered text is a word; determining without user intervention, based on the word and the context data, a replacement candidate to replace the word, the determining including: using the context data to assign to the replacement candidate a score indicating a degree of confidence that the replacement candidate should be suggested; determining a statistically expected part of speech of the word; and selecting the replacement candidate based at least in part on a determination that a part of speech of the replacement candidate matches the statistically expected part of speech of the word; and providing the selected replacement candidate to the user as a suggested correction.

2. The method of claim 1, wherein the replacement candidate comprises a more correct or complete word or phrase than the entered text.

3. The method of claim 1, wherein the context data comprises one or more words that occur in a same sentence as the entered text.

4. The method of claim 1, wherein the context data is used to determine one or more features of the entered text.

5. The method of claim 4, wherein the features include one or more of the following: an identification as a named entity, a prefix, a suffix, and capitalization.

6. The method of claim 1, further comprising receiving further context data as additional text is entered and using the further context data to update an evaluation of one or more replacement candidates.

7. The method of claim 1, wherein determining the replacement candidate includes using the context data to evaluate the replacement candidate.

8. The method of claim 1, wherein determining the replacement candidate further includes selecting the replacement candidate based at least in part on a determination that the score exceeds a selection threshold.

9. The method of claim 1, wherein determining the replacement candidate includes determining based at least in part on the context data that the replacement candidate is more likely correct than one or more alternative replacement candidates.

10. The method of claim 1, further comprising using a statistical language model, the word, and the context data to determine the statistically expected part of speech of the word.

11. The method of claim 10, further comprising generating the statistical model.

12. The method of claim 11, wherein generating the statistical model includes augmenting a commercially available annotated corpus with annotated content comprising one or more of blog entries, online comments, comments posted on online social networks, and other user generated online content.

13. The method of claim 1, further comprising: providing the selected replacement candidate to the user as a selectable suggested correction.

14. The method of claim 1, wherein determining the statistically expected part of speech of the word comprises determining a statistically expected lexical category of the word.

15. A system, comprising: an input device configured to receive user inputs comprising entered text; and a processor coupled to the input device and configured to: receive entered text from a user entered using the input device and context data indicating a context in which the entered text is used, wherein the entered text is a word; determine without user intervention, based on the word and the context data, a replacement candidate to replace the word, the determining including: using the context data to assign to the replacement candidate a score indicating a degree of confidence that the replacement candidate should be suggested; determining a statistically expected part of speech of the word; and selecting the replacement candidate based at least in part on a determination that a part of speech of the replacement candidate matches the statistically expected part of speech of the word; and provide the selected replacement candidate to the user as a suggested correction.

16. The system of claim 15, wherein the context data comprises one or more words that occur in a same sentence as the entered text.

17. The system of claim 15, wherein the processor is configured to update an evaluation of the replacement candidate as additional context data is received.

18. The system of claim 15, wherein the processor is configured to provide the selected replacement candidate to the user as a selectable suggested correction.

19. The system of claim 15, wherein determining the statistically expected part of speech of the word comprises determining a statistically expected lexical category of the word.

20. The system of claim 15, wherein the replacement candidate comprises a more correct or complete word or phrase than the entered text.

21. The system of claim 15, wherein the context data is used to determine one or more features of the entered text.

22. The system of claim 21, wherein the features include one or more of the following: an identification as a named entity, a prefix, a suffix, and capitalization.

23. The system of claim 15, wherein determining the replacement candidate includes using the context data to evaluate the replacement candidate.

24. The system of claim 15, wherein determining the replacement candidate further includes selecting the replacement candidate based at least in part on a determination that the score exceeds a selection threshold.

25. The system of claim 15, wherein determining the replacement candidate includes determining based at least in part on the context data that the replacement candidate is more likely correct than one or more alternative replacement candidates.

26. The system of claim 15, wherein the processor is further configured to use a statistical language model, the word, and the context data to determine the statistically expected part of speech of the word.

27. The system of claim 26, wherein the processor is further configured to: generate the statistical model.

28. The system of claim 27, wherein generating the statistical model includes augmenting a commercially available annotated corpus with annotated content comprising one or more of blog entries, online comments, comments posted on online social networks, and other user generated online content.

29. A non-transitory computer readable storage medium storing one or more programs for execution by an electronic device, the one or more programs comprising instructions for automatically correcting or completing text, including: receiving entered text from a user and context data indicating a context in which the entered text is used, wherein the entered text is a word; determining without user intervention, based on the word and the context data, a replacement candidate to replace the word, the determining including: using the context data to assign to the replacement candidate a score indicating a degree of confidence that the replacement candidate should be suggested; determining a statistically expected part of speech of the word; and selecting the replacement candidate based at least in part on a determination that a part of speech of the replacement candidate matches the statistically expected part of speech of the word; and providing t
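
The selection step recited in the independent claims (a context-derived confidence score, a statistically expected part of speech, and a match between the two) can be sketched as follows. Everything here is an illustrative assumption rather than the claimed implementation: the `Candidate` type, the `EXPECTED_POS_AFTER` lookup standing in for a statistical language model, the tag names, and the default threshold are all hypothetical. The claims require the part-of-speech match only "at least in part", whereas this sketch applies it as a hard filter for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    part_of_speech: str
    score: float  # hypothetical confidence derived from context, in [0, 1]


# Hypothetical stand-in for a statistical language model: the part of speech
# most often seen after a word carrying the given tag.
EXPECTED_POS_AFTER = {"DET": "NOUN", "PRON": "VERB"}


def select(candidates, previous_pos, threshold=0.5):
    """Pick the best-scoring candidate whose part of speech matches the expected one."""
    expected = EXPECTED_POS_AFTER.get(previous_pos)
    matching = [
        c for c in candidates
        if c.part_of_speech == expected and c.score > threshold
    ]
    # default=None covers the case where no candidate passes both checks.
    return max(matching, key=lambda c: c.score, default=None)


# Entered text "stor" following the determiner "the".
options = [
    Candidate("store", "NOUN", 0.8),
    Candidate("stir", "VERB", 0.9),
    Candidate("star", "NOUN", 0.4),
]
best = select(options, previous_pos="DET")
print(best.text if best else "no suggestion")
```

Here "stir" has the highest raw score but is rejected because a verb is not expected after a determiner, and "star" falls below the selection threshold, so "store" is suggested.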

Assignees

Inventors

Classifications

  • Annotation, e.g. comment data or footnotes · CPC title

  • G06F40/232 · Primary

    Orthographic correction, e.g. spell checking or vowelisation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10762293B2 cover?
Techniques to automatically correct or complete text are disclosed. An entered text and a context data indicating a context in which the entered text is used are received. Examples of context data include additional words and/or a phrase or sentence in which the entered text occurs. A replacement candidate to replace the entered text is determined based on the entered text and the context data.
Who is the assignee on this patent?
Ramerth Brent D, Davidson Douglas R, Moore Jennifer Lauren, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F40/232. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 1, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).