Updating population language models based on changes made by user clusters

US9672818B2 · US · B2

Patent metadata
Publication number: US-9672818-B2
Application number: US-201313869919-A
Country: US
Kind code: B2
Filing date: Apr 24, 2013
Priority date: Apr 18, 2013
Publication date: Jun 6, 2017
Grant date: Jun 6, 2017

Abstract

Official abstract text for this publication.

Technology for improving the predictive accuracy of input word recognition on a device by dynamically updating the lexicon of recognized words based on the word choices made by similar users. The technology collects users' vocabulary choices (e.g., words that each user uses, or adds to or removes from a word recognition dictionary), associates users who make similar choices, aggregates related vocabulary choices, filters the words, and sends words identified as likely choices for that user to the user's device. Clusters may include, for example, users in a particular location (e.g., sets of people who use words such as “Puyallup,” “Gloucester,” or “Waiheke”), users with a particular professional or hobby vocabulary, or application-specific vocabulary (e.g., word choices in map searches or email messages).
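The collect, associate, aggregate, and send steps described in the abstract can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the Jaccard similarity measure, the greedy grouping, the 0.3 threshold, the `min_users` cutoff, the user names, and the sample words other than "Puyallup" and "Waiheke" do not come from the patent.

```python
from collections import Counter


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two vocabularies (0.0 = disjoint, 1.0 = identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_users(vocab: dict[str, set[str]], threshold: float = 0.3) -> list[set[str]]:
    """Greedy grouping: a user joins the first cluster that already holds
    a member whose vocabulary overlaps by at least `threshold`."""
    clusters: list[set[str]] = []
    for user in sorted(vocab):
        for cluster in clusters:
            if any(jaccard(vocab[user], vocab[m]) >= threshold for m in cluster):
                cluster.add(user)
                break
        else:
            clusters.append({user})
    return clusters


def suggest_words(user: str, vocab: dict[str, set[str]],
                  clusters: list[set[str]], min_users: int = 2) -> list[str]:
    """Words used by at least `min_users` other cluster members
    that the given user does not have yet."""
    cluster = next(c for c in clusters if user in c)
    counts = Counter(w for m in cluster if m != user for w in vocab[m])
    return sorted(w for w, n in counts.items()
                  if n >= min_users and w not in vocab[user])


# Hypothetical per-user vocabulary choices collected from devices.
vocab = {
    "ana": {"Puyallup", "Tacoma", "Rainier"},
    "ben": {"Puyallup", "Tacoma", "Sounder"},
    "cho": {"Puyallup", "Rainier", "Sounder"},
    "dee": {"Waiheke", "Onetangi"},
    "eli": {"Waiheke", "Onetangi", "Oneroa"},
}
clusters = cluster_users(vocab)
print(clusters)
print(suggest_words("ana", vocab, clusters))
```

In this sketch the three users who share place names around Puyallup end up in one cluster and the two Waiheke users in another, so a word is only proposed to a user when enough members of that user's own cluster already use it.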

First claim

Opening claim text (preview).

We claim:

1. A method in a computing system of selectively updating population language models used by language recognition systems, the method comprising:
    receiving events for representing user changes to local language models, wherein the events are received from devices of a plurality of users;
    receiving information characterizing the plurality of users, wherein the information includes social-networking friend data for the users;
    identifying, from the received information, a user cluster based on the social-networking friend data, wherein the user cluster is for representing a subset of users sharing matching or associated instances of the social-networking friend data, and wherein the identifying is performed by a hardware processor;
    generating or updating a population language model for the user cluster, wherein the generating or updating of the population language model includes: identifying a subset of the events associated with the user cluster for initiating the generation or update of the population language model, and filtering the events by excluding events associated with a blacklist of vocabulary not to be added or events associated with a whitelist of vocabulary not to be deleted;
    aggregating the user changes and associated words corresponding to the subset of the events; and
    providing the population language model or updates thereof to a computing device of one or more users in the user cluster for subsequently recognizing input information provided to the computing device by the one or more users in the user cluster.

2. The method of claim 1, wherein receiving events associated with local language models comprises receiving information about responses to word suggestions, addition of words to the local language model, or deletion of words from the local language model.

3. The method of claim 2, wherein information about word usage includes frequencies of use of individual words, word pairs (bigrams), triplets (trigrams), or higher-order n-grams.

4. The method of claim 2, wherein the events are received on a continuous, periodic, or irregular basis.

5. The method of claim 1, wherein the users in the cluster share a location.

6. The method of claim 1, wherein the users in the cluster share a common interest.

7. The method of claim 1, wherein identifying a cluster of users sharing similar characteristics further comprises identifying similar events associated with local language models.

8. The method of claim 7, wherein identifying similar events comprises determining word or n-gram associations.

9. The method of claim 1, wherein generating or updating the population language model includes generating or updating the population language model based on other words connected to or inferred from the words corresponding to the subset of the events, wherein the other words are determined based on word-association for the words corresponding to the subset of the events or based on an inferential analysis of the corresponding words.

10. The method of claim 1, wherein generating modifications to a population language model associated with the cluster of users comprises adding words or n-grams to the population language model or removing words or n-grams from the population language model.

11. The method of claim 1, wherein generating modifications to a population language model associated with the cluster comprises modifying weighting of words or n-grams in the population language model.

12. The method of claim 1, wherein providing generated modifications to the computing device of one or more users in the cluster further comprises providing the associated population language model to a computing device of a user that is newly identified as having a connection to the cluster.

13. The method of claim 1, wherein providing generated modifications to a computing device of one or more users in the cluster comprises providing updates to a local population language model in a computing device of a user.

14. The method of claim 1, wherein aggregating the events to identify modifications to the population model comprises prioritizing events based on frequency of event.

15. The method of claim 1, wherein providing the generated modifications comprises prioritizing modifications by: determining that a modification is of high importance; and selectively providing the high-importance modification to the computing device of one or more users in the cluster.

16. A non-transitory computer-readable memory containing instructions that, when executed by a computing system, implement a method of selectively updating local language models used to predictively complete user input, the method comprising:
    receiving information about local language models including a first local language model and other local language models, wherein the information about local language models includes information about local language model events representing user changes to at least one of the local language models;
    receiving information characterizing users associated with the user changes, wherein the information includes social-networking friend data for the users;
    identifying a user cluster from the received characterization information, wherein the user cluster is for representing a subset of users sharing matching or associated instances of the social-networking friend data;
    identifying a set of the local language models including the first local language model and one or more other local language models, based at least in part on the other local language models having local language model events similar to first local language model events;
    for each local language model in the set, identifying additional local language model events;
    generating modifications to the first local language model using the additional local language model event information of one or more of the identified other local language models in the set, and also using the user cluster, wherein the generation of the modifications is initiated based on receiving the local language model events;
    filtering the generated modifications by excluding events associated with a blacklist of vocabulary not to be added or events associated with a whitelist of vocabulary not to be deleted before updating the first local language model with the generated modifications; and
    updating the first local language model with the generated modifications based on providing the generated modifications to a computing device of one or more users in the user cluster.

17. The non-transitory computer-readable memory of claim 16, wherein receiving information about local language models comprises receiving a change log, local language models, or a combination thereof.

18. The non-transitory computer-readable memory of claim 16, wherein generating the modifications includes generating the modifications based on other words connected to or inferred from words corresponding to the local language model events, wherein the other words are determined based on word-association for the words corresponding to the local language model events or based on an inferential analysis of the corresponding words.

19. The non-transitory computer-readable memory of claim 16, wherein information about local language model events comprises information about addition of words to the local language model or deletion of words from the local language model.

20. The non-transitory computer-readable memory of claim 19, wherein information about word usage includes frequencies of individua
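As a rough illustration of the steps recited in claim 1 (cluster users from friend data, select the cluster's events, filter against a blacklist and a whitelist, aggregate), the following Python sketch may help. It is one possible reading under stated assumptions, not the patented implementation: the `Event` type, the function names, treating clusters as connected components of the friend graph, and the `min_users` threshold are all hypothetical choices introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A user change to a local language model (hypothetical shape)."""
    user: str
    action: str  # "add" or "delete"
    word: str


def friend_clusters(friends: dict[str, set[str]]) -> list[set[str]]:
    """Assumption: a cluster is a connected component of the friend graph."""
    graph: dict[str, set[str]] = defaultdict(set)
    for user, others in friends.items():
        graph[user].update(others)
        for other in others:
            graph[other].add(user)
    seen: set[str] = set()
    clusters: list[set[str]] = []
    for start in sorted(graph):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first walk over friend links
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(graph[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


def passes_filter(event: Event, blacklist: set[str], whitelist: set[str]) -> bool:
    """Drop additions of blacklisted words and deletions of whitelisted words."""
    if event.action == "add" and event.word in blacklist:
        return False
    if event.action == "delete" and event.word in whitelist:
        return False
    return True


def population_update(events: list[Event], cluster: set[str],
                      blacklist: set[str], whitelist: set[str],
                      min_users: int = 2) -> dict[str, list[str]]:
    """Aggregate the cluster's filtered events; keep a change only when at
    least `min_users` distinct users made it (threshold is an assumption)."""
    adds: dict[str, set[str]] = defaultdict(set)
    deletes: dict[str, set[str]] = defaultdict(set)
    for e in events:
        if e.user in cluster and passes_filter(e, blacklist, whitelist):
            (adds if e.action == "add" else deletes)[e.word].add(e.user)
    return {
        "add": sorted(w for w, u in adds.items() if len(u) >= min_users),
        "delete": sorted(w for w, u in deletes.items() if len(u) >= min_users),
    }


# Hypothetical sample data.
friends = {"ana": {"ben"}, "ben": {"cho"}, "dee": {"eli"}}
events = [
    Event("ana", "add", "Puyallup"), Event("ben", "add", "Puyallup"),
    Event("ana", "add", "asdfgh"), Event("cho", "add", "asdfgh"),
    Event("ben", "delete", "the"), Event("cho", "delete", "the"),
    Event("ben", "delete", "teh"), Event("cho", "delete", "teh"),
    Event("dee", "add", "Waiheke"),
]
clusters = friend_clusters(friends)
update = population_update(events, clusters[0], {"asdfgh"}, {"the"})
print(update)
```

The resulting `update` dictionary stands in for the "updates thereof" that would be provided to the devices of users in the cluster; how updates are actually packaged and delivered is not specified in the text shown here.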

Assignees

Nuance Communications Inc

Classifications

  • G10L15/18 (Primary): using natural language modelling (CPC title)

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Nuance Communications Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/18. Mapped technology areas include Physics.
When was this patent published?
Publication date June 6, 2017 (B2). Legal status and post-grant events are not shown on this page.