Clustering classes in language modeling
US-9529898-B2 · Dec 27, 2016 · US
US9747895B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9747895-B1 |
| Application number | US-201313936858-A |
| Country | US |
| Kind code | B1 |
| Filing date | Jul 8, 2013 |
| Priority date | Jul 10, 2012 |
| Publication date | Aug 29, 2017 |
| Grant date | Aug 29, 2017 |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for building language models. One of the methods includes identifying a first group of one or more users associated with a user in a social network. The method includes identifying first linguistic information associated with the first group. The method includes generating a first language model based on the first linguistic information. The method includes identifying a second group of one or more users associated with the user. The method includes identifying second linguistic information associated with the second group. The method includes generating a second language model based on the second linguistic information. The method includes associating the first language model and the second language model with the user.
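The abstract's steps (identify two groups, collect linguistic information for each, generate one language model per group, associate both with the user) can be illustrated with a minimal sketch. It assumes unigram models built from whitespace-tokenized text. The patent abstract does not specify the model type, and the names `build_unigram_model`, `group_texts`, `models_by_group`, and `user_models`, along with the sample data, are hypothetical.

```python
from collections import Counter

def build_unigram_model(texts):
    """Return a unigram table mapping each word to its relative frequency."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}

# Hypothetical linguistic information for two groups associated with one user.
group_texts = {
    "coworkers": ["deploy the build", "review the build"],
    "family": ["dinner at home", "call home"],
}

# Generate one language model per group.
models_by_group = {name: build_unigram_model(texts) for name, texts in group_texts.items()}

# Associate both group models with the (hypothetical) user "alice".
user_models = {"alice": list(models_by_group)}
```

A production system would more plausibly use n-gram or neural models with smoothing; unigrams are used here only to keep the sketch short.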
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method comprising:
receiving voice data associated with an utterance of a particular user of a social network, the social network comprising multiple groups of users, the users segmented into the groups based on data indicating respective commonalities for the users in each group;
identifying multiple language models associated with the social network, each of the multiple language models respectively associated with one or more of the groups of users of the social network, wherein the multiple language models comprise a first language model that was generated based on first linguistic information associated with a first group of the multiple groups of users of the social network, and a second language model that was generated based on second linguistic information associated with a second group of the multiple groups of users of the social network, wherein each of the first group and the second group is part of a social graph for the particular user, the social graph for the particular user determined by a set predetermined number of degrees of separation from the particular user to other users in the social network, and wherein the particular user is associated with a particular cluster based, at least in part, on a probability of the user belonging to the cluster;
selecting at least the first language model from among the multiple language models associated with the social network based on a first criteria of a set of selection criteria and the second language model from among the multiple language models associated with the social network based on a second criteria of the set of selection criteria, the second criteria being different from the first criteria, wherein the set of selection criteria comprises a social relationship between the particular user of the social network and another user of the social network with whom the particular user is communicating, a topic of communication associated with the utterance of the particular user of the social network, a context within which the voice data associated with the utterance of the particular user of the social network is received, and a demographic characteristic of the particular user of the social network;
aggregating at least the first language model selected from among the multiple language models associated with the social network based on the first criteria of the set of selection criteria and the second language model selected from among the multiple language models associated with the social network based on a second criteria of the set of selection criteria to form an aggregated language model; and
generating a transcription of the utterance of the particular user using the aggregated language model.

2. The method of claim 1, wherein the first linguistic information includes text entered by one or more users of the first group.

3. The method of claim 1, further comprising: providing the first and second language models to a mobile device associated with the particular user.

4. The method of claim 1, further comprising: associating the aggregated language model with the particular user.

5. The method of claim 1, further comprising: distributing the aggregated language model to a client device associated with the particular user.

6. The method of claim 1, wherein the multiple language models comprise a general language model representative of a particular spoken language, a personal language model representative of speech of a certain user, a social graph language model representative of speech of users associated with a social graph, a context language model representative of speech within a particular context, and a topic language model representative of speech associated with a particular topic.

7. The method of claim 1, further comprising:
determining that another user of the social network is part of at least one of the first group and the second group of the multiple groups of user of the social network; and
in response to determining that the other user of the social network is part of at least one of the first group and the second group of the multiple groups of user of the social network, associating the aggregated language model with the other user of the social network to generate a transcription of an utterance of the other user using the aggregated language model.

8. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving voice data associated with an utterance of a particular user of a social network, the social network comprising multiple groups of users, the users segmented into the groups based on data indicating respective commonalities for the users in each group;
identifying multiple language models associated with the social network, each of the multiple language models respectively associated with one or more of the groups of users of the social network, wherein the multiple language models comprise a first language model that was generated based on first linguistic information associated with a first group of the multiple groups of users of the social network, and a second language model that was generated based on second linguistic information associated with a second group of the multiple groups of users of the social network, wherein each of the first group and the second group is part of a social graph for the particular user, the social graph for the particular user determined by a set predetermined number of degrees of separation from the particular user to other users in the social network, and wherein the particular user is associated with a particular cluster based, at least in part, on a probability of the user belonging to the cluster;
selecting at least the first language model from among the multiple language models associated with the social network based on a first criteria of a set of selection criteria and the second language model from among the multiple language models associated with the social network based on a second criteria of the set of selection criteria, the second criteria being different from the first criteria, wherein the set of selection criteria comprises a social relationship between the particular user of the social network and another user of the social network with whom the particular user is communicating, a topic of communication associated with the utterance of the particular user of the social network, a context within which the voice data associated with the utterance of the particular user of the social network is received, and a demographic characteristic of the particular user of the social network;
aggregating at least the first language model selected from among the multiple language models associated with the social network based on the first criteria of the set of selection criteria and the second language model selected from among the multiple language models associated with the social network based on a second criteria of the set of selection criteria to form an aggregated language model; and
generating a transcription of the utterance of the particular user using the aggregated language model.

9. The system of claim 8, wherein the first linguistic information includes text entered by one or more users of the first group.

10. The system of claim 8, the operations further comprising: providing the first and second language models to a mobile device associated with the particular user.

11. The system of claim 8, the operations further comprising: associating the aggregated language model with the particular user.
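The selecting, aggregating, and transcribing steps recited in claims 1 and 8 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the claims do not say how models are aggregated, so linear interpolation of unigram probabilities is assumed here, and transcription is reduced to choosing between two candidate word sequences. All names (`select_model`, `aggregate`, `score`), tags, weights, and probabilities are hypothetical.

```python
import math

def select_model(models, criterion, value):
    """Return the first model tagged with criterion == value, or None."""
    for model in models:
        if model["tags"].get(criterion) == value:
            return model
    return None

def aggregate(models, weights):
    """Assumed aggregation: linear interpolation, P(w) = sum_i w_i * P_i(w)."""
    total = sum(weights)
    vocab = set().union(*(m["probs"] for m in models))
    return {
        word: sum(wt / total * m["probs"].get(word, 0.0) for m, wt in zip(models, weights))
        for word in vocab
    }

def score(sentence, probs, floor=1e-6):
    """Log-probability of a word sequence; unseen words get a small floor value."""
    return sum(math.log(probs.get(word, floor)) for word in sentence.split())

# Hypothetical group language models, each tagged with the criterion it serves.
models = [
    {"tags": {"relationship": "coworker"}, "probs": {"merge": 0.5, "the": 0.3, "branch": 0.2}},
    {"tags": {"topic": "software"}, "probs": {"branch": 0.6, "the": 0.4}},
    {"tags": {"topic": "cooking"}, "probs": {"brunch": 0.7, "the": 0.3}},
]

# Select one model per criterion (two different criteria), then aggregate them.
first = select_model(models, "relationship", "coworker")
second = select_model(models, "topic", "software")
aggregated = aggregate([first, second], weights=[0.5, 0.5])

# Use the aggregated model to pick between acoustically similar candidates.
candidates = ["merge the branch", "merge the brunch"]
best = max(candidates, key=lambda s: score(s, aggregated))
```

With these sample values the aggregated model favors "merge the branch", because "brunch" appears in neither selected model. A real recognizer would combine such language-model scores with acoustic scores.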
using natural language modelling · CPC title
Physics · mapped topic
using context dependencies, e.g. language models · CPC title