Building language models for a user in a social network from linguistic information

US9747895B1 · US · B1

Patent metadata
Field               Value
Publication number  US-9747895-B1
Application number  US-201313936858-A
Country             US
Kind code           B1
Filing date         Jul 8, 2013
Priority date       Jul 10, 2012
Publication date    Aug 29, 2017
Grant date          Aug 29, 2017

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for building language models. One of the methods includes identifying a first group of one or more users associated with a user in a social network. The method includes identifying first linguistic information associated with the first group. The method includes generating a first language model based on the first linguistic information. The method includes identifying a second group of one or more users associated with the user. The method includes identifying second linguistic information associated with the second group. The method includes generating a second language model based on the second linguistic information. The method includes associating the first language model and the second language model with the user.
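The steps in the abstract can be illustrated with a minimal sketch. It assumes the linguistic information is plain text written by group members and that each language model is a simple unigram frequency table; the patent does not prescribe either choice. The names `build_language_model`, `build_models_for_user`, `groups`, and `texts_by_user`, as well as the sample data, are hypothetical.

```python
from collections import Counter


def build_language_model(texts):
    """Build a unigram model (word -> relative frequency) from a list of texts.

    A unigram table is an illustrative stand-in for whatever model type
    a real system would use.
    """
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


def build_models_for_user(user, groups, texts_by_user):
    """Generate one language model per group associated with `user`.

    `groups` maps a user to {group_name: [member, ...]} and
    `texts_by_user` maps a member to the texts that member entered.
    Both structures are assumptions made for this sketch.
    """
    models = {}
    for group_name, members in groups[user].items():
        # Identify the linguistic information associated with the group.
        texts = [t for member in members for t in texts_by_user.get(member, [])]
        # Generate a language model from that information.
        models[group_name] = build_language_model(texts)
    return models


# Hypothetical sample data: two groups associated with the user "alice".
groups = {"alice": {"coworkers": ["bob", "carol"], "family": ["dave"]}}
texts_by_user = {
    "bob": ["deploy the build"],
    "carol": ["review the build"],
    "dave": ["dinner on sunday"],
}

# Associate the first and second language models with the user.
user_models = {"alice": build_models_for_user("alice", groups, texts_by_user)}
```

Keeping one model per group, rather than merging all text into a single model, leaves the choice of which models to combine open until an utterance is actually received.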

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising:
    receiving voice data associated with an utterance of a particular user of a social network, the social network comprising multiple groups of users, the users segmented into the groups based on data indicating respective commonalities for the users in each group;
    identifying multiple language models associated with the social network, each of the multiple language models respectively associated with one or more of the groups of users of the social network, wherein the multiple language models comprise a first language model that was generated based on first linguistic information associated with a first group of the multiple groups of users of the social network, and a second language model that was generated based on second linguistic information associated with a second group of the multiple groups of users of the social network, wherein each of the first group and the second group is part of a social graph for the particular user, the social graph for the particular user determined by a set predetermined number of degrees of separation from the particular user to other users in the social network, and wherein the particular user is associated with a particular cluster based, at least in part, on a probability of the user belonging to the cluster;
    selecting at least the first language model from among the multiple language models associated with the social network based on a first criteria of a set of selection criteria and the second language model from among the multiple language models associated with the social network based on a second criteria of the set of selection criteria, the second criteria being different from the first criteria, wherein the set of selection criteria comprises a social relationship between the particular user of the social network and another user of the social network with whom the particular user is communicating, a topic of communication associated with the utterance of the particular user of the social network, a context within which the voice data associated with the utterance of the particular user of the social network is received, and a demographic characteristic of the particular user of the social network;
    aggregating at least the first language model selected from among the multiple language models associated with the social network based on the first criteria of the set of selection criteria and the second language model selected from among the multiple language models associated with the social network based on a second criteria of the set of selection criteria to form an aggregated language model; and
    generating a transcription of the utterance of the particular user using the aggregated language model.

2. The method of claim 1, wherein the first linguistic information includes text entered by one or more users of the first group.

3. The method of claim 1, further comprising: providing the first and second language models to a mobile device associated with the particular user.

4. The method of claim 1, further comprising: associating the aggregated language model with the particular user.

5. The method of claim 1, further comprising: distributing the aggregated language model to a client device associated with the particular user.

6. The method of claim 1, wherein the multiple language models comprise a general language model representative of a particular spoken language, a personal language model representative of speech of a certain user, a social graph language model representative of speech of users associated with a social graph, a context language model representative of speech within a particular context, and a topic language model representative of speech associated with a particular topic.

7. The method of claim 1, further comprising: determining that another user of the social network is part of at least one of the first group and the second group of the multiple groups of user of the social network; and in response to determining that the other user of the social network is part of at least one of the first group and the second group of the multiple groups of user of the social network, associating the aggregated language model with the other user of the social network to generate a transcription of an utterance of the other user using the aggregated language model.

8. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
    receiving voice data associated with an utterance of a particular user of a social network, the social network comprising multiple groups of users, the users segmented into the groups based on data indicating respective commonalities for the users in each group;
    identifying multiple language models associated with the social network, each of the multiple language models respectively associated with one or more of the groups of users of the social network, wherein the multiple language models comprise a first language model that was generated based on first linguistic information associated with a first group of the multiple groups of users of the social network, and a second language model that was generated based on second linguistic information associated with a second group of the multiple groups of users of the social network, wherein each of the first group and the second group is part of a social graph for the particular user, the social graph for the particular user determined by a set predetermined number of degrees of separation from the particular user to other users in the social network, and wherein the particular user is associated with a particular cluster based, at least in part, on a probability of the user belonging to the cluster;
    selecting at least the first language model from among the multiple language models associated with the social network based on a first criteria of a set of selection criteria and the second language model from among the multiple language models associated with the social network based on a second criteria of the set of selection criteria, the second criteria being different from the first criteria, wherein the set of selection criteria comprises a social relationship between the particular user of the social network and another user of the social network with whom the particular user is communicating, a topic of communication associated with the utterance of the particular user of the social network, a context within which the voice data associated with the utterance of the particular user of the social network is received, and a demographic characteristic of the particular user of the social network;
    aggregating at least the first language model selected from among the multiple language models associated with the social network based on the first criteria of the set of selection criteria and the second language model selected from among the multiple language models associated with the social network based on a second criteria of the set of selection criteria to form an aggregated language model; and
    generating a transcription of the utterance of the particular user using the aggregated language model.

9. The system of claim 8, wherein the first linguistic information includes text entered by one or more users of the first group.

10. The system of claim 8, the operations further comprising: providing the first and second language models to a mobile device associated with the particular user.

11. The system of claim 8, the operations further comprising: associating the aggregated language model with the particular user.
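The select, aggregate, and transcribe steps recited in claims 1 and 8 can be sketched as follows. This is only an illustration under stated assumptions: the claims do not say how models are aggregated, so linear interpolation of unigram probabilities is swapped in here, and the transcription step is reduced to picking the best-scoring candidate from a list that a speech recognizer is assumed to have produced. All names (`select_models`, `aggregate`, `score`, `transcribe`, `models_by_key`) and all probabilities are hypothetical.

```python
import math


def select_models(models_by_key, criteria):
    """Select one model per selection criterion.

    `models_by_key` maps (criterion, value) pairs such as
    ("topic", "sailing") to unigram models; this keying is an assumption.
    """
    return [
        models_by_key[(name, value)]
        for name, value in criteria.items()
        if (name, value) in models_by_key
    ]


def aggregate(models, weights):
    """Linearly interpolate unigram models: P(w) = sum_i weight_i * P_i(w)."""
    total = sum(weights)
    vocab = set().union(*(m.keys() for m in models))
    return {
        w: sum(wt / total * m.get(w, 0.0) for m, wt in zip(models, weights))
        for w in vocab
    }


def score(sentence, model, floor=1e-6):
    """Log-probability of a sentence; unseen words get a small floor value."""
    return sum(math.log(model.get(w, floor)) for w in sentence.lower().split())


def transcribe(candidates, model):
    """Pick the candidate transcription the aggregated model finds most likely."""
    return max(candidates, key=lambda c: score(c, model))


# Hypothetical models, one chosen by social relationship and one by topic.
models_by_key = {
    ("relationship", "coworker"): {"build": 0.5, "the": 0.3, "ship": 0.2},
    ("topic", "sailing"): {"ship": 0.6, "the": 0.3, "bill": 0.1},
}
criteria = {"relationship": "coworker", "topic": "sailing"}

selected = select_models(models_by_key, criteria)
aggregated = aggregate(selected, weights=[1.0, 1.0])

# Two acoustically similar candidates, assumed to come from a recognizer.
best = transcribe(["chip the bill", "ship the build"], aggregated)
```

With equal weights, each selected model contributes half of the aggregated probability mass, so vocabulary common to both the relationship model and the topic model is favored when candidates are compared.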

Assignees

Google Inc

Inventors

Classifications

  • G10L15/18 (Primary)

    using natural language modelling · CPC title

  • Physics · mapped topic

  • G10L15/183 (Primary)

    using context dependencies, e.g. language models · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9747895B1 cover?
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for building language models. One of the methods includes identifying a first group of one or more users associated with a user in a social network. The method includes identifying first linguistic information associated with the first group. The method includes generating a first language model base…
Who is the assignee on this patent?
Google Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/18. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 29, 2017 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).