Context-based utterance recognition

US9633653B1 · US · B1

Patent metadata
Field	Value
Publication number	US-9633653-B1
Application number	US-201514675104-A
Country	US
Kind code	B1
Filing date	Mar 31, 2015
Priority date	Dec 27, 2011
Publication date	Apr 25, 2017
Grant date	Apr 25, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In some implementations, a digital work provider may provide language model information related to a plurality of different contexts, such as a plurality of different digital works. For example, the language model information may include language model difference information identifying a plurality of sequences of one or more words in a digital work that have probabilities of occurrence that differ from probabilities of occurrence in a base language model by a threshold amount. The language model difference information corresponding to a particular context may be used in conjunction with the base language model to recognize an utterance made by a user of a user device. In some examples, the recognition is performed on the user device. In other examples, the utterance and associated context information are sent over a network to a recognition computing device that performs the recognition.
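
The thresholding idea in the abstract can be illustrated with a small sketch. This is not the patent's implementation: the function names, the use of an absolute difference as the distance measure, the treatment of n-grams missing from the base model as probability 0.0, and all sample numbers are assumptions made for illustration.

```python
from collections import Counter


def ngram_probabilities(tokens, n):
    """Relative frequency of each n-gram (stored as a tuple) in a token list."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    return {gram: count / len(grams) for gram, count in counts.items()}


def language_model_difference(work_probs, base_probs, threshold):
    """Keep n-grams from the work whose probability differs from the base
    model by more than `threshold`.

    Assumptions: the distance is an absolute difference, and an n-gram
    absent from the base model has base probability 0.0.
    """
    diff = {}
    for gram, p_work in work_probs.items():
        p_base = base_probs.get(gram, 0.0)
        if abs(p_work - p_base) > threshold:
            diff[gram] = p_work
    return diff


# Hypothetical base model and a tiny "digital work".
base_probs = {("the",): 0.5, ("wizard",): 0.001}
work_tokens = "the wizard saw the wizard".split()

work_probs = ngram_probabilities(work_tokens, n=1)
diff = language_model_difference(work_probs, base_probs, threshold=0.15)
print(sorted(diff))
```

In this toy example "the" stays out of the difference information because its frequency in the work is close to the base model, while "wizard" and "saw" are kept because they are far more frequent in the work than the base model predicts.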

First claim

Opening claim text (preview).

The invention claimed is:
1. A device comprising: a processor; and one or more computer-readable media to store processor-executable instructions that, when executed, program the one or more processors to: identify a plurality of n-grams associated with at least a portion of a digital work, an n-gram of the plurality of n-grams comprising a sequence of at least one or more words, one or more phonemes, one or more syllables, one or more letters, or one or more base pairs; associate a first probability of occurrence with the n-gram based at least in part on a frequency of occurrence in at least the portion of the digital work; determine language model difference information based at least in part on the first probability of occurrence associated with the n-gram differing from a second probability of occurrence of the n-gram in a base language model by more than a threshold amount; and determine a word based at least in part on a captured utterance, the base language model, and the language model difference information.
2. The device as recited in claim 1, wherein the processor-executable instructions further program the one or more processors to generate the base language model based at least in part on at least one of a webpage, an electronic book, a news feed, a social network site, a microblog, or a closed captioning feed.
3. The device as recited in claim 1, wherein the processor-executable instructions further program the one or more processors to generate the base language model from a plurality of digital works.
4. The device as recited in claim 1, further comprising a communication interface, and wherein the processor-executable instructions further program the one or more processors to send, via the communication interface, the language model difference information and the base language model to a speech recognizing computing device that provides an utterance recognition service.
5. The device as recited in claim 1, further comprising a communication interface, and wherein the processor-executable instructions further program the one or more processors to send, via the communication interface, the language model difference information to a user device in association with providing the digital work to the user device.
6. The device as recited in claim 1, further comprising a communication interface, and wherein the processor-executable instructions further program the one or more processors to: receive, via the communication interface, the captured utterance from a user device; and send, via the communication interface, information associated with the word to the user device.
7. The device as recited in claim 6, wherein the processor-executable instructions further program the one or more processors to receive, via the communication interface, context information associated with the utterance, the context information identifying the language model difference information.
8. The device as recited in claim 1, further comprising a communication interface, and wherein the processor-executable instructions further program the one or more processors to: receive, via the communication interface, user information corresponding to the digital work from a plurality of user devices; and wherein the first probability of occurrence associated with the n-gram is weighted based at least in part on the user information.
9. The device as recited in claim 8, wherein the user information includes at least one of information corresponding to a user highlight of the digital work or information corresponding to a user annotation to the digital work.
10. A method executable by one or more computing processors to perform operations comprising: identify a plurality of n-grams included in user information associated with at least a portion of a digital work, an n-gram of the plurality of n-grams comprising a sequence of one or more words; associating a first probability of occurrence with the n-gram based at least in part on a frequency of occurrence of the n-gram in at least the user information; determining language model difference information based at least in part on the first probability of occurrence associated with the n-gram differs from a second probability of occurrence of the n-gram in a base language model by more than a threshold amount; and determining a word based at least in part on a captured utterance, the base language model, and the language model difference information.
11. The method as recited in claim 10, wherein the base language model includes a probability-weighted distribution of n-gram sequences for a language associated with the digital work.
12. The method as recited in claim 10, wherein determining that the first probability of occurrence associated with the n-gram differs from the second probability of occurrence of the n-gram in the base language model by more than the threshold amount comprises determining that the first probability of occurrence associated with the n-gram differs from the second probability of occurrence of the n-gram in the base language model by more than a predetermined distance between the first probability of occurrence and the second probability of occurrence.
13. The method as recited in claim 10, further comprising generating the base language model based at least in part on a plurality of digital works.
14. The method as recited in claim 10, further comprising generating the base language model based at least in part on at least one of a webpage, an electronic book, a news feed, a social network site, a microblog, or a closed captioning feed.
15. The method as recited in claim 10, further comprising determining language model difference information for the digital work based at least in part on the second probability of occurrence associated with the n-gram differing from the first probability of occurrence of the n-gram by more than the threshold amount.
16. The method as recited in claim 10, wherein identifying the plurality of n-grams included in the user information associated with the at least the portion of the digital work comprises identifying the plurality of n-grams included in at least one of user highlights, user annotations, or user-created content.
17. One or more non-transitory computer-readable media maintaining instructions executable by one or more processors to perform operations comprising: determining an n-gram comprising a sequence of one or more words based at least in part on parsing a plurality of digital works, wherein the plurality of digital works are associated with a particular subject matter category; associating a first probability of occurrence with the n-gram based at least in part on a frequency of occurrence of the n-gram in the plurality of digital works; and determining language model difference information based at least in part on the first probability of occurrence associated with the n-gram differs from a second probability of occurrence of the n-gram in a base language model by more than a threshold amount; and determining a word based at least in part on a captured utterance, the base language model and the language model difference information.
18. The one or more non-transitory computer-readable media as recited in claim 17, wherein the base language model includes a probability-weighted distribution of n-gram sequences for a language associated with the plurality of digital works.
19. The one or more non-transitory computer-readable media as recited in claim 17, the operations further comprising: sending the language mod…
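
The last step of the independent claims, determining a word from a captured utterance, the base language model, and the language model difference information, might look roughly like the sketch below. The claims do not say how the two models are combined, so everything here is an assumption: the difference entry simply overrides the base probability, the acoustic scores are made-up likelihoods for competing hypotheses, and `lm_probability` and `pick_word` are hypothetical helper names.

```python
import math


def lm_probability(word, base_probs, diff_probs, floor=1e-6):
    """Unigram lookup. A context-specific difference entry, if present,
    overrides the base model (an assumed combination rule)."""
    if word in diff_probs:
        return diff_probs[word]
    return base_probs.get(word, floor)


def pick_word(acoustic_scores, base_probs, diff_probs):
    """Return the candidate with the highest combined log score.

    acoustic_scores maps each candidate word to a hypothetical acoustic
    likelihood in (0, 1] produced by some recognizer front end.
    """
    def score(word):
        return (math.log(acoustic_scores[word])
                + math.log(lm_probability(word, base_probs, diff_probs)))

    return max(acoustic_scores, key=score)


# Two acoustically similar candidates for one captured utterance.
acoustic_scores = {"wizard": 0.45, "lizard": 0.55}
base_probs = {"wizard": 0.001, "lizard": 0.002}
diff_probs = {"wizard": 0.4}  # difference information for a fantasy novel

without_context = pick_word(acoustic_scores, base_probs, {})
with_context = pick_word(acoustic_scores, base_probs, diff_probs)
print(without_context, with_context)
```

With the base model alone, the slightly better acoustic match wins. Once the difference information for the current digital work is applied, the word that is unusually common in that work is preferred.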

Assignees

Amazon Tech Inc

Inventors

Porter Brandon William

Classifications

G10L15/22 (Primary): Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L15/1815 (Primary): Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
G10L15/197: Probabilistic grammars, e.g. word n-grams
G10L2015/228: of application context
G10L15/063: Training

Patent family

Related publications grouped by family.

View patent family 52782340

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9633653B1 cover?: In some implementations, a digital work provider may provide language model information related to a plurality of different contexts, such as a plurality of different digital works. For example, the language model information may include language model difference information identifying a plurality of sequences of one or more words in a digital work that have probabilities of occurrence that di…
Who is the assignee on this patent?: Amazon Tech Inc
What technology area does this patent fall under?: Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?: Publication date Apr 25, 2017 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).