US9047868B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9047868-B1 |
| Application number | US-201213563648-A |
| Country | US |
| Kind code | B1 |
| Filing date | Jul 31, 2012 |
| Priority date | Jul 31, 2012 |
| Publication date | Jun 2, 2015 |
| Grant date | Jun 2, 2015 |
Official abstract text for this publication.
A specific language model for speech recognition may be built. In some embodiments, the specific language model is associated with a user and built using a corpus of text obtained from a user computing device. In some embodiments, a sequence of words is constructed from the corpus of text. The sequence of words may be obfuscated, and the obfuscated sequence of words may be stored in the specific language model. A server or a user device may use the specific language model in conjunction with a general language model to perform speech recognition on an utterance made by the user.
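The device-side steps described in the abstract (construct word sequences, obfuscate them, store the result) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: SHA-256, whitespace tokenization, lowercasing, and the names `ngrams`, `hash_ngram`, and `update_counts` are all hypothetical choices that do not come from the patent text. The `bits` parameter stands in for the idea, stated in the claims, that the number of hash bits may depend on a privacy setting.

```python
import hashlib
from collections import Counter


def ngrams(tokens, n=3):
    """Return every run of n consecutive tokens as a tuple (trigrams by default)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hash_ngram(ngram, bits=16):
    """Map an n-gram to an integer of at most `bits` bits.

    Assumption: SHA-256 truncated to its top `bits` bits. The patent text
    shown here does not name a hash function. Fewer bits means more
    collisions, so a stored value reveals less about the original words.
    """
    digest = hashlib.sha256(" ".join(ngram).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def update_counts(store, text, n=3, bits=16):
    """Add the n-gram counts found in `text` to the running counts in `store`.

    `store` maps hash value -> accumulated count; only hashes are kept,
    never the n-grams themselves.
    """
    first_counts = Counter(ngrams(text.lower().split(), n))
    for ngram, first_count in first_counts.items():
        key = hash_ngram(ngram, bits)
        # The count from this text is added to the count already stored
        # under the same hash value.
        store[key] = store.get(key, 0) + first_count
    return store
```

Calling `update_counts` repeatedly on new text keeps accumulating counts under the same hash keys, and the resulting mapping of hash values to counts is what would be sent to a server in this sketch.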
Opening claim text (preview).
The invention claimed is:
1. A system comprising: an electronic data store configured to store data related to a user; and a text processing device in communication with the electronic data store, the text processing device configured to: obtain text data from a user computing device; construct an n-gram based on the text data; generate, for the n-gram, a hash value; determine, for the n-gram, a first count indicating a number of times the n-gram appears within the text data; add the first count to a second count, wherein the second count is associated with the hash value; and store the hash value and the second count in the electronic data store; and transmit the hash value and the second count to a server device over a network, wherein the server device is configured to: receive the hash value and the second count; determine a specific n-gram probability using the hash value and the second count; obtain a general n-gram probability from a general language model; determine a combined n-gram probability using the specific n-gram probability and the general n-gram probability; and perform speech recognition using the combined n-gram probability.
2. The system of claim 1, wherein the text processing device is configured to obtain text data from the user computing device by monitoring network traffic generated by the user computing device.
3. The system of claim 1, wherein the text processing device is configured to obtain text data from data received directly from the user computing device.
4. The system of claim 1, wherein a number of bits of the hash value is based on a privacy setting.
5. The system of claim 1, wherein the n-gram comprises a trigram.
6. The system of claim 1, wherein the text processing device is further configured to store metadata associated with the n-gram in the electronic data store, wherein metadata associated with the n-gram is one of a source of the n-gram, information about an entity associated with the n-gram, or a date and time when the text data was obtained.
7. A non-transitory computer-readable medium comprising a module configured to execute in one or more processors of a computing device, the module being further configured to: receive text data, wherein the text data originated from a user computing device; construct an n-gram based on the text data; generate, for the n-gram, a hash value, wherein a number of bits of the hash value is based on a privacy setting; determine language model information associated with the hash value; store the hash value and the language model information in an electronic data store; and transmit the hash value and the language model information to a server device.
8. The non-transitory computer-readable medium of claim 7, wherein the language model information is a count.
9. The non-transitory computer-readable medium of claim 7, wherein the language model information is a tri-gram probability.
10. The non-transitory computer-readable medium of claim 7, wherein the server device is configured to store the hash value and the language model information in a language model.
11. The non-transitory computer-readable medium of claim 7, wherein the module is further configured to monitor network traffic generated by the user computing device, and wherein the text data comprises network traffic generated by the user computing device.
12. The non-transitory computer-readable medium of claim 7, wherein the network interface module is further configured to receive the text data directly from the user computing device.
13. The non-transitory computer readable medium of claim 7, wherein the computing device comprises the user computing device.
14. The non-transitory computer-readable medium of claim 7, wherein the language model building module is further configured to store metadata associated with the n-gram in the electronic data store, wherein the metadata associated with the n-gram comprises a source from which the n-gram is generated, an entity associated with the n-gram, or a date and time when the text data was generated.
15. The non-transitory computer readable medium of claim 7, wherein the text data comprises at least one of an electronic message, a document, a music library, or a contact list.
16. A computer-implemented method comprising: as implemented by a server device configured with specific computer-executable instructions, obtaining a first n-gram probability for an n-gram from a first language model; determining, by the server device executing a language model building module, a hash value for the n-gram; obtaining a second n-gram probability from a second language model using the hash value; determining a third n-gram probability using the first n-gram probability and the second n-gram probability; and performing speech recognition using the third n-gram probability.
17. The computer-implemented method of claim 16, wherein determining the third n-gram probability comprises using language model interpolation.
18. The computer-implemented method of claim 16, wherein obtaining the first n-gram probability comprises determining a second hash value for the n-gram and obtaining the first n-gram probability from a language model using the second hash value.
19. The computer-implemented method of claim 18, wherein the second hash value has a larger number of bits than the hash value.
20. The computer-implemented method of claim 16, wherein the first language model is a general purpose language model and the second language model is a specific language model adapted to a user.
21. The computer-implemented method of claim 16, wherein determining a third n-gram probability using the first n-gram probability and the second n-gram probability comprises performing a batch operation prior to performing speech recognition.
22. A system comprising: a server device configured to: obtain a first n-gram probability for an n-gram from a first language model; determine a hash value for the n-gram; obtain a second n-gram probability from a second language model using the hash value; determine a third n-gram probability using the first n-gram probability and the second n-gram probability; and perform speech recognition using the third n-gram probability.
23. The system of claim 22, wherein the server device is configured to determine the third n-gram probability by using language model interpolation.
24. The system of claim 22, wherein the server device is configured to obtain the first n-gram probability by determining a second hash value for the n-gram and obtaining the first n-gram probability from a language model using the second hash value.
25. The system of claim 24, wherein the second hash value has a larger number of bits than the hash value.
26. The system of claim 22, wherein the first language model is a general purpose language model and the second language model is a specific language model adapted to a user.
27. The system of claim 22, wherein the server device is configured to determine a third n-gram probability using the first n-gram probability and the second n-gram probability by performing a batch operation prior to performing speech recognition.
28. A system comprising: an electronic data store configured to store data related to a user; and a text processing device in communication with the electronic data store, t
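The server-side steps recited in claims 16 to 27 (look up a probability by hash value in a user-specific model, then combine it with a general-model probability) might look roughly like the sketch below. It is an illustration under stated assumptions only: the claims mention "language model interpolation" without giving a formula, so the linear interpolation, the default `weight`, the relative-frequency estimate, the SHA-256 hash, and all function names here are hypothetical.

```python
import hashlib


def hash_ngram(ngram, bits=16):
    """Hash an n-gram to at most `bits` bits (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(" ".join(ngram).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def specific_probability(counts, ngram, bits=16):
    """Estimate a user-specific probability from hashed counts.

    Assumption: plain relative frequency over all stored counts. A real
    model would condition on the word history and apply smoothing.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts.get(hash_ngram(ngram, bits), 0) / total


def combined_probability(p_general, p_specific, weight=0.3):
    """Linearly interpolate the general and specific probabilities.

    `weight` is the share given to the specific model; 0.3 is arbitrary.
    """
    return (1.0 - weight) * p_general + weight * p_specific


# Usage: the server never sees the words behind the stored hash, only the
# hash of the candidate n-gram it is currently scoring.
trigram = ("call", "john", "smith")
user_counts = {hash_ngram(trigram): 3}
p_specific = specific_probability(user_counts, trigram)
p_combined = combined_probability(p_general=0.1, p_specific=p_specific)
```

Because the lookup key is a truncated hash, unrelated n-grams can collide and share a count; using fewer bits trades recognition accuracy for privacy, which is consistent with the claims that tie the number of hash bits to a privacy setting.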
Probabilistic grammars, e.g. word n-grams · CPC title
Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title