Learning language models from scratch based on crowd-sourced user text input
US-2015309984-A1 · Oct 29, 2015 · US
US10381000B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10381000-B1 |
| Application number | US-201815864689-A |
| Country | US |
| Kind code | B1 |
| Filing date | Jan 8, 2018 |
| Priority date | Feb 29, 2016 |
| Publication date | Aug 13, 2019 |
| Grant date | Aug 13, 2019 |
Official abstract text for this publication.
Compact finite state transducers (FSTs) for automatic speech recognition (ASR). An HCLG FST and/or G FST may be compacted at training time to reduce the size of the FST to be used at runtime. The compact FSTs may be significantly smaller (e.g., 50% smaller) in terms of memory size, thus reducing the use of computing resources at runtime to operate the FSTs. The individual arcs and states of each FST may be compacted by binning individual weights, thus reducing the number of bits needed for each weight. Further, certain fields such as a next state ID may be left out of a compact FST if an estimation technique can be used to reproduce the next state at runtime. During runtime portions of the FSTs may be decompressed for processing by an ASR engine.
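The weight binning mentioned in the abstract can be illustrated with a minimal sketch. It assumes uniform bins between the smallest and largest weight; the abstract does not say how bins are chosen, and the function names `build_bins`, `quantize`, and `dequantize` are hypothetical, not taken from the patent.

```python
def build_bins(weights, num_bins):
    """Return the centers of num_bins uniform bins spanning the weight range.

    Assumption: uniform bin widths. A real system might choose bins differently.
    """
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return [lo]
    width = (hi - lo) / num_bins
    return [lo + (i + 0.5) * width for i in range(num_bins)]


def quantize(weight, centers):
    """Map a weight to the index of its nearest bin center."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - weight))


def dequantize(index, centers):
    """Recover an approximate weight from a stored bin index."""
    return centers[index]


# Toy arc weights (illustrative values only).
weights = [0.12, 0.48, 1.95, 3.20, 3.99, 0.05]

# 16 bins means each stored index fits in 4 bits instead of a 32-bit float.
centers = build_bins(weights, num_bins=16)
codes = [quantize(w, centers) for w in weights]
restored = [dequantize(c, centers) for c in codes]
```

Under this uniform scheme, each restored weight differs from the original by at most half a bin width, so the number of bins trades memory size against weight precision.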
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: receiving compressed language model data; detecting audio using a microphone, the audio corresponding to an utterance; determining audio data corresponding to the audio; processing at least a portion of the compressed language model data to determine uncompressed language model data; performing speech recognition using the audio data and the uncompressed language model data to determine text data; deleting the uncompressed language model data from the memory but maintaining a copy of the compressed language model data; and causing a command to be executed using at least the text data.
2. The computer-implemented method of claim 1, wherein the compressed language model data comprises a portion of a compressed language model.
3. The computer-implemented method of claim 1, wherein the compressed language model data comprises compressed data corresponding to a finite state transducer (FST).
4. The computer-implemented method of claim 3, wherein the FST is configured to be traversed using input words and to output words.
5. The computer-implemented method of claim 1, further comprising: detecting second audio corresponding to a second utterance; determining second audio data corresponding to the second audio; and sending the second audio data to at least one remote device for speech processing.
6. The computer-implemented method of claim 1, wherein processing the at least a portion of the compressed language model data to determine uncompressed language model data occurs prior to detecting the audio using the microphone.
7. The computer-implemented method of claim 1, further comprising: receiving an indication from a second device, wherein processing the at least a portion of the compressed language model data to determine uncompressed language model data occurs in response to receiving the indication.
8. The computer-implemented method of claim 7, wherein the indication corresponds to at least one of: a vehicle starting, a button being pressed, an alarm about to sound, or a delivery person approaching a location.
9. The computer-implemented method of claim 1, wherein the compressed language model data corresponds to a user profile associated with a device that includes the microphone.
10. The computer-implemented method of claim 1, further comprising, before processing the at least a portion of the compressed language model data to determine uncompressed language model data: determining that the utterance included a wakeword.
11. A device, comprising: at least one processor; at least one microphone; and memory including instructions operable to be executed by the at least one processor to configure the device to: receive compressed language model data; detect audio using the at least one microphone, the audio corresponding to an utterance; determine audio data corresponding to the audio; process at least a portion of the compressed language model data to determine uncompressed language model data; perform speech recognition using the audio data and the uncompressed language model data to determine text data; delete the uncompressed language model data from the memory but maintain a copy of the compressed language model data; and cause a command to be executed using at least the text data.
12. The device of claim 11, wherein the compressed language model data comprises a portion of a compressed language model.
13. The device of claim 11, wherein the compressed language model data comprises compressed data corresponding to a finite state transducer (FST).
14. The device of claim 13, wherein the FST is configured to be traversed using input words and to output words.
15. The device of claim 11, wherein the memory further includes instructions that, when executed by the at least one processor further configure the device to: detect second audio corresponding to a second utterance; determine second audio data corresponding to the second audio; and send the second audio data to at least one remote device for speech processing.
16. The device of claim 11, wherein the memory further includes instructions that, when executed by the at least one processor further configure the device to, before processing the at least a portion of the compressed language model data to determine uncompressed language model data: determine that the utterance included a wakeword.
17. The device of claim 11, wherein the instructions to process the at least a portion of the compressed language model data to determine uncompressed language model data are executed prior to the instructions to detect the audio using the microphone.
18. The device of claim 11, wherein the memory further includes instructions that, when executed by the at least one processor further configure the device to: receive an indication from a second device, wherein the instructions to process the at least a portion of the compressed language model data to determine uncompressed language model data are executed in response to receiving the indication.
19. The device of claim 11, wherein the compressed language model data corresponds to a user profile associated with a device that includes the microphone.
20. The device of claim 11, wherein the memory further includes instructions that, before processing the at least a portion of the compressed language model data to determine uncompressed language model data: determine that the utterance included a wakeword.
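The sequence of steps recited in claim 1 (decompress, recognize, delete the uncompressed copy, keep the compressed copy) can be sketched roughly as follows. This is an illustration under loud assumptions: `zlib` and JSON stand in for the FST-specific compaction the abstract describes, the dictionary lookup in `recognize` stands in for a real ASR engine, and the names `OnDeviceModelStore`, `load`, `release`, and `recognize` are hypothetical.

```python
import json
import zlib


class OnDeviceModelStore:
    """Holds a compressed model permanently and an uncompressed copy only while in use."""

    def __init__(self, compressed_blob):
        self.compressed = compressed_blob   # kept for the lifetime of the store
        self.uncompressed = None            # populated only around recognition

    def load(self):
        # Stand-in decompression; the patent describes FST-specific compaction, not zlib.
        self.uncompressed = json.loads(zlib.decompress(self.compressed))

    def release(self):
        # Drop the uncompressed copy; the compressed blob is untouched.
        self.uncompressed = None


def recognize(audio_tokens, model):
    """Toy stand-in for speech recognition: map each token to a word."""
    return " ".join(model.get(token, "<unk>") for token in audio_tokens)


# Hypothetical toy model and audio tokens, for illustration only.
toy_model = {"a1": "play", "a2": "music"}
store = OnDeviceModelStore(zlib.compress(json.dumps(toy_model).encode("utf-8")))

store.load()                                         # determine uncompressed model data
text = recognize(["a1", "a2"], store.uncompressed)   # determine text data
store.release()                                      # delete uncompressed, keep compressed
```

Keeping only the compressed blob resident between utterances is what lets the memory footprint stay small; the cost is repeating the decompression step before each recognition pass.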
Training · CPC title
Parsing for meaning understanding · CPC title
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
Formal grammars, e.g. finite state automata, context free grammars or word networks · CPC title
updating or merging of old and new templates; Mean values; Weighting · CPC title