Hierarchical Context Specific Actions from Ambient Speech
US-2023153061-A1 · May 18, 2023 · US
US12300217B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12300217-B2 |
| Application number | US-202117342505-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 8, 2021 |
| Priority date | Jun 8, 2021 |
| Publication date | May 13, 2025 |
| Grant date | May 13, 2025 |
Official abstract text for this publication.
Systems and methods for speech recognition correction include receiving a voice recognition input from an individual user and using a trained error correction model to add a new alternative result to a results list based on the received voice input processed by a voice recognition system. The error correction model is trained using contextual information corresponding to the individual user. The contextual information comprises a plurality of historical user correction logs, a plurality of personal class definitions, and an application context. A re-ranker re-ranks the results list with the new alternative result and a top result from the re-ranked results list is output.
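The two-pass flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the trained error correction model is replaced here by simple string similarity from Python's `difflib`, and the names `Hypothesis`, `add_contact_alternatives`, `rerank`, `threshold`, and `boost` are hypothetical, chosen only for this example. The sketch shows a first-pass results list being extended with a contact-name alternative and then re-ranked so that the alternative gains priority when the name also appears in the business contacts list.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Hypothesis:
    text: str
    score: float                   # first-pass score; higher is better
    contact: Optional[str] = None  # contact name inserted in the second pass, if any


def add_contact_alternatives(results, personal_contacts, threshold=0.7):
    """Second pass (assumed stand-in for a trained model): add alternatives in
    which a recognized word is replaced by a similar personal contact name."""
    seen = {h.text for h in results}
    expanded = list(results)
    for hyp in results:
        words = hyp.text.split()
        for i, word in enumerate(words):
            for name in personal_contacts:
                if word == name:
                    continue  # already recognized exactly; nothing to correct
                similarity = SequenceMatcher(None, word.lower(), name.lower()).ratio()
                if similarity < threshold:
                    continue
                candidate = " ".join(words[:i] + [name] + words[i + 1:])
                if candidate not in seen:
                    seen.add(candidate)
                    # Discount the new alternative by how far it is from the original word.
                    expanded.append(Hypothesis(candidate, hyp.score * similarity, name))
    return expanded


def rerank(results, business_contacts, boost=0.5):
    """Re-rank the list; an added alternative gets a higher priority when its
    contact name is also in the business contacts list."""
    def priority(h):
        bonus = boost if h.contact in business_contacts else 0.0
        return h.score + bonus
    return sorted(results, key=priority, reverse=True)


# Usage: the first pass misrecognizes the contact name "John".
first_pass = [Hypothesis("call jon", 0.80), Hypothesis("call joan", 0.75)]
expanded = add_contact_alternatives(first_pass, personal_contacts=["John"])
top = rerank(expanded, business_contacts={"John"})[0]
print(top.text)  # call John
```

With an empty business contacts set the added alternative receives no boost, and the original first-pass top result stays in first place. A real system would derive the candidate alternatives and the re-ranking features from the user's correction logs and application context rather than from edit similarity alone.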
Opening claim text (preview).
What is claimed is:
1. A computerized method for speech recognition correction, the computerized method comprising: receiving a voice recognition input from an individual user, wherein the voice recognition input comprises speech processed by a voice recognition system to generate a results list during a first-pass speech recognition process; using an error correction model to add a new alternative result to the results list during a second-pass error correction process based on the voice recognition input processed by the voice recognition system, the error correction model trained using contextual information corresponding to the individual user, the contextual information comprising a plurality of historical user correction logs, a plurality of personal class definitions including a personal contacts list of the individual user and a business contacts list of the individual user, and an application context, wherein the business contacts list of the individual user is derived from a business application used by the individual user, and wherein using the error correction model to add the new alternative result includes adding a contact name from the personal contacts list of the individual user as the new alternative result; using a re-ranker to re-rank the results list with the new alternative result, wherein the new alternative result is assigned a higher priority when the contact name is included in the business contacts list of the individual user; and outputting a top result from the re-ranked results list.
2. The computerized method of claim 1, wherein the voice recognition input processed by the voice recognition system generates a first-pass recognition lattice, and the results list is derived from the first-pass recognition lattice and ordered by acoustic and language model scores, and the computerized method further comprising introducing at least one new path in the first-pass recognition lattice to generate an updated recognition lattice.
3. The computerized method of claim 2, wherein the results list comprises an N-best results list, and the computerized method further comprising generating an updated N-best results list based on the updated recognition lattice and a plurality of personalized error corrections.
4. The computerized method of claim 3, wherein the re-ranker uses a re-ranking algorithm that processes the results list from a first-pass and at least one second-pass feature related to the plurality of historical user correction logs and the plurality of personal class definitions to re-order the N-best results list in a second-pass scoring process.
5. The computerized method of claim 1, further comprising outputting a plurality of top results and displaying the plurality of top results to the individual user.
6. The computerized method of claim 5, wherein the voice recognition input comprises a user query and the plurality of top results comprises a plurality of top candidate search queries, and the computerized method further comprising using the plurality of top candidate search queries in a subsequent processing stage to perform an online search.
7. The computerized method of claim 1, wherein the plurality of personal class definitions comprises at least personal contact information comprising personal entities of the individual user, custom folders in an email program, slide deck names, media and metadata of the individual user, and a user generated schema related to folder collections or media collections.
8. The computerized method of claim 7, further comprising interfacing with at least one of a business application or a personal application to change at least one of a weight or a priority of an entity within the personal contact information.
9. The computerized method of claim 1, wherein the error correction model comprises a dynamic speech detection correction model corresponding to only the individual user.
10. The computerized method of claim 1, wherein using the historical user correction logs to train the error correction model personalizes the error correction model to the individual user.
11. The computerized method of claim 1, wherein the error correction model is trained based on a combination of contextual information corresponding to the plurality of historical user correction logs, the plurality of personal class definitions, and the application context.
12. A system for speech recognition correction, the system comprising: at least one processor; and at least one memory comprising computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the at least one processor to: receive a voice recognition input from an individual user, wherein the voice recognition input comprises speech processed by a voice recognition system to generate a results list during a first-pass speech recognition process; use an error correction model to add a new alternative result to the results list during a second-pass error correction process based on the voice recognition input processed by the voice recognition system, the error correction model trained using contextual information corresponding to the individual user, the contextual information comprising a plurality of historical user correction logs, a plurality of personal class definitions including a personal contacts list of the individual user and a business contacts list of the individual user, and an application context, wherein the business contacts list of the individual user is derived from a business application used by the individual user, and wherein using the error correction model to add the new alternative result includes adding a contact name from the personal contacts list of the individual user as the new alternative result; use a re-ranker to re-rank the results list with the new alternative result, wherein the new alternative result is assigned a higher priority when the contact name is included in the business contacts list of the individual user; and output a top result from the re-ranked results list.
13. The system of claim 12, wherein the voice recognition input processed by the voice recognition system generates a first-pass recognition lattice, and the results list is derived from the first-pass recognition lattice and ordered by acoustic and language model scores, and further comprising introducing at least one new path in the first-pass recognition lattice to generate an updated recognition lattice.
14. The system of claim 13, wherein the results list comprises an N-best results list, and further comprising generating an updated N-best results list based on the updated recognition lattice and a plurality of personalized error corrections.
15. The system of claim 14, wherein the re-ranker uses a re-ranking algorithm that processes the results list from a first-pass and at least one second-pass feature related to the plurality of historical user correction logs and the plurality of personal class definitions to re-order the N-best results list in a second-pass scoring process.
16. The system of claim 12, wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the at least one processor to output a plurality of top results and displaying the plurality of top results to the individual user.
17. The system of claim 16, wherein the voice recognition input comprises a user query and the plurality of top results comprises a plurality of top candidate search queries, and the at least one memory and the computer program code configured to, wi
Supervised learning · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
using context dependencies, e.g. language models · CPC title
Training · CPC title
Learning methods · CPC title