Methods, systems and apparatuses for improved speech recognition and transcription
US-11551694-B2 · Jan 10, 2023 · US
US12579982B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12579982-B2 |
| Application number | US-202318521683-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 28, 2023 |
| Priority date | Jan 5, 2021 |
| Publication date | Mar 17, 2026 |
| Grant date | Mar 17, 2026 |
Official abstract text for this publication.
Methods, systems, and apparatuses for improved speech recognition and transcription of user utterances are described herein. A user utterance may be processed by a speech recognition computing device. One or more acoustic features associated with the user utterance may be used to determine whether one or more actions are to be performed based on a transcription of the user utterance.
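The flow in the abstract, combined with the confidence threshold recited in claim 2, can be illustrated with a minimal sketch. It is not the patented implementation. The names `OverrideRule`, `find_satisfied_rule`, and `process_utterance`, the per-feature confidence dictionary, and the example strings are all hypothetical. Falling back to the original transcription when no rule is satisfied is also an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class OverrideRule:
    """Hypothetical override triggering rule tied to one transcription."""
    feature_name: str            # acoustic feature the rule depends on
    confidence_threshold: float  # satisfied when confidence meets or exceeds this
    trigger_transcription: str   # transcription the rule is associated with
    updated_transcription: str   # transcription to use when the rule is satisfied


def find_satisfied_rule(
    transcription: str,
    feature_confidences: Dict[str, float],
    rules: List[OverrideRule],
) -> Optional[OverrideRule]:
    """Return the first rule satisfied for this transcription, or None."""
    for rule in rules:
        if rule.trigger_transcription != transcription:
            continue
        # A feature that was not detected is treated as zero confidence (assumption).
        confidence = feature_confidences.get(rule.feature_name, 0.0)
        if confidence >= rule.confidence_threshold:
            return rule
    return None


def process_utterance(
    transcription: str,
    feature_confidences: Dict[str, float],
    rules: List[OverrideRule],
    perform_action: Callable[[str], None],
) -> str:
    """Pick the transcription to act on, trigger the action, and return it."""
    rule = find_satisfied_rule(transcription, feature_confidences, rules)
    # Use the updated transcription only when an override rule is satisfied.
    final = rule.updated_transcription if rule is not None else transcription
    perform_action(final)
    return final
```

For example, with a single rule `OverrideRule("feature_a", 0.8, "watch bees", "watch BBC")`, calling `process_utterance("watch bees", {"feature_a": 0.9}, rules, print)` acts on the updated transcription, while a confidence of 0.5 leaves the original transcription in place. The `perform_action` callback stands in for any of the actions listed in claim 3, such as outputting the transcription or executing a command.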
Opening claim text (preview).
The invention claimed is:

1. A method comprising: determining, by a computing device, based on one or more acoustic features associated with a user utterance, that one or more override triggering rules associated with a transcription of the user utterance are satisfied; and causing, based on an updated transcription of the user utterance, at least one action to be performed, wherein the updated transcription is generated based on the one or more override triggering rules being satisfied.

2. The method of claim 1, wherein determining that the one or more override triggering rules are satisfied comprises: determining, based on the one or more acoustic features, that a level of confidence associated with the one or more acoustic features meets or exceeds a confidence threshold associated with the one or more override triggering rules.

3. The method of claim 1, wherein causing the at least one action to be performed comprises at least one of: causing the updated transcription to be output by a user device; causing the updated transcription to be sent to the user device; causing the computing device to perform at least one command indicated by the updated transcription; or causing the user device to perform the at least one command indicated by the updated transcription.

4. The method of claim 1, further comprising at least one of: receiving, from a speech recognition computing device, the transcription of the user utterance; based on the one or more override triggering rules being satisfied, generating, by the speech recognition computing device, the updated transcription of the user utterance; or receiving, from the speech recognition computing device, the updated transcription of the user utterance.

5. The method of claim 1, further comprising at least one of: receiving an indication of the one or more acoustic features associated with the user utterance; or determining, based on an acoustic model, the one or more acoustic features associated with the user utterance.

6. The method of claim 1, further comprising: determining, based on historical user utterance data, a plurality of user utterances comprising the one or more acoustic features; and determining, based on the plurality of user utterances, an acoustic model, wherein the acoustic model is configured to override erroneous transcriptions of user utterances comprising the one or more acoustic features.

7. The method of claim 6, further comprising: determining, based on the historical user utterance data, the one or more override triggering rules, wherein the acoustic model is configured to override erroneous transcriptions of user utterances based on the one or more override triggering rules.

8. One or more non-transitory computer-readable media storing processor-executable instructions that, when executed by one or more processors, cause the one or more processors to: determine, based on one or more acoustic features associated with a user utterance, that one or more override triggering rules associated with a transcription of the user utterance are satisfied; and cause, based on an updated transcription of the user utterance, at least one action to be performed, wherein the updated transcription is generated based on the one or more override triggering rules being satisfied.

9. The one or more non-transitory computer-readable media of claim 8, wherein the processor-executable instructions that cause the one or more processors to determine that the one or more override triggering rules are satisfied further cause the one or more processors to: determine, based on the one or more acoustic features, that a level of confidence associated with the one or more acoustic features meets or exceeds a confidence threshold associated with the one or more override triggering rules.

10. The one or more non-transitory computer-readable media of claim 8, wherein the processor-executable instructions that cause the one or more processors to cause the at least one action to be performed further cause the one or more processors to one or more of: cause the updated transcription to be output by a user device; cause the updated transcription to be sent to the user device; perform at least one command indicated by the updated transcription; or cause the user device to perform the at least one command indicated by the updated transcription.

11. The one or more non-transitory computer-readable media of claim 8, wherein the processor-executable instructions further cause the one or more processors to one or more of: receive, from a speech recognition computing device, the transcription of the user utterance; based on the one or more override triggering rules being satisfied, cause the speech recognition computing device to generate the updated transcription of the user utterance; or receive, from the speech recognition computing device, the updated transcription of the user utterance.

12. The one or more non-transitory computer-readable media of claim 8, wherein the processor-executable instructions further cause the one or more processors to one or more of: receive an indication of the one or more acoustic features associated with the user utterance; or determine, based on an acoustic model, the one or more acoustic features associated with the user utterance.

13. The one or more non-transitory computer-readable media of claim 8, wherein the processor-executable instructions further cause the one or more processors to: determine, based on historical user utterance data, a plurality of user utterances comprising the one or more acoustic features; and determine, based on the plurality of user utterances, an acoustic model, wherein the acoustic model is configured to override erroneous transcriptions of user utterances comprising the one or more acoustic features.

14. The one or more non-transitory computer-readable media of claim 13, wherein the processor-executable instructions further cause the one or more processors to: determine, based on the historical user utterance data, the one or more override triggering rules, wherein the acoustic model is configured to override erroneous transcriptions of user utterances based on the one or more override triggering rules.

15. An apparatus comprising: at least one processor; and memory storing processor-executable instructions that, when executed by the at least one processor, cause the apparatus to: determine, based on one or more acoustic features associated with a user utterance, that one or more override triggering rules associated with a transcription of the user utterance are satisfied; and cause, based on an updated transcription of the user utterance, at least one action to be performed, wherein the updated transcription is generated based on the one or more override triggering rules being satisfied.

16. The apparatus of claim 15, wherein the processor-executable instructions that cause the apparatus to determine that the one or more override triggering rules are satisfied further cause the apparatus to: determine, based on the one or more acoustic features, that a level of confidence associated with the one or more acoustic features meets or exceeds a confidence threshold associated with the one or more override triggering rules.

17. The apparatus of claim 15, wherein the processor-executable instructions that cause the apparatus to cause the at least one action to be performed further cause the apparatus to one or more of: cause the updated transcription to be output by a
Speech classification or search · CPC title
Word spotting · CPC title
Assessment or evaluation of speech recognition systems · CPC title
Execution procedure of a spoken command · CPC title
Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title