Generating structured text content using speech recognition models
US-2018150605-A1 · May 31, 2018 · US
US11211046B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11211046-B2 |
| Application number | US-202016740761-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 13, 2020 |
| Priority date | Jan 7, 2018 |
| Publication date | Dec 28, 2021 |
| Grant date | Dec 28, 2021 |
Official abstract text for this publication.
A mistranscription generated by a speech recognition system is identified. A received utterance is matched to a first utterance member within a set of known utterance members. The matching operation matches fewer than the first plural number of words in the received utterance and the received utterance varies in a first particular manner as compared to a first word in a first slot in the first utterance member. The received utterance is sent to a mistranscription analyzer component which increments evidence that the received utterance is evidence of a mistranscription. Once the incremented evidence for the mistranscription exceeds a threshold, future received utterances containing the mistranscription are treated as though the first word was recognized.
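The evidence-and-threshold loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `MistranscriptionAnalyzer`, the `threshold` default, whitespace tokenization, and the restriction to single-word substitutions at one slot are all hypothetical simplifications.

```python
from collections import Counter


class MistranscriptionAnalyzer:
    """Toy evidence accumulator for single-word substitution errors (assumed design)."""

    def __init__(self, known_utterances, threshold=2):
        # Each known utterance member is stored as a tuple of lowercase words.
        self.members = [tuple(u.lower().split()) for u in known_utterances]
        self.threshold = threshold   # hypothetical cutoff; the source gives no value
        self.evidence = Counter()    # (heard_word, expected_word) -> count

    def _learned(self):
        # Substitutions whose accumulated evidence has exceeded the threshold.
        return {h: e for (h, e), n in self.evidence.items() if n > self.threshold}

    def normalize(self, utterance):
        """Rewrite an utterance as though learned mistranscriptions were recognized.

        Simplification: the replacement is applied to every occurrence of the
        heard word, regardless of which utterance member or slot it came from.
        """
        learned = self._learned()
        return " ".join(learned.get(w, w) for w in utterance.lower().split())

    def observe(self, utterance):
        """Return the matched member text, or None after recording a near miss."""
        words = tuple(self.normalize(utterance).split())
        for member in self.members:
            if len(member) != len(words):
                continue
            diffs = [i for i, (a, b) in enumerate(zip(words, member)) if a != b]
            if not diffs:
                return " ".join(member)  # exact match, possibly after correction
            if len(diffs) == 1:
                slot = diffs[0]
                # Fewer than all words matched and one slot varies: add evidence.
                self.evidence[(words[slot], member[slot])] += 1
                return None
        return None
```

With a known utterance such as "open the order form" and `threshold=2`, repeated observations of "open the border form" only add evidence; once the count passes the threshold, the same input is treated as though "order" had been recognized.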
Opening claim text (preview).
Having described our invention, what we now claim is as follows:
1. A method for identification of a mistranscription generated by a speech recognition system comprising: matching a received utterance to a first utterance member within a set of known utterance members, wherein fewer than a first number of words in the received utterance are matched to the first number of words in the first utterance member and the received utterance varies in a first particular manner as compared to a first word in a first slot in the first utterance member; sending the received utterance to a mistranscription analyzer component; incrementing evidence, by the mistranscription analyzer, that the received utterance is evidence of a mistranscription, wherein the evidence is an occurrence of one of a substitution error, a deletion error or an insertion error; and responsive to incremented evidence for the mistranscription exceeding a threshold, treating a future received utterance containing the mistranscription as though the first word was recognized in the future received utterance.
2. The method as recited in claim 1, wherein the received utterance contains a replacement error uses a second word in place of the first word used in a first slot in the first utterance member; wherein the incremented evidence by the mistranscription analyzer is that the received utterance is evidence of a mistranscription replacing the second word for the first word.
3. The method as recited in claim 1, further comprising: responsive to matching a second received utterance to the first utterance member, sending the second received utterance to a mistranscription analyzer, wherein the matching matches a first plurality of the words, and a second plurality of the remaining words in the received utterance are candidate mistranscriptions; generating a first synthetic utterance via a text-to-speech sub-system of an audio stream based on replacing a first contiguous set of words assumed to be a mistranscription from the second plurality of remaining words in the first utterance member with an assumed correct replacement; transmitting the first synthetic utterance to a speech recognition engine with the above correcting feature; and responsive to a correction of the synthetic utterance to the first utterance member, accumulating evidence that the first contiguous set of words is a mistranscription of the assumed correct replacement.
4. The method as recited in claim 1, wherein the mistranscription analyzer matches a received utterance to a respective utterance member with a different number of words and with a single first candidate mistranscription results in a greater evidence for the first candidate mistranscription that contains one more contiguous words that do not exactly match one or more contiguous words in the respective utterance member.
5. The method as recited in claim 2, wherein the mistranscription analyzer uses a rule that increments evidence for a mistranscription of the second word for the first word in a second utterance member which also includes the first word, based on the received utterance matching the first utterance member, wherein an amount of evidence incremented for the mistranscription in the second utterance member is less than an amount of evidence incremented for the mistranscription in the first utterance member.
6. The method as recited in claim 1, wherein the mistranscription analyzer increments evidence for a mistranscription in the first manner at the first slot based on multiple received utterances from a first user having the mistranscription in the first manner at the first slot.
7. The method as recited in claim 2, further comprising incrementing evidence, by the mistranscription analyzer, that the received utterance is evidence of a mistranscription of the second word for the first word each time a received utterance is matched to the first utterance member so that with each received utterance where the second word is transcribed in place of the first word in the received utterance, the greater the evidence that is accumulated for the mistranscription.
8. The method as recited in claim 1, wherein the mistranscription analyzer uses a phonetic based rule that a greater degree of phonetic similarity between a second word in the received utterance and the first word at the first slot in the first utterance member results in a greater amount of evidence being incremented per instance of the received utterance than if no such phonetic similarity is detected.
9. The method as recited in claim 1, wherein the evidence is a deletion error where the first word is omitted in the received utterance at the first slot.
10. Apparatus, comprising: a processor; computer memory holding computer program instructions executed by the processor for identification of a mistranscription generated by a speech recognition system, the computer program instructions comprising: program code, operative to match a received utterance to a first utterance member within a set of known utterance members, wherein fewer than a first number of words in the received utterance are matched to the first number of words in the first utterance member and the received utterance varies in a first particular manner as compared to a first word in a first slot in the first utterance member; program code, operative to send the received utterance to a mistranscription analyzer component; program code, operative to increment evidence, by the mistranscription analyzer, that the received utterance is evidence of a mistranscription, wherein the evidence is an occurrence of one of a substitution error, a deletion error or an insertion error; and program code responsive to incremented evidence for the mistranscription exceeding a threshold, operative to treat a future received utterance containing the mistranscription as though the first word was recognized in the future received utterance.
11. The apparatus as recited in claim 10, wherein the received utterance contains a replacement error which uses a second word in place of the first word used in a first slot in the first utterance member; wherein the incremented evidence by the mistranscription analyzer is that the received utterance is evidence of a mistranscription replacing the second word for the first word.
12. The apparatus as recited in claim 10, further comprising: program code responsive to matching a second received utterance to the first utterance member, operative to send the second received utterance to a mistranscription analyzer, wherein the matching matches a first plurality of the words, and a second plurality of the remaining words in the received utterance are candidate mistranscriptions; program code, operative to generate a first synthetic utterance via a text-to-speech sub-system of an audio stream based on replacing a first contiguous set of words assumed to be a mistranscription from the second plurality of remaining words in the first utterance member with an assumed correct replacement; program code, operative to transmit the first synthetic utterance to a speech recognition engine with the above correcting feature; and program code responsive to a correction of the synthetic utterance to the first utterance member, operative to accumulate evidence that the first contiguous set of words is a mistranscription of the assumed correct replacement.
13. The apparatus as recited in claim 11, wherein the mistranscription analyzer increments evidence for a mistranscription of the second word for the first word wherein evidence of a mistranscription for a first user who uttered the received utterance is greater than eviden
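Claim 1 distinguishes substitution, deletion, and insertion errors, and claim 8 increments more evidence when the heard word sounds like the expected one. The sketch below shows one way those two ideas might be expressed. It is an assumption-laden illustration: the function names `classify_errors` and `evidence_weight` are hypothetical, word alignment is delegated to Python's standard `difflib.SequenceMatcher`, and spelling similarity stands in as a crude proxy for phonetic similarity, which the claims do not define.

```python
from difflib import SequenceMatcher

# Map difflib opcode tags to the error kinds named in claim 1.
_KINDS = {"replace": "substitution", "delete": "deletion", "insert": "insertion"}


def classify_errors(member, received):
    """Align a received utterance against a known member and label each difference.

    Returns (kind, slot, expected_words, heard_words) tuples, where slot is the
    word index in the known utterance member.
    """
    a = member.lower().split()
    b = received.lower().split()
    errors = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b, autojunk=False).get_opcodes():
        if tag != "equal":
            errors.append((_KINDS[tag], i1, tuple(a[i1:i2]), tuple(b[j1:j2])))
    return errors


def evidence_weight(heard, expected, base=1.0):
    """Hypothetical weighting: more evidence for similar-looking word pairs.

    Spelling similarity is only an approximation of phonetic similarity; a real
    system would likely compare phoneme sequences instead.
    """
    return base + SequenceMatcher(None, heard, expected).ratio()
```

For example, aligning "schedule meeting with rob" against the member "schedule a meeting with bob" yields a deletion at slot 1 and a substitution at slot 4, and `evidence_weight("rob", "bob")` is larger than the weight for an unrelated pair such as `("cat", "bob")`.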
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Assessment or evaluation of speech recognition systems · CPC title
Ensemble learning · CPC title
Machine learning · CPC title
Training · CPC title