Speech recognition method and apparatus
US11710488B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11710488-B2 |
| Application number | US-201816975677-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 19, 2018 |
| Priority date | Feb 26, 2018 |
| Publication date | Jul 25, 2023 |
| Grant date | Jul 25, 2023 |
Official abstract text for this publication.
A method may include obtaining audio data originating at a first device during a communication session between the first device and a second device and providing the audio data to a first speech recognition system to generate a first transcript based on the audio data and directing the first transcript to the second device. The method may also include in response to obtaining a quality indication regarding a quality of the first transcript, multiplexing the audio data to provide the audio data to a second speech recognition system to generate a second transcript based on the audio data while continuing to provide the audio data to the first speech recognition system and direct the first transcript to the second device, and in response to obtaining a transfer indication that occurs after multiplexing of the audio data, directing the second transcript to the second device instead of the first transcript.
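The flow in the abstract can be sketched as a small router, shown here only as an illustration and not as the patented implementation. The names `TranscriptRouter`, `Recognizer`, `feed`, `on_quality_indication`, and `on_transfer_indication` are hypothetical, and each recognizer is assumed to be a plain function from an audio chunk to text. The sketch also assumes the first recognizer stops receiving audio after the transfer, which the abstract does not require.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical recognizer type: takes an audio chunk, returns transcript text.
Recognizer = Callable[[bytes], str]


@dataclass
class TranscriptRouter:
    """Feeds audio to one or two recognizers and chooses which transcript is delivered."""
    first: Recognizer
    second: Recognizer
    multiplexing: bool = False  # set once a quality indication is obtained
    transferred: bool = False   # set once a transfer indication is obtained
    delivered: List[str] = field(default_factory=list)  # text directed to the second device

    def on_quality_indication(self) -> None:
        # Start providing the audio to the second recognizer as well.
        self.multiplexing = True

    def on_transfer_indication(self) -> None:
        # A transfer only counts if it occurs after multiplexing has started.
        if self.multiplexing:
            self.transferred = True

    def feed(self, chunk: bytes) -> None:
        if self.transferred:
            # Assumption: the first recognizer is no longer fed after the transfer.
            self.delivered.append(self.second(chunk))
            return
        first_text = self.first(chunk)
        if self.multiplexing:
            # Second transcript is generated but not yet delivered.
            self.second(chunk)
        self.delivered.append(first_text)
```

In this sketch the second device keeps receiving the first transcript for the whole multiplexing period, so the handoff never leaves a gap in delivery.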
Opening claim text (preview).
The invention claimed is:
1. A method to transcribe communications, the method comprising: obtaining, at a transcription system, audio data; directing, from the transcription system, a first transcript of the audio data from a first speech recognition system of the transcription system to a device remote from the transcription system; obtaining, at the transcription system, a quality indication of the first transcript; in response to obtaining the quality indication, providing, by the transcription system, the audio data to a second speech recognition system of the transcription system to generate a second transcript of the audio data while continuing to provide the audio data to the first speech recognition system to generate the first transcript and continuing to direct the first transcript from the transcription system to the device; and in response to obtaining the second transcript at the transcription system and in response to an occurrence of an event that is used to indicate the second transcript is to be directed to the device, directing, from the transcription system, the second transcript to the device instead of the first transcript, wherein the event includes one of the following: the audio data not including spoken words for a first time period, the first transcript including first text and then for a second time period the audio data not including spoken words, and a last phrase of the first transcript is the same as a last phrase of the second transcript.
2. The method of claim 1, wherein the first speech recognition system and the second speech recognition system are automated speech recognition systems that work independent of human interaction.
3. The method of claim 1, wherein the first speech recognition system is an automated speech recognition system that works independent of human interaction and the generation of the second transcript by the second speech recognition system includes: broadcasting audio based on the audio data; and obtaining second audio data based on a re-voicing of the broadcast audio, wherein the second transcript is generated based on the second audio data.
4. The method of claim 1, further comprising: obtaining a confidence score of the first transcript from the first speech recognition system; and obtaining the quality indication based on a comparison of the confidence score to a quality threshold.
5. The method of claim 1, wherein the quality indication is obtained from the device.
6. The method of claim 1, further comprising in response to the occurrence of the event, ceasing providing the audio data to the first speech recognition system.
7. At least one non-transitory computer-readable media configured to store one or more instructions that when executed by at least one processor cause or direct a system to perform the method of claim 1.
8. A transcription system comprising: at least one processor; and at least one non-transitory computer-readable media communicatively coupled to the at least one processor and configured to store one or more instructions that when executed by the at least one processor cause or direct the system to perform operations comprising: obtain, at the transcription system, audio data; direct, from the transcription system, a first transcript of the audio data from a first speech recognition system of the transcription system to a device remote from the transcription system; obtain, at the transcription system, a quality indication of the first transcript; in response to obtaining the quality indication, provide, by the transcription system, the audio data to a second speech recognition system of the transcription system to generate a second transcript of the audio data while continuing to provide the audio data to the first speech recognition system to generate the first transcript and continuing to direct the first transcript from the transcription system to the device; and in response to obtaining the second transcript at the transcription system and in response to an occurrence of an event that is used to indicate the second transcript is to be directed to the device, direct, from the transcription system, the second transcript to the device instead of the first transcript, wherein the event includes one of the following: the audio data not including spoken words for a first time period, the first transcript including first text and then for a second time period the audio data not including spoken words, and a last phrase of the first transcript is the same as a last phrase of the second transcript.
9. The system of claim 8, wherein the first speech recognition system and the second speech recognition system are automated speech recognition systems that work independent of human interaction.
10. The system of claim 8, wherein the first speech recognition system is an automated speech recognition system that works independent of human interaction and the generation of the second transcript by the second speech recognition system includes operations comprising: broadcast audio based on the audio data; and obtain second audio data based on a re-voicing of the broadcast audio, wherein the second transcript is generated based on the second audio data.
11. The system of claim 8, wherein the operations further comprise: obtain a confidence score of the first transcript from the first speech recognition system; and obtain the quality indication based on a comparison of the confidence score to a quality threshold.
12. The system of claim 8, wherein the quality indication is obtained from the device.
13. The system of claim 8, wherein the operations further comprise in response to the occurrence of the event, cease providing the audio data to the first speech recognition system.
14. A method comprising: obtaining, at a device from a remote transcription system over a network, a first transcript of audio data from a first speech recognition system of the transcription system; and obtaining, at the device, a second transcript of the audio data from a second speech recognition system of the transcription system instead of the first transcript in response to the second transcript being generated and in response to an occurrence of an event that is used to indicate the second transcript is to be directed to the device from the transcription system, wherein the second transcript is generated in response to a quality indication of the first transcript being below a threshold and while the first speech recognition system continues to generate the first transcript and device continues to obtain the first transcript, and wherein the event includes one of the following: the audio data not including spoken words for a first time period, the first transcript including first text and then for a second time period the audio data not including spoken words, and a last phrase of the first transcript is the same as a last phrase of the second transcript.
15. The method of claim 14, further comprising directing the quality indication from the device to another system that directs the generation of the second transcript.
16. The method of claim 14, further comprising presenting the first transcript before obtaining the second transcript.
17. The method of claim 14, wherein the first transcript and the second transcript are aligned such that all words represented in the audio data are in one of either the first transcript or the second transcript.
18. The method of claim 14, wherein the first transcript and the second transcript are aligned such that words from the fir
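The quality check of claims 4 and 11 and two of the switch events recited in claims 1, 8, and 14 could look roughly like the following. This is a minimal sketch under stated assumptions, not the claimed implementation. All function names are hypothetical. A confidence score below the threshold is assumed to trigger the quality indication, a "phrase" is assumed to be the last three words, and speech activity is assumed to arrive as one boolean per audio frame.

```python
from typing import Sequence


def quality_indication(confidence: float, threshold: float) -> bool:
    """Assumption: a confidence score below the threshold signals poor quality."""
    return confidence < threshold


def silent_for(voiced_flags: Sequence[bool], frames: int) -> bool:
    """True when the most recent `frames` frames contain no spoken words."""
    if frames <= 0:
        raise ValueError("frames must be positive")
    tail = voiced_flags[-frames:]
    return len(tail) == frames and not any(tail)


def last_phrases_match(first: str, second: str, n_words: int = 3) -> bool:
    """Assumption: a 'phrase' is the last n_words words, compared case-insensitively."""
    a = first.lower().split()[-n_words:]
    b = second.lower().split()[-n_words:]
    return len(a) == n_words and a == b


def switch_event(first: str, second: str,
                 voiced_flags: Sequence[bool], silence_frames: int) -> bool:
    """True when either illustrated event occurs: a silence period or matching last phrases."""
    return silent_for(voiced_flags, silence_frames) or last_phrases_match(first, second)
```

Both events pick a moment when swapping transcripts is unlikely to drop or repeat words: either nobody is speaking, or the two transcripts have reached the same point in the audio.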