US11145312B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11145312-B2 |
| Application number | US-202016847200-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 13, 2020 |
| Priority date | Dec 4, 2018 |
| Publication date | Oct 12, 2021 |
| Grant date | Oct 12, 2021 |
Official abstract text for this publication.
A method may include obtaining first audio data originating at a first device during a communication session between the first device and a second device. The method may also include obtaining an availability of revoiced transcription units in a transcription system and in response to establishment of the communication session, selecting, based on the availability of revoiced transcription units, a revoiced transcription unit instead of a non-revoiced transcription unit to generate a transcript of the first audio data. The method may also include obtaining revoiced audio generated by a revoicing of the first audio data by a captioning assistant and generating a transcription of the revoiced audio using an automatic speech recognition system. The method may further include in response to selecting the revoiced transcription unit, directing the transcription of the revoiced audio to the second device as the transcript of the first audio data.
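The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Availability` fields, the `reserve` parameter, and the spare-capacity policy are hypothetical names and assumptions chosen for illustration, and the `revoice`, `speaker_asr`, and `general_asr` callables are stand-ins for the captioning assistant and the speech recognizers.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class UnitKind(Enum):
    REVOICED = "revoiced"          # a captioning assistant revoices the audio, then ASR transcribes it
    NON_REVOICED = "non_revoiced"  # ASR transcribes the original audio directly


@dataclass
class Availability:
    """Hypothetical snapshot of revoiced-unit capacity (field names are assumptions)."""
    free_revoiced_units: int     # revoiced units idle right now
    projected_new_sessions: int  # sessions expected to need a revoiced unit soon


def select_unit(avail: Availability, reserve: int = 0) -> UnitKind:
    """Pick a unit kind when a communication session is established.

    Assumed policy: choose a revoiced unit only if at least one would remain
    free after covering projected demand and an optional reserve.
    """
    spare = avail.free_revoiced_units - avail.projected_new_sessions - reserve
    return UnitKind.REVOICED if spare > 0 else UnitKind.NON_REVOICED


def transcribe(
    audio: str,
    kind: UnitKind,
    revoice: Callable[[str], str],
    speaker_asr: Callable[[str], str],
    general_asr: Callable[[str], str],
) -> str:
    """Produce the transcript that would be directed to the second device."""
    if kind is UnitKind.REVOICED:
        # Revoiced path: ASR runs on the revoiced audio, not the original audio.
        return speaker_asr(revoice(audio))
    # Non-revoiced path: ASR runs on the original audio.
    return general_asr(audio)
```

Under this assumed policy, `select_unit(Availability(free_revoiced_units=1, projected_new_sessions=3))` returns `UnitKind.NON_REVOICED` even though one revoiced unit is free, which mirrors the second selection recited in claim 1.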
Opening claim text (preview).
The invention claimed is:

1. A method comprising: obtaining, at a system, first audio data during a first communication session that includes a device; selecting, automatically and independently by the system based on a first availability of revoiced transcription units with respect to the first communication session, a revoiced transcription unit instead of a non-revoiced transcription unit to generate a transcription of the first audio data that is directed to the device; obtaining, by the system, revoiced audio generated by a revoicing of the first audio data; generating, by the system, a transcription of the revoiced audio using an automatic speech recognition system of the revoiced transcription unit; directing, by the system, the transcription of the revoiced audio to the device; obtaining, by the system, second audio data during a second communication session that includes the device and that is different from the first communication session; and selecting, automatically and independently by the system based on a second availability of the revoiced transcription units with respect to the second communication session, the non-revoiced transcription unit instead of one of the revoiced transcription units to generate a transcription of the second audio data to direct to the device such that none of the revoiced transcription units generate a transcription of the second audio data, the second availability of the revoiced transcription units indicating that at least one revoiced transcription unit is available.

2. The method of claim 1, wherein the first availability of revoiced transcription units is based on one or more of: a current peak number of transcriptions being generated, a current average number of transcriptions being generated, a projected peak number of transcriptions to be generated, a projected average number of transcriptions to be generated, a projected number of the revoiced transcription units, and a number of available revoiced transcription units.

3. The method of claim 1, wherein in response to the selection of the revoiced transcription unit the non-selected non-revoiced transcription unit does not generate a transcription of the first audio data.

4. The method of claim 1, wherein the automatic speech recognition system is trained specifically for a speaker used to generate the revoiced audio.

5. The method of claim 4, wherein a second automatic speech recognition system used by the non-revoiced transcription unit to generate the transcription of the second audio data is trained for a plurality of speakers.

6. At least one non-transitory computer-readable media configured to store one or more instructions that in response to being executed by at least one computing system cause performance of the method of claim 1.

7. A method comprising: obtaining, at a system, first audio data during a first communication session that includes a device; selecting, automatically and independently by the system based on a first availability of a plurality of first transcription units, a first transcription unit of the plurality of first transcription units instead of one of a plurality of second transcription units to generate a transcription of the first audio data to direct to the device, wherein the plurality of first transcription units use a first process to generate transcripts and the plurality of second transcription units use a second process to generate transcripts that is different than the first process; in response to selecting the first transcription unit, directing, by the system, the transcription generated by the first transcription unit to the device; obtaining, by the system, second audio data during a second communication session that includes the device and that is different from the first communication session; and selecting, automatically and independently by the system based on a second availability of the plurality of first transcription units, a second transcription unit of the plurality of second transcription units instead of one of the plurality of first transcription units to generate a transcription of the second audio data to direct to the device, the second availability of the plurality of first transcription units indicating that at least one of the plurality of first transcription units is estimated to be available for at least a portion of the second communication session.

8. The method of claim 7, wherein the first availability of the plurality of first transcription units is based on one or more of: a current peak number of transcriptions being generated, a current average number of transcriptions being generated, a projected peak number of transcriptions to be generated, a projected average number of transcriptions to be generated, a projected number of the plurality of first transcription units, and a number of available units of the plurality of first transcription units.

9. The method of claim 7, wherein the plurality of first transcription units are revoiced transcription units.

10. The method of claim 9, wherein the plurality of second transcription units are non-revoiced transcription units.

11. The method of claim 9, further comprising obtaining revoiced audio generated by a revoicing of the first audio data, wherein the selected first transcription unit uses the revoiced audio and an automatic speech recognition system to generate the transcription of the first audio data.

12. The method of claim 11, wherein the automatic speech recognition system is trained specifically for a speaker used to generate the revoiced audio.

13. The method of claim 12, wherein a second automatic speech recognition system used by one or more of the plurality of second transcription units is trained for a plurality of speakers.

14. At least one non-transitory computer-readable media configured to store one or more instructions that in response to being executed by at least one computing system cause performance of the method of claim 7.

15. A system comprising: at least one computing system; and at least one computer-readable media coupled to the at least one computing system, the at least one computer-readable media configured to store one or more instructions that in response to being executed by the at least one computing system cause performance of operations, the operations comprising: obtain first audio data during a first communication session that includes a device; select, automatically and independently by the computing system based on a first availability of a plurality of first transcription units, a first transcription unit of the plurality of first transcription units instead of one of a plurality of second transcription units to generate a transcription of the first audio data to direct to the device, wherein the plurality of first transcription units use a first process to generate transcripts and the plurality of second transcription units use a second process to generate transcripts that is different than the first process; in response to selecting the first transcription unit, direct the transcription generated by the first transcription unit to the device; obtain second audio data during a second communication session that includes the device and that is different from the first communication session; and select, automatically and independently by the computing system based on a second availability of the plurality of first transcription units, a second transcription unit of the plurality of second transcription units instead of one of the plurality of first transcription units to generate a transcription of the second audi
Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title
using speech recognition · CPC title
Constructional details of speech recognition systems · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Text-based messaging services in telephone networks such as PSTN/ISDN, e.g. User-to-User Signalling or Short Message Service for fixed networks · CPC title