Switching between speech recognition systems

US11935540B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11935540-B2
Application number: US-202117450030-A
Country: US
Kind code: B2
Filing date: Oct 5, 2021
Priority date: Dec 4, 2018
Publication date: Mar 19, 2024
Grant date: Mar 19, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method may include obtaining first audio data originating at a first device during a communication session between the first device and a second device. The method may also include obtaining an availability of revoiced transcription units in a transcription system and in response to establishment of the communication session, selecting, based on the availability of revoiced transcription units, a revoiced transcription unit instead of a non-revoiced transcription unit to generate a transcript of the first audio data. The method may also include obtaining revoiced audio generated by a revoicing of the first audio data by a captioning assistant and generating a transcription of the revoiced audio using an automatic speech recognition system. The method may further include in response to selecting the revoiced transcription unit, directing the transcription of the revoiced audio to the second device as the transcript of the first audio data.
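The flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: audio is modeled as plain strings, availability is modeled as a count of idle revoiced units, and the names `Session`, `choose_unit`, `transcribe_session`, `revoice`, and `asr` are all hypothetical. The abstract does not say what the non-revoiced path does, so running ASR directly on the original audio is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Session:
    """Hypothetical record of one communication session."""
    first_device: str
    second_device: str
    # Transcripts directed to the second device, in order of delivery.
    delivered: List[str] = field(default_factory=list)


def choose_unit(idle_revoiced_units: int) -> str:
    """Pick a unit type when the session is established.

    Assumption: availability is a simple count of idle revoiced units.
    The abstract does not define how availability is measured.
    """
    return "revoiced" if idle_revoiced_units > 0 else "non_revoiced"


def transcribe_session(
    audio: str,
    idle_revoiced_units: int,
    revoice: Callable[[str], str],
    asr: Callable[[str], str],
    session: Session,
) -> Tuple[str, str]:
    """Select a unit, produce a transcript, and direct it to the second device."""
    unit = choose_unit(idle_revoiced_units)
    if unit == "revoiced":
        # A captioning assistant revoices the audio, then ASR transcribes
        # the revoiced audio.
        text = asr(revoice(audio))
    else:
        # Assumed fallback: ASR on the original audio with no revoicing.
        text = asr(audio)
    session.delivered.append(text)
    return unit, text
```

Here `revoice` and `asr` are injected as callables so the selection step can be exercised with stubs, for example `transcribe_session("hello", 2, lambda a: a, str.upper, Session("caller", "captioned phone"))`.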

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising obtaining, at a system, first audio data during a first communication session that includes a device and a second device; selecting, automatically and independently by the system based on a first number of a plurality of first transcription units that are available, one of the plurality of first transcription units instead of one of a plurality of second transcription units to generate a transcription of the first audio data to direct to the device, wherein the plurality of first transcription units use a first process to generate transcripts and the plurality of second transcription units use a second process to generate transcripts that is different than the first process; obtaining, by the system, second audio data during a second communication session that includes the device and that is different from the first communication session; and selecting, automatically and independently by the system based on one or more features of the second communication session, the one or more features including the second device being associated with a business and a second number of the plurality of first transcription units that are available, one of the plurality of second transcription units instead of one of the plurality of first transcription units to generate a transcription of the second audio data to direct to the device.

2. The method of claim 1, further comprising directing the transcription generated by the selected transcription unit to the device.

3. The method of claim 2, wherein the one or more features include a preference of a user of the second device.

4. The method of claim 2, wherein the one or more features include estimated accuracy of the plurality of second transcription units during one or more previous communication sessions between the device and the second device.

5. The method of claim 1, wherein the availability of the second number of the plurality of first transcription units indicates that the second number of the plurality of first transcription units that are idle and available to generate transcriptions of audio is below or estimated to be below a threshold.

6. The method of claim 1, wherein the one or more features include an estimated accuracy of the plurality of second transcription units for the second communication session.

7. The method of claim 1, wherein the second number is less than the first number.

8. The method of claim 1, further comprising: obtaining, by the system, third audio data during a third communication session that includes the device and that is different from the first communication session and the second communication session; and selecting, automatically and independently by the system based on a third number of the plurality of first transcription units that are available, one of the plurality of second transcription units instead of one of the plurality of first transcription units to generate a transcription of the second audio data to direct to the device, wherein the third number is less than the first number and the second number.

9. At least one non-transitory computer-readable media configured to store one or more instructions that in response to being executed by at least one computing system cause performance of the method of claim 1.

10. A method comprising: obtaining, at a system, first audio data during a communication session; and selecting, automatically and independently by the system based on one or more features of the communication session and an availability of a plurality of first transcription units, a transcription unit of a plurality of second transcription units instead of one of the plurality of first transcription units to generate a transcription of the first audio data, wherein the plurality of first transcription units use a first process to generate transcripts and the plurality of second transcription units use a second process to generate transcripts that is different than the first process, wherein the availability of the plurality of first transcription units indicates that a number of the plurality of first transcription units that are idle and available to generate transcriptions of audio is below or estimated to be below a threshold.

11. The method of claim 10, wherein the communication session includes a first device and a second device, the method further comprising directing the transcription generated by the selected transcription unit to the first device.

12. The method of claim 11, wherein the one or more features include a preference of a user of the second device.

13. The method of claim 11, wherein the one or more features include estimated accuracy of the plurality of second transcription units during one or more previous communication sessions between the first device and the second device.

14. The method of claim 11, wherein the one or more features include the second device being associated with a business.

15. The method of claim 10, wherein the one or more features include an estimated accuracy of the plurality of second transcription units for the communication session.

16. The method of claim 10, further comprising: obtaining, by the system, second audio data during a second communication session that is different from the communication session; and selecting, automatically and independently by the system based on a second number of the plurality of first transcription units that are available, one of the plurality of first transcription units instead of one of the plurality of second transcription units to generate a transcription of the second audio data.

17. At least one non-transitory computer-readable media configured to store one or more instructions that in response to being executed by at least one computing system cause performance of the method of claim 10.

18. A system comprising: at least one computing system; and at least one computer-readable media coupled to the at least one computing system, the at least one computer-readable media configured to store one or more instructions that in response to being executed by the at least one computing system cause performance of operations, the operations comprising: obtain first audio data during a communication session; and select, automatically and independently based on one or more features of the communication session and an availability of a plurality of first transcription units, a transcription unit of a plurality of second transcription units instead of one of the plurality of first transcription units to generate a transcription of the first audio data, wherein the plurality of first transcription units use a first process to generate transcripts and the plurality of second transcription units use a second process to generate transcripts that is different than the first process, wherein the availability of the plurality of first transcription units indicates that a number of the plurality of first transcription units that are idle and available to generate transcriptions of audio is below or estimated to be below a threshold.

19. The system of claim 18, wherein the one or more features include an estimated accuracy of the plurality of second transcription units for the communication session.

20. The system of claim 18, wherein the operations further comprise: obtain second audio data during a second communication session that is different from the communication session; and select, automatically and independently based on a second number of the plurality of first transcript
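The selection logic recited in claims 1, 8, and 10 (idle first-type units below a threshold, combined with session features such as a business callee, a user preference, or an estimated accuracy) might be approximated as follows. This is a hedged sketch, not the claimed method itself: the names `SessionFeatures` and `select_unit_type`, the default threshold of 3 idle units, and the 0.85 accuracy floor are all assumptions chosen for illustration, and the claims do not say how the features are weighed against each other.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionFeatures:
    """Hypothetical container for the session features named in the claims."""
    second_device_is_business: bool = False
    user_prefers_second: bool = False
    # Estimated accuracy of second-type units for this session, 0.0 to 1.0.
    estimated_second_accuracy: Optional[float] = None


def select_unit_type(
    idle_first_units: int,
    features: SessionFeatures,
    idle_threshold: int = 3,       # assumed value, not from the patent
    accuracy_floor: float = 0.85,  # assumed value, not from the patent
) -> str:
    """Return "first" or "second" for the unit type that should transcribe."""
    # No first-type unit is idle: availability alone forces the second
    # type (compare claim 8, where only the count drives the choice).
    if idle_first_units <= 0:
        return "second"

    # Plenty of first-type units are idle: use one of them.
    if idle_first_units >= idle_threshold:
        return "first"

    # Idle count is below the threshold: divert the session only when a
    # feature suggests the second process is acceptable for it.
    accuracy = features.estimated_second_accuracy
    suitable = (
        features.second_device_is_business
        or features.user_prefers_second
        or (accuracy is not None and accuracy >= accuracy_floor)
    )
    return "second" if suitable else "first"
```

With these assumed defaults, a session with ten idle first-type units stays on a first-type unit, while a call to a business when only two are idle is diverted to a second-type unit. Treating the features as a simple logical OR is a design shortcut; a production system could just as well score or weight them.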

Assignees

Sorenson IP Holdings LLC

Inventors

Classifications

  • G10L15/32 (Primary)

    Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Speech to text systems (G10L15/08 takes precedence) · CPC title

  • G10L15/28 (Primary)

    Constructional details of speech recognition systems · CPC title

  • Text-based messaging services in telephone networks such as PSTN/ISDN, e.g. User-to-User Signalling or Short Message Service for fixed networks · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11935540B2 cover?
A method may include obtaining first audio data originating at a first device during a communication session between the first device and a second device. The method may also include obtaining an availability of revoiced transcription units in a transcription system and in response to establishment of the communication session, selecting, based on the availability of revoiced transcription unit…
Who is the assignee on this patent?
Sorenson IP Holdings LLC
What technology area does this patent fall under?
Primary CPC classification G10L15/32. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 19, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).