What technology area does this patent fall under?

Primary CPC classification G10L15/32. Mapped technology areas include Physics.

When was this patent published?

Publication date: Oct 12, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Switching between speech recognition systems

US11145312B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11145312-B2
Application number	US-202016847200-A
Country	US
Kind code	B2
Filing date	Apr 13, 2020
Priority date	Dec 4, 2018
Publication date	Oct 12, 2021
Grant date	Oct 12, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method may include obtaining first audio data originating at a first device during a communication session between the first device and a second device. The method may also include obtaining an availability of revoiced transcription units in a transcription system and in response to establishment of the communication session, selecting, based on the availability of revoiced transcription units, a revoiced transcription unit instead of a non-revoiced transcription unit to generate a transcript of the first audio data. The method may also include obtaining revoiced audio generated by a revoicing of the first audio data by a captioning assistant and generating a transcription of the revoiced audio using an automatic speech recognition system. The method may further include in response to selecting the revoiced transcription unit, directing the transcription of the revoiced audio to the second device as the transcript of the first audio data.
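
The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `UnitKind` and `select_unit` and the `reserve` parameter are assumptions introduced here, and the abstract does not say how availability is turned into a decision.

```python
from enum import Enum


class UnitKind(Enum):
    REVOICED = "revoiced"          # a captioning assistant revoices the audio, then ASR runs
    NON_REVOICED = "non_revoiced"  # ASR runs directly on the session audio


def select_unit(available_revoiced: int, reserve: int = 0) -> UnitKind:
    """Choose a transcription unit kind when a communication session is established.

    available_revoiced: number of revoiced units currently idle.
    reserve: hypothetical headroom (units held back for other sessions);
             this parameter is an assumption, not something the abstract defines.
    """
    if available_revoiced > reserve:
        return UnitKind.REVOICED
    return UnitKind.NON_REVOICED
```

With these assumptions, `select_unit(3)` picks a revoiced unit, while `select_unit(0)` falls back to a non-revoiced unit.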

First claim

Opening claim text (preview).

The invention claimed is:
1. A method comprising obtaining, at a system, first audio data during a first communication session that includes a device; selecting, automatically and independently by the system based on a first availability of revoiced transcription units with respect to the first communication session, a revoiced transcription unit instead of a non-revoiced transcription unit to generate a transcription of the first audio data that is directed to the device; obtaining, by the system, revoiced audio generated by a revoicing of the first audio data; generating, by the system, a transcription of the revoiced audio using an automatic speech recognition system of the revoiced transcription unit; directing, by the system, the transcription of the revoiced audio to the device; obtaining, by the system, second audio data during a second communication session that includes the device and that is different from the first communication session; and selecting, automatically and independently by the system based on a second availability of the revoiced transcription units with respect to the second communication session, the non-revoiced transcription unit instead of one of the revoiced transcription units to generate a transcription of the second audio data to direct to the device such that none of the revoiced transcription units generate a transcription of the second audio data, the second availability of the revoiced transcription units indicating that at least one revoiced transcription unit is available.
2. The method of claim 1, wherein the first availability of revoiced transcription units is based on one or more of: a current peak number of transcriptions being generated, a current average number of transcriptions being generated, a projected peak number of transcriptions to be generated, a projected average number of transcriptions to be generated, a projected number of the revoiced transcription units, and a number of available revoiced transcription units.
3. The method of claim 1, wherein in response to the selection of the revoiced transcription unit the non-selected non-revoiced transcription unit does not generate a transcription of the first audio data.
4. The method of claim 1, wherein the automatic speech recognition system is trained specifically for a speaker used to generate the revoiced audio.
5. The method of claim 4, wherein a second automatic speech recognition system used by the non-revoiced transcription unit to generate the transcription of the second audio data is trained for a plurality of speakers.
6. At least one non-transitory computer-readable media configured to store one or more instructions that in response to being executed by at least one computing system cause performance of the method of claim 1.
7. A method comprising: obtaining, at a system, first audio data during a first communication session that includes a device; selecting, automatically and independently by the system based on a first availability of a plurality of first transcription units, a first transcription unit of the plurality of first transcription units instead of one of a plurality of second transcription units to generate a transcription of the first audio data to direct to the device, wherein the plurality of first transcription units use a first process to generate transcripts and the plurality of second transcription units use a second process to generate transcripts that is different than the first process; in response to selecting the first transcription unit, directing, by the system, the transcription generated by the first transcription unit to the device; obtaining, by the system, second audio data during a second communication session that includes the device and that is different from the first communication session; and selecting, automatically and independently by the system based on a second availability of the plurality of first transcription units, a second transcription unit of the plurality of second transcription units instead of one of the plurality of first transcription units to generate a transcription of the second audio data to direct to the device, the second availability of the plurality of first transcription units indicating that at least one of the plurality of first transcription units is estimated to be available for at least a portion of the second communication session.
8. The method of claim 7, wherein the first availability of the plurality of first transcription units is based on one or more of: a current peak number of transcriptions being generated, a current average number of transcriptions being generated, a projected peak number of transcriptions to be generated, a projected average number of transcriptions to be generated, a projected number of the plurality of first transcription units, and a number of available units of the plurality of first transcription units.
9. The method of claim 7, wherein the plurality of first transcription units are revoiced transcription units.
10. The method of claim 9, wherein the plurality of second transcription units are non-revoiced transcription units.
11. The method of claim 9, further comprising obtaining revoiced audio generated by a revoicing of the first audio data, wherein the selected first transcription unit uses the revoiced audio and an automatic speech recognition system to generate the transcription of the first audio data.
12. The method of claim 11, wherein the automatic speech recognition system is trained specifically for a speaker used to generate the revoiced audio.
13. The method of claim 12, wherein a second automatic speech recognition system used by one or more of the plurality of second transcription units is trained for a plurality of speakers.
14. At least one non-transitory computer-readable media configured to store one or more instructions that in response to being executed by at least one computing system cause performance of the method of claim 7.
15. A system comprising: at least one computing system; and at least one computer-readable media coupled to the at least one computing system, the at least one computer-readable media configured to store one or more instructions that in response to being executed by the at least one computing system cause performance of operations, the operations comprising: obtain first audio data during a first communication session that includes a device; select, automatically and independently by the computing system based on a first availability of a plurality of first transcription units, a first transcription unit of the plurality of first transcription units instead of one of a plurality of second transcription units to generate a transcription of the first audio data to direct to the device, wherein the plurality of first transcription units use a first process to generate transcripts and the plurality of second transcription units use a second process to generate transcripts that is different than the first process; in response to selecting the first transcription unit, direct the transcription generated by the first transcription unit to the device; obtain second audio data during a second communication session that includes the device and that is different from the first communication session; and select, automatically and independently by the computing system based on a second availability of the plurality of first transcription units, a second transcription unit of the plurality of second transcription units instead of one of the plurality of first transcription units to generate a transcription of the second audi
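
The claims list several inputs that availability may be based on (current and projected transcription counts, projected and available unit counts) and allow a non-revoiced unit to be chosen even when at least one revoiced unit is available. A minimal sketch of one way those inputs could be combined is shown below. The `LoadSnapshot` type, its field names, and the formula in `spare_revoiced_units` are hypothetical; the claims do not specify how the inputs are weighed.

```python
from dataclasses import dataclass


@dataclass
class LoadSnapshot:
    available_units: int  # revoiced units idle right now
    projected_units: int  # revoiced units expected to be in service
    projected_peak: int   # projected peak number of concurrent transcriptions


def spare_revoiced_units(s: LoadSnapshot) -> int:
    """Hypothetical estimate of revoiced units that can be committed to a new session."""
    projected_spare = s.projected_units - s.projected_peak
    # Never report more spare units than are idle now, and never a negative count.
    return max(0, min(s.available_units, projected_spare))


def use_revoiced(s: LoadSnapshot) -> bool:
    """True if the new session should be given a revoiced unit under this sketch."""
    return spare_revoiced_units(s) > 0
```

Under these assumptions, `LoadSnapshot(available_units=2, projected_units=10, projected_peak=10)` yields no spare capacity, so the session would go to a non-revoiced unit even though two revoiced units are idle, which mirrors the situation described at the end of claim 1.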

Assignees

Sorenson IP Holdings, LLC

Inventors

Classifications

Classification	CPC title
G10L15/32 (Primary)	Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
H04M2201/40	using speech recognition
G10L15/28 (Primary)	Constructional details of speech recognition systems
G10L15/22	Procedures used during a speech recognition process, e.g. man-machine dialogue
H04M3/42382	Text-based messaging services in telephone networks such as PSTN/ISDN, e.g. User-to-User Signalling or Short Message Service for fixed networks

Patent family

Related publications grouped by family.

View patent family 69570809

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11145312B2 cover?: A method may include obtaining first audio data originating at a first device during a communication session between the first device and a second device. The method may also include obtaining an availability of revoiced transcription units in a transcription system and in response to establishment of the communication session, selecting, based on the availability of revoiced transcription unit…
Who is the assignee on this patent?: Sorenson IP Holdings, LLC