US12400660B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12400660-B2 |
| Application number | US-202318219889-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 10, 2023 |
| Priority date | Feb 28, 2014 |
| Publication date | Aug 26, 2025 |
| Grant date | Aug 26, 2025 |
Official abstract text for this publication.
A method and system for providing captioned telephone service, the method comprising the steps of: initiating a first captioned telephone service call; during the first captioned telephone service call, creating a first set of captions using a call assistant; simultaneous with creating the first set of captions using a call assistant, creating a second set of captions using an automated speech recognition engine; comparing the first set of captions and the second set of captions using a scoring algorithm based on errors between the first and second sets of captions to generate a score for the second set of captions; in response to the score being within a predetermined threshold range, continuing the call using only the automated speech recognition engine to generate text; and in response to the score being outside of the predetermined threshold range, continuing the call using a call assistant to generate captions.
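The comparison step in the abstract can be illustrated with a minimal sketch. The abstract does not name the scoring algorithm, so this example assumes a word-level edit distance normalized by the length of the call assistant's captions (a word error rate). The function names, the text normalization, and the 5% threshold are hypothetical and are not taken from the patent.

```python
def word_errors(reference: list[str], hypothesis: list[str]) -> int:
    """Word-level Levenshtein distance: substitutions, insertions, deletions."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def score_asr_captions(ca_captions: str, asr_captions: str) -> float:
    """Error rate of the ASR captions, treating the call assistant's
    captions as the reference (an assumption of this sketch)."""
    ref = ca_captions.lower().split()
    hyp = asr_captions.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0
    return word_errors(ref, hyp) / len(ref)


def choose_caption_source(score: float, max_error_rate: float = 0.05) -> str:
    """Continue with ASR only while the score stays inside the hypothetical
    threshold range [0, max_error_rate]; otherwise use the call assistant."""
    return "asr" if 0.0 <= score <= max_error_rate else "call_assistant"


if __name__ == "__main__":
    ca = "please call me back after three today"
    asr = "please call me back after tree today"
    score = score_asr_captions(ca, asr)
    print(round(score, 3), choose_caption_source(score))
```

In this sketch one substituted word out of seven gives an error rate of about 0.143, which falls outside the assumed range, so the call would continue with the call assistant.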
Opening claim text (preview).
What is claimed is:

1. A method for providing captioned telephone service, the method comprising the steps of: initiating a first captioned telephone service call; creating, using an automated speech recognition (ASR) engine, a first set of captions based on words spoken by a first party to the first captioned telephone service call; displaying the first set of captions to a second party to the first captioned telephone service call; measuring the accuracy of the first set of captions with respect to the words spoken by the first party; in response to the accuracy of the captions being above a predetermined acceptable accuracy threshold level, continuing to create captions using the ASR engine; and in response to the accuracy of the captions being below the predetermined acceptable accuracy threshold level, providing the words spoken by the first party thereafter to a call assistant for at least a portion of the remainder of the call to generate captions for the at least a portion of the remainder of the captioned telephone service call; and displaying the captions generated by the call assistant to the second party during the at least a portion of the remainder of the call.

2. The method of claim 1 wherein measuring the accuracy of the first set of captions with respect to the words spoken by the first party further comprises creating, using a call assistant, a second set of captions based on the words spoken by the first party and comparing the first set of captions and the second set of captions to identify errors between the two caption sets.

3. The method of claim 1 wherein the second set of captions is considered true and any difference between the first set of captions and the second set of captions is recognized as an error in the first set of captions.

4. The method of claim 3 wherein, as errors are recognized in the first set of captions, the errors in the captions that have been displayed to the second user are corrected.

5. The method of claim 4 wherein errors in the displayed captions are corrected in line within the displayed captions.

6. The method of claim 5 wherein in line error corrections are visually distinguished from other captions when displayed.

7. The method of claim 1 in which the automated speech recognition engine is selected from a plurality of automated speech recognition engines and wherein the measured accuracy of the first set of captions created by the ASR engine is stored and correlated with the engine in a database.

8. The method of claim 1 further comprising: initiating a second captioned telephone service call; selecting an ASR engine for the second captioned telephone service call; creating, using the selected ASR engine, captions based on words spoken by the first party; and displaying the captions created using the selected ASR engine to a second party to the second captioned telephone service call.

9. The method of claim 1 further including, while the first set of captions are displayed, presenting an accuracy indicator indicating current accuracy of the first set of captions.

10. The method of claim 9 wherein the accuracy indicator includes a % accuracy of the first set of captions over a rolling prior period of time.

11. The method of claim 9 wherein the first set of captions and the accuracy indicator are presented on a display screen, the method further including presenting an option on the display screen to switch from the ASR engine generating the first set of captions to a human call assistant generating captions presented on the display screen.

12. The method of claim 1 further including broadcasting the words spoken by the first party via a speaker to the second party.

13. The method of claim 12 wherein, while the captions generated by the human call assistant for the remainder of the call are displayed to the second party, the step of broadcasting the words spoken is delayed a predetermined duration so that the broadcast words are more aligned in time with the displayed captions generated by the human call assistant.

14. The method of claim 12 wherein, while the captions created by the human call assistant for the remainder of the call are displayed to the second party, the step of broadcasting the words spoken is delayed a duration of time wherein the duration of time is controlled as a function of the accuracy of the first set of captions over a prior rolling period of time wherein the duration of time is increased when the accuracy of the first set of captions over the prior rolling period of time has been low and is decreased when the accuracy of the first set of captions over the prior rolling period of time has been high.

15. The method of claim 1 further including displaying a caption resource indication to the second party indicating one of that the displayed captions are being generated via an ASR and that the displayed captions are being generated via a call assistant.

16. The method of claim 1 wherein the second party communicates via an assisted user's (AU's) communication device including a processor that runs the ASR engine to generate the first set of captions and wherein the call assistant generated captions are generated at a location remote from the AU's communication device.

17. A captioning system for providing captioned telephone service to an assisted user (AU) during an ongoing call with a hearing user (HU), the system comprising: at least a first processor programmed to perform the steps of: establishing a first captioned telephone service; running an automated speech recognition engine to generate a first set of captions based on the HU voice signal; displaying the first set of captions to the AU; measuring the accuracy of the first set of captions with respect to words spoken by the HU; in response to the accuracy of the first set of captions being below a predetermined acceptable accuracy threshold level, obtaining captions generated by a call assistant for words spoken after the accuracy drops below the accuracy threshold level for at least a portion of the remainder of the call; and displaying the captions generated by the human call assistant to the AU during the at least a portion of the remainder of the call.

18. A method for providing a communication service, the method comprising the steps of: initiating a first communication session between an assisted user (AU) and a hearing user (HU); creating a first set of captions based on words spoken by the HU using a first captioning process including a call assistant (CA); simultaneously with creating the first set of captions using the first captioning process, creating a second set of captions using a second captioning process wherein HU voice signal is transcribed into the second set of captions using an automated speech recognition (ASR) engine; comparing the first set of captions and the second set of captions to assess accuracy of the second set of captions; in response to the second set of captions being accurate above a predetermined threshold level, continuing the first communication session using only the second captioning process; and in response to the second set of captions being less accurate than the predetermined threshold level, continuing the first communication session using the first captioning process.

19. The method of claim 18 wherein the step of comparing is performed substantially in real time during the first communication session.

20. The method of claim 18 wherein the step of comparing includes assessing a percent of the s
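Claims 1, 10, and 14 describe an accuracy figure tracked over a rolling prior period, a threshold that decides whether the ASR engine or a call assistant produces captions, and a broadcast delay that grows as recent accuracy falls. The sketch below shows one way these pieces could fit together. The class and function names, the 30-second window, the 90% threshold, the delay bounds, and the linear delay mapping are all assumptions made for illustration; the claims do not specify them.

```python
from collections import deque


class RollingAccuracy:
    """Percent accuracy of ASR captions over a rolling prior time window."""

    def __init__(self, window_seconds: float = 30.0):  # hypothetical window
        self.window_seconds = window_seconds
        self._samples = deque()  # (timestamp, words_correct, words_total)

    def add(self, timestamp: float, words_correct: int, words_total: int) -> None:
        self._samples.append((timestamp, words_correct, words_total))
        cutoff = timestamp - self.window_seconds
        # Drop samples that have aged out of the rolling window.
        while self._samples and self._samples[0][0] < cutoff:
            self._samples.popleft()

    def percent(self) -> float:
        total = sum(sample[2] for sample in self._samples)
        if total == 0:
            return 100.0  # assumption: no evidence of errors yet
        correct = sum(sample[1] for sample in self._samples)
        return 100.0 * correct / total


def caption_source(accuracy_percent: float, threshold: float = 90.0) -> str:
    """Keep ASR at or above the (hypothetical) threshold, else hand off to a
    call assistant. The claims leave the exactly-equal case unspecified."""
    return "asr" if accuracy_percent >= threshold else "call_assistant"


def broadcast_delay(accuracy_percent: float,
                    min_delay: float = 0.5,
                    max_delay: float = 3.0) -> float:
    """Delay in seconds: longer when recent accuracy was low, shorter when it
    was high. Linear interpolation is an assumption of this sketch."""
    clamped = max(0.0, min(100.0, accuracy_percent))
    return max_delay - (max_delay - min_delay) * clamped / 100.0


if __name__ == "__main__":
    tracker = RollingAccuracy(window_seconds=30.0)
    tracker.add(0.0, 9, 10)
    tracker.add(10.0, 10, 10)
    accuracy = tracker.percent()
    print(accuracy, caption_source(accuracy), broadcast_delay(accuracy))
```

A caller would feed `add` with per-utterance counts from whatever accuracy measurement is in use, then consult `caption_source` and `broadcast_delay` after each update.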
Assessment or evaluation of speech recognition systems · CPC title
for measuring the quality of voice signals · CPC title
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
Language aspects · CPC title
Medium conversion · CPC title