Transcription of communications

US11600279B2 · US · B2

Patent metadata
Publication number: US-11600279-B2
Application number: US-201917279512-A
Country: US
Kind code: B2
Filing date: Aug 26, 2019
Priority date: Oct 8, 2018
Publication date: Mar 7, 2023
Grant date: Mar 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection. Read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method to transcribe communications may include obtaining audio data originating at a first device during a communication session between the first device and a second device and providing the audio data to an automated speech recognition system configured to transcribe the audio data. The method may further include obtaining multiple hypothesis transcriptions generated by the automated speech recognition system. Each of the multiple hypothesis transcriptions may include one or more words determined by the automated speech recognition system to be a transcription of a portion of the audio data. The method may further include determining one or more consistent words that are included in two or more of the multiple hypothesis transcriptions and in response to determining the one or more consistent words, providing the one or more consistent words to the second device for presentation of the one or more consistent words by the second device.
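The consistent-word step described in the abstract can be illustrated with a small sketch. This is not the patented implementation: it assumes that hypotheses arrive as plain strings, that "consistent" means the leading words on which two consecutive hypotheses agree, and that only words not yet sent are forwarded. The function names `consistent_prefix` and `stream_consistent_words` are hypothetical.

```python
from typing import List, Optional


def consistent_prefix(first: List[str], second: List[str]) -> List[str]:
    """Return the leading words on which two hypotheses agree (assumed criterion)."""
    agreed: List[str] = []
    for a, b in zip(first, second):
        if a != b:
            break
        agreed.append(a)
    return agreed


def stream_consistent_words(hypotheses: List[str]) -> List[List[str]]:
    """Compare each hypothesis with the one directly before it and
    return the batches of newly consistent words to send to the device."""
    sent: List[str] = []                  # words already forwarded
    batches: List[List[str]] = []
    previous: Optional[List[str]] = None
    for text in hypotheses:
        current = text.split()
        if previous is not None:
            agreed = consistent_prefix(previous, current)
            # Forward only words beyond what was already sent; earlier
            # words are not revised here (corrections are a separate step).
            new_words = agreed[len(sent):]
            if new_words:
                batches.append(new_words)
                sent.extend(new_words)
        previous = current
    return batches


if __name__ == "__main__":
    partials = ["the cat", "the cat sat", "the cat sat on", "the cat sat on a mat"]
    for batch in stream_consistent_words(partials):
        print(" ".join(batch))
```

Prefix agreement is only one possible reading of "included in two or more of the multiple hypothesis transcriptions"; a system with several speech engines might instead align and vote across hypotheses.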

First claim

Opening claim text (preview).

We claim:

1. A method to transcribe communications, the method comprising: obtaining a plurality of hypothesis transcriptions of audio data; determining a plurality of consistent words that are included in two or more of the plurality of hypothesis transcriptions; in response to determining the plurality of consistent words, directing the plurality of consistent words to a device for presentation of the plurality of consistent words; and presenting the plurality of consistent words in a rolling fashion, a pace of the presentation of the plurality of consistent words in the rolling fashion being variable such that when more words are to be presented the words are presented quicker than when fewer words are to be presented.

2. The method of claim 1, further comprising: determining an update word in a final transcription of the audio data that is different from any of the plurality of consistent words; and in response to determining the update word, directing an indication of the update word to the device, the update word replacing one or more of the plurality of consistent words in the presentation of the plurality of consistent words.

3. The method of claim 1, wherein the presentation of the plurality of consistent words by the device is configured to occur before a final transcription of the audio data is provided to the device.

4. The method of claim 1, wherein the plurality of hypothesis transcriptions are obtained from a speech recognition system that includes a plurality of speech engines configured to recognize speech.

5. At least one non-transitory computer-readable media configured to store one or more instructions that when executed by at least one processor cause or direct a system to perform the method of claim 1.

6. A method to transcribe communications, the method comprising: obtaining a plurality of hypothesis transcriptions of audio data; determining one or more consistent words that are included in two or more of the plurality of hypothesis transcriptions; in response to determining the one or more consistent words, directing the one or more consistent words to a device for presentation of the one or more consistent words; determining an update word in a final transcription of the audio data that is different from any of the one or more consistent words; and in response to determining the update word and after directing the one or more consistent words to the device, directing an indication of the update word to the device, the update word replacing one or more of the one or more consistent words in the presentation of the one or more consistent words on the device.

7. The method of claim 6, further comprising obtaining the audio data during a communication session between a second device and the device, the audio data originating at the second device.

8. The method of claim 6, wherein the presentation of the one or more consistent words by the device is configured to occur before the final transcription of the audio data is provided to the device.

9. The method of claim 6, wherein the one or more consistent words are presented in a rolling fashion, a pace of the presentation of the one or more consistent words in the rolling fashion being variable.

10. The method of claim 6, wherein the plurality of hypothesis transcriptions are obtained sequentially over time and determining the one or more consistent words includes comparing a first hypothesis transcription of the plurality of hypothesis transcriptions with a second hypothesis transcription of the plurality of hypothesis transcriptions, the second hypothesis transcription directly following the first hypothesis transcription among the plurality of hypothesis transcriptions.

11. The method of claim 6, wherein the plurality of hypothesis transcriptions are obtained from a speech recognition system that includes a plurality of speech engines configured to recognize speech.

12. At least one non-transitory computer-readable media configured to store one or more instructions that when executed by at least one processor cause or direct a system to perform the method of claim 6.

13. A method to transcribe communications, the method comprising: obtaining a plurality of hypothesis transcriptions of audio data, each of the plurality of hypothesis transcriptions including one or more words determined to be a transcription of portions of the audio data; determining one or more consistent words that are included in two or more of the plurality of hypothesis transcriptions, each of the two or more of the plurality of hypothesis transcriptions including words from a first portion of the audio data; in response to determining the one or more consistent words, providing the one or more consistent words to a device for presentation of the one or more consistent words by the device, the presentation of the one or more consistent words configured to occur before a final transcription of the audio data is provided to the device; and presenting the one or more consistent words in a rolling fashion, a pace of the presentation of the one or more consistent words in the rolling fashion being variable such that when more words are to be presented the words are presented quicker than when fewer words are to be presented.

14. The method of claim 13, wherein the plurality of hypothesis transcriptions are obtained from a speech recognition system that includes a plurality of speech engines configured to recognize speech.

15. The method of claim 13, wherein the plurality of hypothesis transcriptions are obtained from a speech recognition system that includes a single speech engine configured to recognize speech.

16. The method of claim 13, wherein a first portion of the audio data associated with a first one of the plurality of hypothesis transcriptions includes all of the audio data associated with at least one of the plurality of hypothesis transcriptions obtained previous to obtaining the first one of the plurality of hypothesis transcriptions.

17. The method of claim 13, wherein the plurality of hypothesis transcriptions are obtained sequentially over time and the determining the one or more consistent words includes comparing a first hypothesis transcription of the plurality of hypothesis transcriptions with a second hypothesis transcription of the plurality of hypothesis transcriptions, the second hypothesis transcription directly following the first hypothesis transcription among the plurality of hypothesis transcriptions.

18. The method of claim 13, further comprising: determining an update word in the final transcription that is different from any of the one or more consistent words; and in response to determining the update word and after directing the one or more consistent words to the device, directing an indication of the update word to the device, the update word replacing one or more of the one or more consistent words in the presentation of the one or more consistent words on the device.

19. At least one non-transitory computer-readable media configured to store one or more instructions that when executed by at least one processor cause or direct a system to perform the method of claim 13.
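Two ideas recur in the claims: a rolling presentation whose pace speeds up when more words are waiting, and update words from the final transcription that replace words already shown. The sketch below illustrates both under stated assumptions; it is not taken from the patent. The inverse-proportional delay rule, the `base_delay` and `min_delay` values, the position-by-position comparison, and the names `word_delay` and `apply_updates` are all hypothetical.

```python
from typing import List, Tuple


def word_delay(pending: int, base_delay: float = 0.4, min_delay: float = 0.05) -> float:
    """Seconds to wait before showing the next word.

    Assumed rule: the delay shrinks as the backlog grows, so words are
    presented more quickly when more words are waiting to be presented.
    """
    if pending <= 0:
        return base_delay
    return max(min_delay, base_delay / pending)


def apply_updates(displayed: List[str], final: List[str]) -> List[Tuple[int, str, str]]:
    """Replace displayed words that differ from the final transcription.

    Words are compared position by position (a simplifying assumption; a
    real system would likely align the two word sequences first).
    Returns (index, old_word, update_word) for each replacement and
    modifies `displayed` in place.
    """
    updates: List[Tuple[int, str, str]] = []
    for i, (old, new) in enumerate(zip(displayed, final)):
        if old != new:
            updates.append((i, old, new))
            displayed[i] = new
    return updates


if __name__ == "__main__":
    for backlog in (1, 2, 8, 100):
        print(backlog, word_delay(backlog))

    shown = ["their", "cat", "sat"]
    print(apply_updates(shown, ["the", "cat", "sat", "down"]))
    print(shown)
```

In this sketch the pace depends only on the number of pending words; a device could equally factor in word length or reading speed, which the claims leave open.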

Assignees

Sorenson IP Holdings LLC

Inventors

Classifications

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • G10L15/26 (Primary)

    Speech to text systems (G10L15/08 takes precedence) · CPC title

  • Announcement of recognition results · CPC title

  • Speech classification or search · CPC title

  • using speech recognition · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11600279B2 cover?
A method to transcribe communications may include obtaining audio data originating at a first device during a communication session between the first device and a second device and providing the audio data to an automated speech recognition system configured to transcribe the audio data. The method may further include obtaining multiple hypothesis transcriptions generated by the automated speec…
Who is the assignee on this patent?
Sorenson IP Holdings LLC
What technology area does this patent fall under?
Primary CPC classification G10L15/26. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 7, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).