Correcting transcribed audio files with an email-client interface

US9715876B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9715876-B2
Application number  US-201414158311-A
Country             US
Kind code           B2
Filing date         Jan 17, 2014
Priority date       Apr 17, 2006
Publication date    Jul 25, 2017
Grant date          Jul 25, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for requesting a transcription of audio data. One method includes displaying a send-for-transcription button within an email-client interface on a computer-controlled display, and automatically sending a selected email message and associated audio data to a transcription server as a request for a transcription of the associated audio data when a user selects the send-for-transcription button.
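The flow in the abstract (a button press that forwards the selected email and its audio to a transcription server) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name `build_transcription_request`, the `X-Transcription-Request` header, and the example addresses are hypothetical, and actual delivery to the server is left out.

```python
from email.message import EmailMessage


def build_transcription_request(
    selected: EmailMessage, user_addr: str, server_addr: str
) -> EmailMessage:
    """Package the audio attachments of the selected email as a request.

    Hypothetical helper: this is what a send-for-transcription button
    handler might call before handing the message to a mail transport.
    """
    request = EmailMessage()
    request["From"] = user_addr
    request["To"] = server_addr
    request["Subject"] = f"Transcription request: {selected['Subject']}"
    request["X-Transcription-Request"] = "1"  # hypothetical marker header
    request.set_content("Please transcribe the attached audio.")

    # Copy only the audio attachments from the selected message.
    for part in selected.iter_attachments():
        if part.get_content_maintype() == "audio":
            request.add_attachment(
                part.get_payload(decode=True),
                maintype="audio",
                subtype=part.get_content_subtype(),
                filename=part.get_filename(),
            )
    return request
```

Keeping the request an ordinary email message means the server could, in principle, read sender metadata from the same message it receives the audio in.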

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving a selection of one of a plurality of email messages delivered to an email-client responsive to a user interaction within the email-client; displaying a send-for-transcription selection mechanism within the email-client on a computer-controlled display, the send-for-transcription selection mechanism associated with a voice-independent model; wherein the voice-independent model is trained based on a plurality of corrected text data sets of transcription requests for audio data representing speech of more than one speaker of a plurality of speakers; responsive to an activation of the send-for-transcription selection mechanism, transmitting a communication identifying the selected email message; wherein the communication is transmitted to a remote system that is associated with the voice-independent model and is to ascertain whether to segment an audio file corresponding to the communication based on a comparison of a duration associated with the audio file to a threshold, generate a transcription of a subset of segments of a segmentation of the audio file based on the voice-independent model and train the voice-independent model based on a corrected text data set for the subset of the segments; and responsive to transmission of the communication, receive transcription data associated with a transcription corresponding to a remainder of the segments, wherein the transcription corresponding to the remainder of the segments is determined based on the voice independent model as trained based on the corrected text data set.

2. The method of claim 1, further comprising displaying a status of the selected email message within the email-client, wherein the status indicates at least one of whether the selected email message has been sent to a transcription server of the remote system, whether transcribed text based on the audio file has been received, or whether corrected text data associated with the transcribed text has been received.

3. The method of claim 2, further comprising playing the audio file within the email-client.

4. The method of claim 1, further comprising receiving the transcription corresponding to the remainder of the segments from the remote system.

5. The method of claim 4, further comprising displaying the transcription data within the email-client.

6. The method of claim 5, further comprising receiving corrected text data associated with the transcription corresponding to the remainder of the segments.

7. The method of claim 6, further comprising sending the corrected text data to the remote system.

8. A method, comprising: receiving a selected one of a plurality of email messages associated with an email-client; correlating the received email message to an account of a plurality of accounts; obtaining stored account settings of the correlated account; ascertaining whether to segment an audio file corresponding to the received email message based on a comparison of a duration associated with the audio file to a threshold; in response to an ascertainment to segment the audio file, segmenting the audio file into segments and generating a transcription of a subset of the segments based on the account settings and a voice-independent model trained based on a plurality of corrected text data sets of transcription requests for audio data representing speech of more than one speaker of a plurality of speakers; and additionally training the voice-independent model based on a corrected text data set for the subset of the segments; generating a transcription corresponding to a remainder of the segments based on the additionally trained voice independent model; and electronically transmitting a communication over an electronic network to cause transcription data associated with the transcription corresponding to the remainder of the segments to be delivered to the email-client as a response to an activation of a send-for-transcription selection mechanism of the email-client.

9. The method of claim 8, wherein the account settings include at least one of transcribed text delivery settings, transcription settings, or transcription format settings.

10. The method of claim 8, wherein correlating the received email message to the account of the plurality of accounts further comprises identifying the account based on metadata taken from the received email message.

11. The method of claim 10, wherein the metadata comprises at least one of an email address or an internet protocol address associated with the received email message.

12. The method of claim 8, wherein correlating the received email message to the account of the plurality of accounts further comprises identifying the account based on information included in the body of the email message.

13. A memory device having instructions stored thereon that, in response to execution by a processing device, cause the processing device to perform operations comprising: in response to receiving a communication, correlating the communication to an account of a plurality of accounts; wherein the communication includes a selected one of a plurality of email messages associated with an email-client; and obtaining stored account settings of the correlated account; ascertaining whether to segment an audio file corresponding to the communication based on a comparison of a duration associated with the audio file to a threshold; in response to an ascertainment to segment the audio file, segmenting the audio file into segments and generating a transcription of a subset of the segments based on the account settings and a voice-independent model trained based on a plurality of corrected text data sets of transcription requests for audio data representing speech of more than one speaker of a plurality of speakers; additionally training the voice-independent model based on a corrected text data set for the subset of the segments; generating a transcription corresponding to a remainder of the segments based on the additionally trained voice independent model; and electronically transmitting a communication over an electronic network to cause transcription data associated with the transcription corresponding to the remainder of the segments to be delivered to the email-client as a response to an activation of a send-for-transcription selection mechanism of the email-client.

14. The memory device of claim 13, wherein the account settings include at least one of transcribed text delivery settings, transcription settings, or transcription format settings.

15. The memory device of claim 13, wherein correlating the communication to the account further comprises identifying the account based on metadata taken from the selected email message.

16. The memory device of claim 15, wherein the metadata comprises at least one of an email address or an internet protocol address associated with the selected email message.

17. The memory device of claim 13, wherein correlating the communication to the account further comprises identifying the account based on information included in the body of the selected email message.

18. An apparatus, comprising: means for identifying an account of a plurality of accounts in response to receiving a communication; wherein the communication includes a selected one of a plurality of email messages associated with an email-client; means for obtaining stored account settings of the identified account; means for segmenting an audio file corresponding to the communication responsive to a comparison of a duration associated
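The staged workflow recited in claims 1, 8, and 13 (compare the audio duration to a threshold, transcribe a subset of segments, train the model on the corrected text, then transcribe the remainder with the updated model) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `ToyModel` stands in for the voice-independent model and "trains" by memorizing word substitutions, audio is represented as one recognized word per second, and the threshold and segment lengths are arbitrary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyModel:
    """Hypothetical stand-in for the voice-independent model."""
    corrections: Dict[str, str] = field(default_factory=dict)

    def transcribe(self, segment: List[str]) -> str:
        # "Recognition" here just applies previously learned substitutions.
        return " ".join(self.corrections.get(w, w) for w in segment)

    def train(self, draft: str, corrected: str) -> None:
        # Learn word-level fixes from a draft and its corrected text.
        for wrong, right in zip(draft.split(), corrected.split()):
            if wrong != right:
                self.corrections[wrong] = right


def segment_audio(audio: List[str], threshold_s: int, segment_s: int) -> List[List[str]]:
    """Segment only when the duration (one item per second) exceeds the threshold."""
    if len(audio) <= threshold_s:
        return [audio]
    return [audio[i:i + segment_s] for i in range(0, len(audio), segment_s)]


def transcribe_in_stages(
    audio: List[str],
    model: ToyModel,
    correct: Callable[[str], str],  # e.g. a human reviewer fixing a draft
    threshold_s: int = 4,
    segment_s: int = 2,
    first_n: int = 1,
) -> str:
    segments = segment_audio(audio, threshold_s, segment_s)
    pieces = []
    for i, seg in enumerate(segments):
        draft = model.transcribe(seg)
        if len(segments) > 1 and i < first_n:
            # Subset of segments: correct the draft and train on the result.
            corrected = correct(draft)
            model.train(draft, corrected)
            pieces.append(corrected)
        else:
            # Remainder: transcribed with the model as trained so far.
            pieces.append(draft)
    return " ".join(pieces)


audio = ["teh", "cat", "sat", "on", "teh", "mat"]
fix = lambda draft: draft.replace("teh", "the")
print(transcribe_in_stages(audio, ToyModel(), fix))  # prints "the cat sat on the mat"
```

In this toy run only the first segment is corrected by the reviewer; the second "teh" in the last segment is fixed by the retrained model, which is the benefit the claims attribute to training on the early segments before transcribing the rest.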

Assignees

III Holdings 1, LLC

Inventors

Classifications

  • Computer-aided management of electronic mailing [e-mailing] · CPC title

  • Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title

  • Text processing (natural language analysis G06F40/20; semantic analysis G06F40/30; processing or translation of natural language G06F40/40) · CPC title

  • Format adaptation, e.g. format conversion or compression · CPC title

  • G10L15/26 · Primary

    Speech to text systems (G10L15/08 takes precedence) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9715876B2 cover?
Methods and systems for requesting a transcription of audio data. One method includes displaying a send-for-transcription button within an email-client interface on a computer-controlled display, and automatically sending a selected email message and associated audio data to a transcription server as a request for a transcription of the associated audio data when a user selects the send-for-tra…
Who is the assignee on this patent?
III Holdings 1, LLC
What technology area does this patent fall under?
Primary CPC classification G10L15/26. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 25, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).