What technology area does this patent fall under?

Primary CPC classification G10L15/26. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 27, 2016 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Content-based audio playback emphasis

US9454965B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9454965-B2
Application number	US-201514852021-A
Country	US
Kind code	B2
Filing date	Sep 11, 2015
Priority date	Aug 20, 2004
Publication date	Sep 27, 2016
Grant date	Sep 27, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example, playing them back more slowly than regions that are of low relevance and likely to have been transcribed correctly. Emphasizing those regions of the audio stream that are most important to transcribe correctly and those regions that are most likely to have been transcribed incorrectly increases the likelihood that the proofreader will accurately correct any errors in those regions, thereby improving the overall accuracy of the transcript.
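The playback behavior described in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the `Region` fields, the boolean flags, and the two rate values are hypothetical choices made only to show how slowing selected regions changes the playback schedule.

```python
from dataclasses import dataclass


@dataclass
class Region:
    text: str
    duration_s: float     # duration of the region in the original audio
    relevant: bool        # hypothetical flag: region is highly relevant
    likely_correct: bool  # hypothetical flag: transcript is probably right


def playback_rate(region, slow_rate=0.75, normal_rate=1.0):
    """Play a region slowly if it is relevant or likely mis-transcribed."""
    if region.relevant or not region.likely_correct:
        return slow_rate
    return normal_rate


def schedule(regions):
    """Return (text, rate, played duration in seconds) for each region."""
    plan = []
    for region in regions:
        rate = playback_rate(region)
        plan.append((region.text, rate, region.duration_s / rate))
    return plan


regions = [
    Region("the patient was seen today", 2.0, relevant=False, likely_correct=True),
    Region("allergic to penicillin", 3.0, relevant=True, likely_correct=True),
    Region("follow up in two weeks", 1.5, relevant=False, likely_correct=False),
]
plan = schedule(regions)
```

In this sketch the first region keeps its original 2.0 s, while the second and third are stretched to 4.0 s and 2.0 s, so the proofreader spends more time on the regions that matter most or are most likely wrong.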

First claim

Opening claim text (preview).

What is claimed is:
1. A method performed by a computer processor executing computer program instructions tangibly stored on a first computer-readable medium to perform a method comprising: (A) deriving, from a region of a document and a corresponding region of a spoken audio stream, a likelihood score representing a likelihood that the region of the document correctly represents content in the corresponding region of the spoken audio stream, and tangibly storing a representation of the likelihood score in a second computer-readable medium; (B) selecting a relevance score representing a measure of relevance of the region of the spoken audio stream, the measure of relevance representing a measure of importance that the region of the spoken audio stream be brought to the attention of a human proofreader, and tangibly storing a representation of the relevance score in a third computer-readable medium; (C) deriving, by dividing the relevance score by the likelihood score, an emphasis factor for modifying an emphasis placed on the region of the spoken audio stream when played back, and storing a representation of the emphasis factor in a fourth computer-readable medium; and (D) modifying, in accordance with the emphasis factor, the emphasis placed on the region of the spoken audio stream by gradually increasing an emphasis factor applied to at least one word occurring before a first word in the region of the spoken audio stream, producing an emphasis-adjusted audio stream.
2. The method of claim 1, wherein (C) comprises deriving, from the likelihood and the measure of relevance, a timescale adjustment factor for adjusting a playback rate of the region of the spoken audio stream.
3. The method of claim 1, wherein (C) comprises deriving, from the likelihood and the measure of relevance, a signal power adjustment factor for adjusting a signal power of the region of the spoken audio stream.
4. The method of claim 1, further comprising: (E) playing the emphasis-adjusted audio stream.
5. The method of claim 4, further comprising: (F) correcting errors in the document based on the emphasis-adjusted audio stream.
6. A method performed by a computer processor executing computer program instructions tangibly stored on a first computer-readable medium to perform a method comprising: (A) deriving, from a region of a document and a corresponding region of a spoken audio stream, a likelihood score representing a likelihood that the region of the document correctly represents content in the corresponding region of the spoken audio stream, and tangibly storing a representation of the likelihood score in a second computer-readable medium; (B) selecting a relevance score representing a measure of relevance of the region of the spoken audio stream, the measure of relevance representing a measure of importance that the region of the spoken audio stream be brought to the attention of a human proofreader, and tangibly storing a representation of the relevance score in a third computer-readable medium; (C) deriving, by dividing the relevance score by the likelihood score, an emphasis factor for modifying an emphasis placed on the region of the spoken audio stream when played back, and storing a representation of the emphasis factor in a fourth computer-readable medium; and (D) modifying, in accordance with the emphasis factor, the emphasis placed on the region of the spoken audio stream by gradually decreasing the emphasis factor applied to at least one word occurring after a last word in the region of the spoken audio stream, producing an emphasis-adjusted audio stream.
7. The method of claim 6, wherein (C) comprises deriving, from the likelihood and the measure of relevance, a timescale adjustment factor for adjusting a playback rate of the region of the spoken audio stream.
8. The method of claim 6, wherein (C) comprises deriving, from the likelihood and the measure of relevance, a signal power adjustment factor for adjusting a signal power of the region of the spoken audio stream.
9. The method of claim 6, further comprising: (E) playing the emphasis-adjusted audio stream.
10. The method of claim 9, further comprising: (F) correcting errors in the document based on the emphasis-adjusted audio stream.
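Steps (C) and (D) of claims 1 and 6 can be sketched roughly as follows. The division of relevance by likelihood comes from the claim text. Everything else is an assumption made for illustration: the function names, the `floor` guard against division by zero, the linear shape of the ramp, the `ramp` width, and the baseline factor of 1.0. The claims say only that the factor changes "gradually" on words before and after the region.

```python
def emphasis_factor(relevance, likelihood, floor=1e-6):
    """Step (C): relevance score divided by likelihood score.

    The floor is an assumed guard so a likelihood of zero cannot
    cause a division by zero.
    """
    return relevance / max(likelihood, floor)


def ramped_factors(n_words, start, end, peak, ramp=2, base=1.0):
    """Step (D), sketched per word.

    Words in [start, end) get the peak factor. Up to `ramp` words
    before the region rise linearly toward the peak (claim 1), and up
    to `ramp` words after it fall linearly back to base (claim 6).
    Overlapping regions are not handled in this sketch.
    """
    factors = [base] * n_words
    for i in range(start, end):
        factors[i] = peak
    for k in range(1, ramp + 1):
        # k = 1 is the word adjacent to the region, so it is closest to peak.
        value = peak + (base - peak) * k / (ramp + 1)
        before = start - k
        after = end - 1 + k
        if before >= 0:
            factors[before] = value
        if after < n_words:
            factors[after] = value
    return factors


peak = emphasis_factor(relevance=1.0, likelihood=0.5)
factors = ramped_factors(n_words=5, start=2, end=3, peak=peak, ramp=1)
```

Here a region with relevance 1.0 and likelihood 0.5 gets a factor of 2.0, and the neighboring words on each side get 1.5, so the emphasis does not change abruptly at the region boundary. Under claims 2 and 3, such a factor could then drive a playback-rate or signal-power adjustment. That mapping is left out because the claims do not specify it.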

Assignees

MModal IP LLC

Inventors

Classifications

G10L15/1807
using prosody or stress · CPC title
G10L21/04
Time compression or expansion · CPC title
G10L15/26 · Primary
Speech to text systems (G10L15/08 takes precedence) · CPC title
G06F40/232
Orthographic correction, e.g. spell checking or vowelisation · CPC title
G10L15/22 · Primary
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Patent family

Related publications grouped by family.

View patent family 37718652

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9454965B2 cover?: Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example…
Who is the assignee on this patent?: MModal IP LLC