Aligning body matter across content formats

US10109278B2 · US · B2

Patent metadata
Publication number: US-10109278-B2
Application number: US-201213604486-A
Country: US
Kind code: B2
Filing date: Sep 5, 2012
Priority date: Aug 2, 2012
Publication date: Oct 23, 2018
Grant date: Oct 23, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A content alignment service is described that may generate content synchronization information to facilitate the synchronous presentation of corresponding audio content and textual content. In some embodiments, portions of body text (as opposed to front matter, such as a table of contents; or back matter, such as an index) in the textual content are identified and synchronized with corresponding audio content. In one example application, an audiobook may be synchronized with an electronic book. As the body text portions of the electronic book are consumed, corresponding words of the audiobook may be audibly presented.
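Illustrative sketch (not taken from the patent): one way to picture "content synchronization information" is as a sorted list of body-text spans, each mapped to an interval of the audio. The `SyncEntry` layout, the character-offset convention, and the linear interpolation inside a span are all assumptions made for this example. Positions that fall outside every span (front matter, back matter) simply have no audio counterpart.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SyncEntry:
    """One body-text span mapped to an audio interval (hypothetical layout)."""
    text_start: int     # character offset where the body-text span begins
    text_end: int       # character offset just past the span (must exceed text_start)
    audio_start: float  # seconds into the audiobook
    audio_end: float


def audio_position(entries: List[SyncEntry], text_offset: int) -> Optional[float]:
    """Return the audio time for a reading position, or None outside body text.

    `entries` must be sorted by text_start and must not overlap.
    """
    starts = [e.text_start for e in entries]
    i = bisect_right(starts, text_offset) - 1
    if i < 0:
        return None  # before the first synchronized span, e.g. front matter
    e = entries[i]
    if text_offset >= e.text_end:
        return None  # past the span: a gap or back matter
    # Assumption: interpolate linearly within the span.
    frac = (text_offset - e.text_start) / (e.text_end - e.text_start)
    return e.audio_start + frac * (e.audio_end - e.audio_start)


entries = [
    SyncEntry(100, 200, 0.0, 10.0),
    SyncEntry(200, 300, 10.0, 30.0),
]
print(audio_position(entries, 50))   # None (before any body text)
print(audio_position(entries, 150))  # 5.0
print(audio_position(entries, 250))  # 20.0
```

A finer-grained map (for instance one entry per word) would make the interpolation step unnecessary; the span-level version is used here only to keep the example short.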

First claim

Opening claim text (preview).

What is claimed is:

1. A system for aligning content, the system comprising: an electronic data store configured to store: an electronic book comprising: a plurality of paragraphs of body text, and matter other than body text, wherein the matter other than body text comprises text within at least front matter and back matter; and an audiobook that is a companion to the electronic book; and a physical computing device in communication with the electronic data store, the physical computing device configured to: generate a textual transcription of the audiobook by applying a speech-to-text recognition routine on the audiobook; identify a portion of the textual transcription that includes text also included in a paragraph of the electronic book; determine a level of correlation between words in the paragraph of the electronic book and words in the portion of the textual transcription; determine that the level of correlation satisfies a threshold value; in response to determining that the level of correlation satisfies the threshold value, identify the paragraph of the electronic book as body text; identify a first portion of the electronic book that does not satisfy the threshold value with respect to the textual transcription; determine that the first portion of the electronic book that does not satisfy the threshold value is front matter based at least in part on a determination that the first portion of the electronic book that does not satisfy the threshold value appears within the electronic book prior to an earliest portion of the electronic book for which a corresponding portion of the audiobook is identified; identify a second portion of the electronic book that does not satisfy the threshold value with respect to the textual transcription; determine that the second portion of the electronic book that does not satisfy the threshold value is back matter based at least in part on a determination that the second portion of the electronic book that does not satisfy the threshold value appears within the electronic book after a last portion of the electronic book for which a corresponding portion of the audiobook is identified; and generate content synchronization information that identifies (a) portions of the audiobook that correspond to the paragraphs of the body text and (b) further identifies the matter other than body text in the electronic book, wherein the content synchronization information indicates that the matter other than body text in the electronic book, including the first portion and second portion of the electronic book, does not correspond to any portion of the audiobook, wherein the content synchronization information indicates that the paragraph, excluding the matter other than body text, should be presented in synchronization with a portion of the audiobook from which the corresponding portion of the textual transcription was generated.

2. The system of claim 1, wherein the physical computing device is further configured to provide the content synchronization information to a separate computing device.

3. The system of claim 1, wherein the physical computing device is further configured to synchronously present the paragraph of the electronic book and the portion of the audiobook from which the corresponding portion of the textual transcription was generated.

4. A computer-implemented method for aligning content, the computer-implemented method comprising: as implemented by one or more computing devices configured with specific computer-executable instructions, obtaining a textual transcription of an item of content comprising audio content; identifying a portion of the textual transcription that includes text also included in a portion of a companion item of textual content, wherein the textual content includes body text and matter other than body text; determining a level of correlation between words in the portion of the companion item of textual content and words in the portion of the textual transcription; determining that the level of correlation satisfies a threshold value; in response to determining that the level of correlation satisfies a threshold value, identifying the portion of the companion item of textual content as including body text; identifying a second portion of the companion item of textual content that does not satisfy the threshold value with respect to any portion of the textual transcription; determining that the second portion of the companion item of textual content that does not satisfy the threshold value is front matter based at least in part on a determination that the second portion of the companion item of textual content that does not satisfy the threshold value appears within the companion item of textual content prior to an earliest portion of the companion item of textual content for which a corresponding portion of the audio content is identified; and generating content synchronization information that indicates (a) portions of the audio content that correspond to body text of the companion item of textual content and (b) further indicates that the matter other than body text in the textual content does not correspond to any portion of the audio content, wherein the matter other than body text includes the second portion of the companion item of textual content determined to be front matter, wherein the content synchronization information indicates that the body text included in the portion of the companion item of textual content should be presented in synchronization with a portion of the audio content that corresponds to the body text included in the portion of the textual transcription.

5. The computer-implemented method of claim 4, wherein obtaining the textual transcription comprises generating the textual transcription from the audio content.

6. The computer-implemented method of claim 4, wherein determining the level of correlation between words in the portion of the companion item of textual content and words in the portion of the textual transcription comprises computing a correlation measure for a block of the companion item of textual content with respect to the textual transcription, the block comprising one or more portions of the companion item of textual content.

7. The computer-implemented method of claim 4, wherein the body text portion comprises at least one of a word, a phrase, a sentence, a paragraph, and a line of dialogue.

8. The computer-implemented method of claim 4, wherein the companion item of textual content is an electronic book.

9. The computer-implemented method of claim 4, wherein the item of content comprising audio content is an audiobook.

10. The computer-implemented method of claim 4, wherein the item of content comprising audio content further comprises video content.

11. A system for aligning content, the system comprising: an electronic data store configured to store: a transcription of an item of content comprising audio content; and a companion item of textual content, wherein the companion item of textual content comprises: a plurality of paragraphs of body text, and matter other than body text; and a physical computing device in communication with the electronic data store, the physical computing device configured to: identify, in the transcription, a portion of the transcription that includes text also included in a portion of the companion item of textual content; determine a level of correlation between words in the portion of the companion item of textual content and words in the portion of the transcription; determine that the level of correlation satisfies a threshold value; in response to determining that the level of correlation sati
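Illustrative sketch (not the patented implementation): the classification logic recited in claim 1 can be pictured as a threshold test followed by a positional rule. The claim does not say how the "level of correlation" is computed, so the word-overlap fraction below, the 0.8 threshold, and the function names are assumptions chosen for the example. As a further simplification, each paragraph is compared with the vocabulary of the whole transcription rather than with a located portion of it. Unmatched paragraphs that appear before the earliest matched paragraph are labeled front matter, and those after the last matched paragraph are labeled back matter.

```python
import re
from collections import Counter
from typing import List


def words(text: str) -> List[str]:
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9']+", text.lower())


def correlation(paragraph: str, transcript_vocab: Counter) -> float:
    """Assumed measure: fraction of paragraph words found in the transcription."""
    para = words(paragraph)
    if not para:
        return 0.0
    hits = sum(1 for w in para if transcript_vocab[w] > 0)
    return hits / len(para)


def classify(paragraphs: List[str], transcript: str, threshold: float = 0.8) -> List[str]:
    """Label each paragraph 'body', 'front', 'back', or 'unmatched'."""
    vocab = Counter(words(transcript))
    matched = [correlation(p, vocab) >= threshold for p in paragraphs]
    if not any(matched):
        return ["unmatched"] * len(paragraphs)
    first = matched.index(True)                          # earliest matched paragraph
    last = len(matched) - 1 - matched[::-1].index(True)  # last matched paragraph
    labels = []
    for i, ok in enumerate(matched):
        if ok:
            labels.append("body")
        elif i < first:
            labels.append("front")      # unmatched and before all matched text
        elif i > last:
            labels.append("back")       # unmatched and after all matched text
        else:
            labels.append("unmatched")  # unmatched text inside the body range
    return labels


book = [
    "Table of Contents",
    "Chapter One: The Storm",
    "The rain fell hard on the old tin roof.",
    "She closed the shutters and waited.",
    "Index: rain, 1; shutters, 2",
]
transcript = "the rain fell hard on the old tin roof she closed the shutters and waited"
print(classify(book, transcript))
# ['front', 'front', 'body', 'body', 'back']
```

The labels for body paragraphs would then feed the content synchronization information, while front and back matter are recorded as having no corresponding audio. A vocabulary-overlap measure ignores word order, so a practical system would likely need an order-aware comparison against a specific portion of the transcription.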

Assignees

Inventors

Classifications

  • G10L15/26 · Primary

    Speech to text systems (G10L15/08 takes precedence) · CPC title

  • Multimedia presentations, e.g. slide shows, multimedia albums · CPC title

  • G10L15/183 · Primary

    using context dependencies, e.g. language models · CPC title

  • Physics · mapped topic

  • Electricity · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10109278B2 cover?
A content alignment service is described that may generate content synchronization information to facilitate the synchronous presentation of corresponding audio content and textual content. In some embodiments, portions of body text (as opposed to front matter, such as a table of contents; or back matter, such as an index) in the textual content are identified and synchronized with correspondin…
Who is the assignee on this patent?
Dzik Steven C, Story Jr Guy A, Audible Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/26. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 23, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).