Transcript-based insertion of secondary video content into primary video content

US11049525B2 · US · B2

Patent metadata
Publication number: US-11049525-B2
Application number: US-201916281903-A
Country: US
Kind code: B2
Filing date: Feb 21, 2019
Priority date: Feb 21, 2019
Publication date: Jun 29, 2021
Grant date: Jun 29, 2021

Abstract

Official abstract text for this publication.

Certain embodiments involve transcript-based techniques for facilitating insertion of secondary video content into primary video content. For instance, a video editor presents a video editing interface having a primary video section displaying a primary video, a text-based navigation section having navigable portions of a primary video transcript, and a secondary video menu section displaying candidate secondary videos. In some embodiments, candidate secondary videos are obtained by using target terms detected in the transcript to query a remote data source for the candidate secondary videos. In embodiments involving video insertion, the video editor identifies a portion of the primary video corresponding to a portion of the transcript selected within the text-based navigation section. The video editor inserts a secondary video, which is selected from the candidate secondary videos based on an input received at the secondary video menu section, at the identified portion of the primary video.
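The two transcript-driven steps in the abstract (mapping a selected transcript portion to a portion of the primary video, and turning detected target terms into a query) can be illustrated with a minimal sketch. It assumes the transcript carries per-word start and end times, which the abstract does not state; the names `Word`, `selection_to_time_range`, and `build_query`, and the space-joined query format, are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds into the primary video (assumed word-level timing)
    end: float


def selection_to_time_range(words, first, last):
    """Return (start, end) in seconds for the selected words[first..last], inclusive."""
    if not 0 <= first <= last < len(words):
        raise IndexError("selection is outside the transcript")
    return words[first].start, words[last].end


def build_query(target_terms):
    """Join target terms into one search string, lowercased and de-duplicated.

    The query format is a placeholder; a real data source defines its own.
    """
    return " ".join(dict.fromkeys(term.lower() for term in target_terms))


# Hypothetical transcript fragment with word-level timings.
transcript = [
    Word("we", 0.0, 0.2),
    Word("hiked", 0.2, 0.7),
    Word("the", 0.7, 0.8),
    Word("glacier", 0.8, 1.5),
    Word("yesterday", 1.5, 2.1),
]

portion = selection_to_time_range(transcript, 1, 3)  # "hiked the glacier"
query = build_query(["Glacier", "hiked", "glacier"])
```

Here the selection "hiked the glacier" resolves to the time range from the start of "hiked" to the end of "glacier", which is the portion of the primary video a secondary video would later occupy.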

First claim

Opening claim text (preview).

The invention claimed is:

1. A method in which one or more processing devices performs operations comprising: presenting a video editing interface comprising a primary video section displaying a primary video, a text-based navigation section having selectable portions of a transcript of the primary video that trigger navigation to respective portions of the primary video, and a secondary video menu section displaying candidate secondary videos; selecting a portion of the transcript corresponding to a text-selection input received at the text-based navigation section; identifying a portion of the primary video corresponding to the selected portion of the transcript; detecting target terms in the portion of the transcript; generating a candidate video query based on the detected target terms in the selected portion of the transcript; retrieving candidate secondary videos in response to submitting the candidate video query to one or more data sources; selecting, from the retrieved candidate secondary videos, a secondary video corresponding to a video-selection input received at the secondary video menu section; and inserting the selected secondary video into the primary video to replace the identified portion of the primary video.

2. The method of claim 1, wherein inserting the selected secondary video into the primary video to replace the identified portion of the primary video comprises: identifying a first time stamp, a second time stamp, and a third time stamp that are associated with playback of the primary video, wherein the second time stamp corresponds to the identified portion of the primary video; and performing a playback operation in the video editing interface, wherein the playback operation comprises: rendering first frames from the primary video for display between the first time stamp of the primary video and the second time stamp of the primary video, determining that the selected secondary video has been selected for insertion into the primary video, rendering frames retrieved from the secondary video for display starting at the second time stamp and continuing for a duration between the second time stamp and the third time stamp, and rendering second frames from the primary video for display starting from the third time stamp of the primary video.

3. The method of claim 2, wherein performing the playback operation comprises retrieving the first frames and the second frames from a primary video file that includes the primary video and that lacks any content from the selected secondary video.

4. The method of claim 1, wherein the video-selection input comprises a dragging input that drags a visual representation of the selected secondary video over the selected portion of the transcript in the text-based navigation section.

5. The method of claim 1, the operations further comprising: applying, in the text-based navigation section, selectable recommendation indicators to the detected target terms; receiving a selection of a particular selectable recommendation indicator of the selectable recommendation indicators, wherein the candidate video query is generated responsive to the selection and includes a query parameter that includes or is derived from a detected particular target term corresponding to the particular selectable recommendation indicator; and displaying selectable visual representations of the retrieved candidate secondary videos.

6. The method of claim 1, wherein detecting the target terms in the portion of the transcript comprises: accessing a set of words included in the portion of the transcript; computing a set of target term probabilities for the set of words, wherein, for each word of the set of words, a respective target term probability is computed by performing additional operations comprising: generating a frequency feature vector representing a frequency of the word within the transcript, a sentiment feature vector representing sentiments associated with the word within the transcript, and a part-of-speech feature vector representing syntaxes of the word within the transcript, combining the frequency feature vector, the sentiment feature vector, and the part-of-speech feature vector into a target feature vector for the word, and computing the respective target term probability by applying a recommendation machine-learning model to the target feature vector, wherein the recommendation machine-learning model is trained to associate training target feature vectors with training words tagged as secondary video search terms in training transcripts; and selecting, from the set of words included in the portion of the transcript, the target terms having respective target term probabilities that exceed a threshold probability.

7. The method of claim 1, the operations further comprising obtaining the candidate secondary videos by performing operations; receiving a target term via a search field of the secondary video menu section, wherein the candidate video query further includes a query parameter that includes or is derived from the received target term.

8. The method of claim 1, wherein, subsequent to insertion of the selected secondary video into the primary video, audio content associated with the identified portion of the primary video is playable with the selected secondary video.

9. A system, comprising: one or more processors; and a memory coupled with the one or more processors, the memory configured to store instructions that when executed by the one or more processors cause the one or more processors to: present a video editing interface comprising a primary video section displaying a primary video, a text-based navigation section having selectable portions of a transcript of the primary video that trigger navigation to respective portions of the primary video, and a secondary video menu section displaying candidate secondary videos; select a portion of the transcript corresponding to a text-selection input received at the text-based navigation section; identify a portion of the primary video corresponding to the selected portion of the transcript; detect target terms in the portion of the transcript; generate a candidate video query based on the detected target terms in the selected portion of the transcript; retrieve candidate secondary videos in response to submitting the candidate video query to one or more data sources; select, from the retrieved candidate secondary videos, a secondary video corresponding to a video-selection input received at the secondary video menu section; and insert the selected secondary video into the primary video to replace the identified portion of the primary video.

10. The system of claim 9, wherein inserting the selected secondary video into the primary video to replace the identified portion of the primary video comprises: identifying a first time stamp, a second time stamp, and a third time stamp that are associated with playback of the primary video, wherein the second time stamp corresponds to the identified portion of the primary video; and performing a playback operation in the video editing interface, wherein the playback operation comprises: rendering first frames from the primary video for display between the first time stamp of the primary video and the second time stamp of the primary video, determining that the selected secondary video has been selected for insertion into the primary video, rendering frames retrieved from the secondary video for display starting at the second time stamp and continuing for a duration between the second time stamp and the third time stamp, and rendering second frames from the primary video for display starting from the third time stamp
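The playback operation recited in claims 2, 3, and 10 can be read as a per-frame decision about which file supplies the picture at a given playhead time, with the primary video file itself left unmodified. The sketch below illustrates that reading only. The names `Insertion` and `frame_source` are hypothetical, and starting the secondary video at its own offset zero is an assumption that the claim text does not spell out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Insertion:
    start: float       # second time stamp: where the replaced portion begins
    end: float         # third time stamp: where primary frames resume
    secondary_id: str  # hypothetical identifier of the selected secondary video


def frame_source(t: float, insertion: Optional[Insertion]) -> Tuple[str, float]:
    """Return (source, offset_seconds) for the frame shown at playhead time t.

    Primary frames are always read from the untouched primary file; the
    insertion only redirects rendering inside [start, end).
    """
    if insertion is not None and insertion.start <= t < insertion.end:
        # Assumption: the secondary video plays from its own beginning.
        return insertion.secondary_id, t - insertion.start
    return "primary", t


# Hypothetical edit: replace seconds 4.0 to 9.0 of the primary video.
edit = Insertion(start=4.0, end=9.0, secondary_id="clip-42")

before = frame_source(1.0, edit)   # between the first and second time stamps
during = frame_source(5.5, edit)   # inside the replaced portion
after = frame_source(9.0, edit)    # at the third time stamp, primary resumes
```

Because the edit is stored as time stamps plus a reference to the secondary video, removing it restores the original playback without re-encoding, which is consistent with claim 3's primary video file that "lacks any content from the selected secondary video".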

Assignees

Adobe Inc

Classifications

  • G11B27/11 (Primary)

    by using information not detectable on the record carrier

  • G11B27/34 (Primary)

    Indicating arrangements {(indicating means incorporated in magazine or cassette G11B23/046 and G11B23/0875; indicating measured values in general G01D)}

  • Electronic editing of digitised analogue information signals, e.g. audio or video signals

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G11B27/11. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 29, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).