Optimizing insertion points for content based on audio and video characteristics

US12413830B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12413830-B2
Application number  US-202218041794-A
Country             US
Kind code           B2
Filing date         Dec 13, 2022
Priority date       Dec 13, 2022
Publication date    Sep 9, 2025
Grant date          Sep 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One embodiment of the present invention sets forth a technique for inserting content into a media program. The technique includes determining a plurality of markers corresponding to a plurality of locations within a media program. The technique also includes for each marker included in the plurality of markers, automatically analyzing a first set of intervals within the media program that lead up to the marker and a second set of intervals within the media program that immediately follow the marker and determine a set of audio characteristics associated with the first set of intervals and the second set of intervals. The technique further includes generating a plurality of scores for the plurality of markers based on the set of audio characteristics for each marker and inserting additional content at one or more markers included in the plurality of markers based on the plurality of scores.
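
The flow described in the abstract (markers, intervals on each side of a marker, audio characteristics, scores, insertion) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the RMS loudness measure, the scoring formula that favors quiet surroundings, the interval length, and every function name below are hypothetical choices made for the example.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of audio samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def score_marker(audio: Sequence[float], marker: int, interval: int) -> float:
    """Score one marker from the loudness just before and just after it.

    Assumed heuristic (not from the patent text): quieter surroundings
    make a less disruptive break, so the score is
    1 / (1 + loudness_before + loudness_after).
    """
    before = audio[max(0, marker - interval):marker]  # interval leading up to the marker
    after = audio[marker:marker + interval]           # interval immediately following it
    return 1.0 / (1.0 + rms(before) + rms(after))


def pick_insertion_points(audio: Sequence[float], markers: List[int],
                          interval: int, k: int) -> List[int]:
    """Return the k best-scoring markers, highest score first."""
    scored = [(score_marker(audio, m, interval), m) for m in markers]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:k]]


def insert_content(audio: List[float], extra: List[float], marker: int) -> List[float]:
    """Splice the additional content into the program at the chosen marker."""
    return audio[:marker] + extra + audio[marker:]


# Toy program: loud, then silent, then loud again (sample indices, not seconds).
audio = [1.0] * 100 + [0.0] * 100 + [1.0] * 100
best = pick_insertion_points(audio, markers=[50, 150, 250], interval=20, k=1)
print(best)  # [150], the marker surrounded by silence
```

In this toy run the marker at sample 150 wins because both of its neighboring intervals are silent. A real system would presumably use a perceptual loudness measure and additional characteristics rather than raw RMS alone.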

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for inserting additional content into a media program, the method comprising: determining a plurality of markers corresponding to a plurality of locations within the media program; for each marker included in the plurality of markers, performing one or more operations to automatically analyze a first set of intervals within the media program that lead up to the marker and a second set of intervals within the media program that immediately follow the marker and determine a set of characteristics associated with the first set of intervals and the second set of intervals; generating a plurality of scores for the plurality of markers based on the sets of characteristics; and inserting the additional content at one or more markers included in the plurality of markers based on the plurality of scores; wherein generating the plurality of scores comprises computing a score for a first marker included in the plurality of markers based on a first characteristic determined for a first interval within the media program that leads up to the first marker and a second characteristic determined for a second interval within the media program that immediately follows the first marker.

2. The computer-implemented method of claim 1, further comprising determining one or more marker treatments associated with the one or more markers based on the sets of characteristics determined for the one or more markers, wherein the one or more marker treatments comprise at least one of a fade-in or a fade-out.

3. The computer-implemented method of claim 1, wherein generating the plurality of scores comprises computing an overall score for the first marker included in the plurality of markers based on a weighted combination of a subset of the plurality of scores determined based on the set of characteristics for the first marker.

4. The computer-implemented method of claim 1, further comprising: generating a ranking for the plurality of markers based on the plurality of scores; and determining the one or more markers based on the ranking.

5. The computer-implemented method of claim 1, wherein the first and second characteristics determined for the first marker included in the plurality of markers comprises at least one of an audio loudness level or a presence or absence of speech.

6. The computer-implemented method of claim 1, wherein the first and second characteristics determined for the first marker included in the plurality of markers comprises a first set of audio spectrum characteristics determined for the first interval leading up to the first marker and a second set of audio spectrum characteristics determined for the second interval immediately following the first marker.

7. The computer-implemented method of claim 6, wherein the plurality of scores comprises one or more scores representing one or more similarities between the first set of audio spectrum characteristics and the second set of audio spectrum characteristics.

8. The computer-implemented method of claim 1, wherein the set of characteristics include a set of video characteristics, and wherein generating the plurality of scores for the plurality of markers is based on the set of video characteristics determined for one or more portions of the media program proximate to each marker included in the plurality of markers.

9. The computer-implemented method of claim 8, wherein the set of video characteristics comprises at least one of a luminance level, a measurement of dispersion in scenes, a spatial activity, a temporal activity, or a semantic characteristic.

10. One or more non-transitory computer-readable media storing instructions that, when executed by one or more processors, cause the one or more processors to perform: determining a plurality of markers corresponding to a plurality of locations within a media program; for each marker included in the plurality of markers, performing one or more operations to automatically analyze a first set of intervals within the media program that lead up to the marker and a second set of intervals within the media program that immediately follow the marker and determine a set of characteristics associated with the first set of intervals and the second set of intervals; generating a plurality of scores for the plurality of markers based on the sets of characteristics; and inserting additional content at one or more markers included in the plurality of markers based on the plurality of scores; wherein generating the plurality of scores comprises computing a score for a first marker included in the plurality of markers based on a first characteristic determined for a first interval within the media program that leads up to the first marker and a second characteristic determined for a second interval within the media program that immediately follows the first marker.

11. The one or more non-transitory computer-readable media of claim 10, wherein generating the plurality of scores comprises: inputting the first and second characteristics for the first marker included in the plurality of markers into a machine learning model; and wherein computing the score for the first marker is an output of the machine learning model.

12. The one or more non-transitory computer-readable media of claim 10, wherein performing one or more operations to automatically analyze the first set of intervals and the second set of intervals comprises: determining a first set of audio spectrum characteristics for the first interval leading up to the first marker and a second set of audio spectrum characteristics for the second interval immediately following the first marker; and computing a set of pairwise similarities between the first set of spectrum characteristics and the second set of spectrum characteristics.

13. The one or more non-transitory computer-readable media of claim 12, wherein the first set of audio spectrum characteristics and the second set of audio spectrum characteristics are determined based on a window size that is shorter than the first interval and the second interval.

14. The one or more non-transitory computer-readable media of claim 12, wherein generating the plurality of scores comprises an aggregation of the set of pairwise similarities.

15. The one or more non-transitory computer-readable media of claim 12, wherein the first set of audio spectrum characteristics and the second set of audio spectrum characteristics comprise at least one of a mel-scale spectrogram, a mel-frequency ceptrum coefficient, or a tempogram.

16. The one or more non-transitory computer-readable media of claim 10, wherein the first and second characteristics determined for the first marker included in the plurality of markers comprises a first audio loudness level associated with the first interval leading up to the first marker, a second audio loudness level associated with the second interval immediately following the first marker, a first detection of speech within a third interval leading up to the first marker, and a second detection of speech within a fourth interval immediately following the first marker.

17. The one or more non-transitory computer-readable media of claim 10, wherein the first and second characteristics determined for the first marker included in the plurality of markers comprises an indication of semantic continuity across the first marker.

18. The one or more non-transitory computer-readable media of claim 10, wherein the set of characteristics include a set of video characteristics, and wherein the generating t
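
Claims 12 to 14 (windowed spectrum characteristics, pairwise similarities, aggregation) together with claims 3 and 4 (weighted combination, ranking) can be illustrated with a small standard-library sketch. It rests on several assumptions that are not in the claim text: a naive DFT magnitude spectrum stands in for the mel-scale spectrogram, MFCC, or tempogram features named in claim 15; cosine similarity and a mean are used as the pairwise similarity and the aggregation; and the function names, the "quiet" score, and the weights are hypothetical.

```python
import cmath
import math
from typing import Dict, List, Sequence


def magnitude_spectrum(window: Sequence[float]) -> List[float]:
    """Naive DFT magnitudes for bins 0..N/2 (a stand-in for mel/MFCC features)."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(window)))
        for k in range(n // 2 + 1)
    ]


def windowed_features(interval: Sequence[float], window_size: int) -> List[List[float]]:
    """One spectrum per non-overlapping window; windows are shorter than the interval."""
    return [
        magnitude_spectrum(interval[i:i + window_size])
        for i in range(0, len(interval) - window_size + 1, window_size)
    ]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def spectral_similarity(before: Sequence[float], after: Sequence[float],
                        window_size: int) -> float:
    """Mean of all pairwise similarities between windows before and after a marker."""
    pairs = [cosine(u, v)
             for u in windowed_features(before, window_size)
             for v in windowed_features(after, window_size)]
    return sum(pairs) / len(pairs) if pairs else 0.0


def overall_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted combination over the subset of scores named in `weights`."""
    return sum(weights[name] * scores[name] for name in weights)


def rank_markers(per_marker: Dict[int, Dict[str, float]],
                 weights: Dict[str, float]) -> List[int]:
    """Markers ordered by overall score, best first."""
    return sorted(per_marker, key=lambda m: overall_score(per_marker[m], weights),
                  reverse=True)


def tone(cycles: int, n: int = 32, window: int = 16) -> List[float]:
    """Sine wave with a whole number of cycles per analysis window."""
    return [math.sin(2 * math.pi * cycles * t / window) for t in range(n)]


same = spectral_similarity(tone(2), tone(2), window_size=16)  # close to 1.0
diff = spectral_similarity(tone(2), tone(5), window_size=16)  # close to 0.0

# Hypothetical per-marker scores and weights, for illustration only.
per_marker = {100: {"spectral": same, "quiet": 0.2},
              200: {"spectral": diff, "quiet": 0.9}}
ranking = rank_markers(per_marker, weights={"spectral": 0.5, "quiet": 0.5})
```

Whether high or low spectral similarity across a marker should raise its score is a design decision the excerpt above does not settle; the weights here simply treat both inputs as "higher is better". Per claim 11, the combination step could instead be the output of a machine learning model fed with the same characteristics.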

Assignees

Disney Entpr Inc, Beijing Hulu Software Tech Development Co Ltd, Beijing Yojaja Software Tech Development Co Ltd

Inventors

Classifications

  • characterized by learning algorithms · CPC title

  • involving splicing one content stream with another content stream, e.g. for substituting a video clip · CPC title

  • involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams (arrangements characterised by components specially adapted for monitoring, identification or recognition of audio in broadcast systems H04H60/58) · CPC title

  • involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream (arrangements characterised by components specially adapted for monitoring, identification or recognition of video in broadcast systems H04H60/59) · CPC title

  • Processing of audio elementary streams {(monitoring, identification or recognition of audio in broadcast systems H04H60/58)} · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12413830B2 cover?
One embodiment of the present invention sets forth a technique for inserting content into a media program. The technique includes determining a plurality of markers corresponding to a plurality of locations within a media program. The technique also includes for each marker included in the plurality of markers, automatically analyzing a first set of intervals within the media program that lead …
Who is the assignee on this patent?
Disney Entpr Inc, Beijing Hulu Software Tech Development Co Ltd, Beijing Yojaja Software Tech Development Co Ltd
What technology area does this patent fall under?
Primary CPC classification H04N21/44008. Mapped technology areas include Electricity.
When was this patent published?
Publication date Sep 9, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).