Intelligent automated content summary generation

US12470784B2 · US · B2

Patent metadata
Field                Value
Publication number   US-12470784-B2
Application number   US-202218569040-A
Country              US
Kind code            B2
Filing date          Jun 23, 2022
Priority date        Jun 28, 2021
Publication date     Nov 11, 2025
Grant date           Nov 11, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A content summarizing program processes audio and/or visual content of a media item to determine a plurality of segments of the content; and determines a respective ranking for each of the plurality of segments. The respective ranking indicates how representative of the content as a whole the segment is. Based on the plurality of respective rankings and summary time criteria, the content summarizing program selects summary segments from the plurality of segments; and, based on the summary segments, generates a segment transition. The segment transition corresponds to a pair of adjacent summary segments and comprises characteristics of both first and second summary segments of the pair of adjacent summary segments. The content summarizing program stitches together the summary segments using the segment transition to generate a content summary. The summary segments are stitched together with the segment transition disposed between the first summary segment and the second summary segment.
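The selection and stitching steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Segment` class, the greedy selection rule, the convention that rank 1 is the most representative, and the linear crossfade are all assumptions made for illustration. In particular, the crossfade is a simple stand-in for the generated segment transition, and only a maximum summary length is modeled as the summary time criterion.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical segment record; times are seconds into the media item."""
    start: float
    end: float
    rank: int  # assumed convention: 1 = most representative of the whole

    @property
    def duration(self) -> float:
        return self.end - self.start


def select_summary_segments(segments, max_seconds):
    """Greedily keep the best-ranked segments that fit within max_seconds."""
    chosen = []
    total = 0.0
    for seg in sorted(segments, key=lambda s: s.rank):
        if total + seg.duration <= max_seconds:
            chosen.append(seg)
            total += seg.duration
    # Restore playback order so adjacent summary segments can be stitched.
    return sorted(chosen, key=lambda s: s.start)


def crossfade(tail, head):
    """Blend two equal-length sample lists: tail fades out while head fades in."""
    if len(tail) != len(head):
        raise ValueError("tail and head must have the same length")
    n = len(tail)
    blended = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # weight of the incoming segment, between 0 and 1
        blended.append(tail[i] * (1 - w) + head[i] * w)
    return blended
```

A caller would pass the ranked segments and a time budget to `select_summary_segments`, then apply `crossfade` to the overlapping margins of each adjacent pair before concatenating the result.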

First claim

Opening claim text (preview).

That which is claimed:

1. An apparatus comprising at least one processor and at least one memory storing computer executable instructions, the at least one memory and computer executable instructions configured to, with the at least one processor, cause the apparatus to at least: provide a media item comprising audio and/or visual content in a computer-readable format as an input to a content summarizing program, wherein at least one of (a) the content summarizing program comprises at least one neural network or (b) the content summarizing program is configured to call at least one neural network; operate the content summarizing program to: process the audio and/or visual content to determine a plurality of segments of the audio and/or visual content; determine a plurality of respective rankings, each respective ranking corresponding to one segment of the plurality of segments, wherein the respective ranking indicates how representative of the audio and/or visual content as a whole the segment is, and wherein determining the plurality of respective rankings comprises: identifying unique segments that are dissimilar to all other segments based on a similarity threshold; identifying one or more groups of repeated segments, wherein two or more segments of a group of repeated segments are substantially similar to one another based on the similarity threshold; and ranking each group of the one or more groups of repeated segments based at least in part on a number of repeated segments within the group, such that a first group of repeated segments including a greater number of repeated segments than a second group of repeated segments is ranked higher than the second group of repeated segments, wherein a first repeated segment of the first group of repeated segments is not substantially similar to a second repeated segment of a second group of repeated segments; based on the plurality of respective rankings and summary time criteria, select summary segments from the plurality of segments; based on the summary segments, generate a segment transition, wherein the segment transition corresponds to a pair of adjacent summary segments and comprises characteristics of both a first summary segment of the pair of adjacent summary segments and a second summary segment of the pair of adjacent summary segments; and stitch together the summary segments using the segment transition to generate a content summary, wherein the summary segments are stitched together such that the segment transition is disposed between the first summary segment and the second summary segment, the plurality of segments are determined to include stitching margins, a trailing stitching margin of a first segment of the plurality of segments overlaps, at least in part, a leading stitching margin of a second segment of the plurality of segments, the second segment follows the first segment in the audio and/or visual content, and generating the segment transition comprises adjusting a rhythm signature of at least one of the trailing stitching margin of the first summary segment or of the leading stitching margin of the second summary segment; and receive the content summary as output from the content summarizing program.

2. The apparatus of claim 1, wherein the audio and/or visual content has a temporal length and each segment of the plurality of segments corresponds to a respective portion of the temporal length.

3. The apparatus of claim 1, wherein the summary time criteria comprise at least one of (a) a minimum time length of the content summary or (b) a maximum time length of the content summary.

4. The apparatus of claim 1, wherein the first summary segment is not substantially similar to any other of the summary segments based on the similarity threshold.

5. The apparatus of claim 1, wherein the content summarizing program comprises a long short term memory (LSTM) network configured to receive the audio and/or visual content at an input layer thereof, process the audio and/or visual content to determine the plurality of segments, and provide information identifying the plurality of segments at an output layer thereof.

6. The apparatus of claim 1, wherein the content summarizing program comprises a generative adversarial network configured to generate the segment transition based on at least a portion of the first summary segment and at least a portion of the second summary segment.

7. The apparatus of claim 1, wherein the content summarizing program comprises a ranking network configured to determine the plurality of respective rankings based on processing the plurality of segments.

8. The apparatus of claim 7, wherein one of (a) the ranking network is configured to receive a genre associated with the audio and/or visual content as input or (b) is selected from one or more ranking networks by the content summarizing program based on the genre associated with the audio and/or visual content.

9. The apparatus of claim 8, wherein the genre associated with the audio and/or visual content is determined by (a) analyzing the audio and/or visual content with a genre identification network or (b) reading the genre from meta data associated with the audio and/or visual content.

10. The apparatus of claim 7, wherein one of (a) the ranking network is configured to receive user profile information corresponding to an intended audience of the content summary as input or (b) is selected from one or more ranking networks by the content summarizing program based on the user profile information corresponding to the intended audience of the content summary.

11. The apparatus of claim 1, wherein the at least one memory and the computer executable instructions are further configured to, with the at least one processor, cause the apparatus to at least one of (a) store the content summary in the at least one memory in an audio and/or visual file format, (b) transmit the content summary such that the content summary is received by at least one of a user computing entity or a system computing entity, or (c) cause a user interface of a computing entity to provide the content summary in a human perceivable format.

12. The apparatus of claim 1, wherein adjusting the rhythm signature of the trailing stitching margin of the first summary segment comprises synchronizing at least one of a beat or rhythm signature of the audio and/or visual content within the trailing stitching margin with the at least one of the beat or rhythm signature of the audio and/or visual content in the leading stitching margin and generating the segment transition further comprises reducing an intensity of the trailing stitching margin while increasing an intensity of the leading stitching margin.

13. A method for automated generation and provision of a content summary for a media item, the method comprising: providing, by one or more processors, the media item comprising audio and/or visual content in a computer-readable format as an input to a content summarizing program, wherein at least one of (a) the content summarizing program comprises at least one neural network or (b) the content summarizing program is configured to call at least one neural network; operating, by the one or more processors, the content summarizing program to: process the audio and/or visual content to determine a plurality of segments of the audio and/or visual content; determine a plurality of respective rankings, each respective ranking corresponding to one segment of the plurality of segments, wherein the respective ranking indicates how representative of the audio and/or visual content as a whole the segment is, and wherein det
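The ranking step recited in claim 1 (unique segments, groups of repeated segments, and larger groups ranked higher) can be sketched as follows. This is an illustrative approximation only: the claims do not specify a similarity measure or a grouping procedure, so the cosine similarity over per-segment feature vectors, the greedy comparison against each group's first member, and the function names are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def group_segments(features, threshold):
    """Greedy grouping: a segment joins the first group whose first member
    is at least `threshold` similar to it; otherwise it starts a new group.
    Returns lists of segment indices."""
    groups = []
    for idx, vec in enumerate(features):
        for group in groups:
            if cosine_similarity(features[group[0]], vec) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


def rank_groups(groups):
    """Split groups into repeated groups (ranked by size, largest first)
    and the indices of unique segments."""
    repeated = sorted((g for g in groups if len(g) > 1), key=len, reverse=True)
    unique = [g[0] for g in groups if len(g) == 1]
    return repeated, unique
```

Comparing only against each group's first member keeps the sketch short; a stricter reading of "dissimilar to all other segments" would compare every pair of segments.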

Assignees

Univ Georgia State Res Found

Inventors

Classifications

  • by decomposing the content in the time domain, e.g. in time segments

  • using neural networks, e.g. processing the feedback provided by the user

  • involving end-user characteristics, e.g. viewer profile, preferences (monitoring of user activities for profile generation for accessing a video database G06F16/739; user profiles in network data switching protocols H04L67/306; processing of user preferences or user profiles in wireless networks H04W8/18)

  • Activation functions

  • Generative networks

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12470784B2 cover?
A content summarizing program processes audio and/or visual content of a media item to determine a plurality of segments of the content; and determines a respective ranking for each of the plurality of segments. The respective ranking indicates how representative of the content as a whole the segment is. Based on the plurality of respective rankings and summary time criteria, the content summar…
Who is the assignee on this patent?
Univ Georgia State Res Found
What technology area does this patent fall under?
Primary CPC classification H04N21/8549. Mapped technology areas include Electricity.
When was this patent published?
Publication date Nov 11, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).