Publishing a disparate live media output stream manifest that includes one or more media segments corresponding to key events
US-2024340474-A1 · Oct 10, 2024 · US
US12266178B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12266178-B2 |
| Application number | US-202117334971-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 31, 2021 |
| Priority date | Nov 6, 2020 |
| Publication date | Apr 1, 2025 |
| Grant date | Apr 1, 2025 |
Official abstract text for this publication.
Methods, devices, and a non-transitory computer-readable storage medium are provided for determining a video cover image. The method includes: obtaining a candidate image set containing a plurality of image frames to be processed by determining the plurality of image frames to be processed from a video to be processed, each of the plurality of image frames to be processed containing at least one target object; obtaining a target score of each of the plurality of image frames to be processed by inputting the plurality of image frames to be processed in the candidate image set into a scoring network; and sorting the target scores of the plurality of image frames to be processed according to a set order to obtain a sorting result, and determining a video cover image of the video to be processed from the plurality of image frames to be processed according to the sorting result.
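The score-then-sort selection described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `rank_frames` and `pick_cover` are hypothetical, the scoring network is stood in for by an arbitrary `score_fn` callable, and the "set order" is assumed to be descending with the top-ranked frame chosen as the cover.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def rank_frames(
    frames: Sequence[Frame],
    score_fn: Callable[[Frame], float],
    descending: bool = True,
) -> List[Tuple[float, Frame]]:
    """Score every candidate frame and sort the (score, frame) pairs.

    `score_fn` is a placeholder for the scoring network; the abstract
    does not specify its interface.
    """
    scored = [(score_fn(frame), frame) for frame in frames]
    # Sort on the score only, so frames themselves never need to be comparable.
    return sorted(scored, key=lambda pair: pair[0], reverse=descending)


def pick_cover(frames: Sequence[Frame], score_fn: Callable[[Frame], float]) -> Frame:
    """Return the top-ranked frame (assumed selection rule)."""
    if not frames:
        raise ValueError("candidate image set is empty")
    return rank_frames(frames, score_fn)[0][1]
```

For example, with frames represented as `(name, precomputed_score)` tuples, `pick_cover(frames, lambda f: f[1])` returns the tuple with the largest score.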
Opening claim text (preview).
What is claimed is:

1. A method for determining a video cover image, comprising:
obtaining a candidate image set containing a plurality of image frames to be processed by determining the plurality of image frames to be processed from a video to be processed, each of the plurality of image frames to be processed containing at least one target object;
inputting each of the plurality of image frames to be processed into an image scoring network to obtain a black edge size of each of the plurality of image frames to be processed, a brightness score of each of the plurality of image frames to be processed, and a definition score of each of the plurality of image frames to be processed, wherein the image scoring network is obtained based on neural network training;
weighting the black edge size of each of the plurality of image frames to be processed, the brightness score of each of the plurality of image frames to be processed, and the definition score of each of the plurality of image frames to be processed;
summing the weighted black edge size of a picture where each of the plurality of image frames to be processed is located, the weighted brightness score of each of the plurality of image frames to be processed, and the weighted definition score of each of the plurality of image frames to be processed to obtain an image feature score of each of the plurality of image frames to be processed, wherein image features of an image frame to be processed comprise: a black edge, a brightness, and a definition, and the black edge is a black part except picture content in the image frame to be processed;
inputting each of the plurality of image frames to be processed into an object scoring network to obtain a number of person images in each of the plurality of image frames to be processed, a location of a person image in the image frame to be processed, a size of the person image, a definition score of the person image, an eye state score of a person in the person image, an expression score of the person in the person image, and a pose score of the person in the person image, wherein the object scoring network is obtained based on neural network training;
obtaining an object feature score of each of the plurality of image frames to be processed based on the number of the person images in each of the plurality of image frames to be processed, the location of the person image in the image frame to be processed, the size of the person image, the definition score of the person image, the eye state score of the person in the person image, the expression score of the person in the person image, and the pose score of the person in the person image;
inputting each of the plurality of image frames to be processed into an aesthetic scoring network to obtain a composition score of each of the image frames to be processed and a color richness score of each of the plurality of image frames to be processed, wherein the aesthetic scoring network is obtained based on neural network training;
obtaining an aesthetic feature score of each of the plurality of image frames to be processed based on the composition score of each of the plurality of image frames to be processed and the color richness score of each of the plurality of image frames to be processed;
obtaining the target score of each of the plurality of image frames to be processed based on the image feature score, the object feature score, and the aesthetic feature score; and
sorting a plurality of target scores of the plurality of image frames to be processed according to a set order to obtain a sorting result, and determining the video cover image of the video to be processed from the plurality of image frames to be processed according to the sorting result.

2. The method of claim 1, wherein obtaining the target score of each of the plurality of image frames to be processed based on the image feature score, the object feature score, and the aesthetic feature score comprises:
obtaining a weighted image feature score, a weighted object feature score, and a weighted aesthetic feature score by respectively weighting the image feature score, the object feature score, and the aesthetic feature score of each of the plurality of image frames to be processed; and
obtaining the target score of each of the plurality of image frames to be processed by summing the weighted image feature score, the weighted object feature score, and the weighted aesthetic feature score.

3. The method of claim 1, wherein a number of the plurality of image frames is M, M being a positive integer; and wherein the method further comprises:
obtaining N image frames by performing frame extraction on the video to be processed according to a set time interval,
wherein determining the plurality of image frames to be processed from the video to be processed comprises: determining the plurality of image frames to be processed containing the target object from the N image frames, N being a positive integer greater than or equal to M.

4. The method of claim 1, further comprising:
extracting one or more image frames from the plurality of image frames to be processed by performing matching based on a filtering rule according to a filtering model, wherein the one or more image frames extracted are not matched with image frames to be filtered contained in the filtering rule, and
wherein obtaining the target score of each of the plurality of image frames to be processed by inputting the plurality of image frames to be processed in the candidate image set into the scoring network comprises: obtaining the target score of each of the one or more image frames that are not matched with the image frames to be filtered contained in the filtering rule and extracted from the plurality of image frames to be processed by inputting the image frame into the scoring network.

5. A device, comprising:
a processor; and
a memory configured to store instructions executable by the processor,
wherein the processor is configured to execute the instructions to:
obtain a candidate image set containing a plurality of image frames to be processed by determining the plurality of image frames to be processed from a video to be processed, each of the plurality of image frames to be processed containing at least one target object;
input each of the plurality of image frames to be processed into an image scoring network to obtain a black edge size of each of the plurality of image frames to be processed, a brightness score of each of the plurality of image frames to be processed, and a definition score of each of the plurality of image frames to be processed, wherein the image scoring network is obtained based on neural network training;
weight the black edge size of each of the plurality of image frames to be processed, the brightness score of each of the plurality of image frames to be processed, and the definition score of each of the plurality of image frames to be processed;
sum the weighted black edge size of a picture where each of the plurality of image frames to be processed is located, the weighted brightness score of each of the plurality of image frames to be processed, and the weighted definition score of each of the plurality of image frames to be processed to obtain an image feature score of each of the plurality of image frames to be processed, wherein image features of an image frame to be processed comprise: a black edge, a brightness, and a definition, and the black edge is a black part except picture content in the image frame to be processed;
input each of the plurality of image frames to be processed into an object scoring network to obtain a number of person images in each of the plurality of image frames to be processed, a location of a person image in the image frame to be processed, a size of the person image, a defi
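The weighting-and-summing steps recited in claims 1 and 2, and the interval-based frame extraction of claim 3, can be sketched as below. This is only an illustration under stated assumptions: all function names are hypothetical, the weight values are invented placeholders (the claims do not give any), and the negative weight on black edge size reflects an assumption that larger black borders should lower the score.

```python
import math
from typing import List, Sequence


def weighted_sum(values: Sequence[float], weights: Sequence[float]) -> float:
    """Multiply each value by its weight and add the products."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


def image_feature_score(
    black_edge_size: float,
    brightness: float,
    definition: float,
    weights: Sequence[float] = (-1.0, 0.5, 0.5),  # assumed values, not from the claims
) -> float:
    """Claim 1: weight and sum black edge size, brightness, and definition."""
    return weighted_sum((black_edge_size, brightness, definition), weights)


def target_score(
    image_score: float,
    object_score: float,
    aesthetic_score: float,
    weights: Sequence[float] = (0.3, 0.4, 0.3),  # assumed values, not from the claims
) -> float:
    """Claim 2: weight and sum the three feature scores."""
    return weighted_sum((image_score, object_score, aesthetic_score), weights)


def sample_timestamps(duration_s: float, interval_s: float) -> List[float]:
    """Claim 3: timestamps (seconds) at which to extract the N frames.

    Sampling starts at 0 and stops before the end of the video; whether
    the patent samples the boundary frames is not stated, so this is an
    assumption.
    """
    if duration_s <= 0 or interval_s <= 0:
        raise ValueError("duration and interval must be positive")
    count = math.ceil(duration_s / interval_s)
    return [k * interval_s for k in range(count)]
```

A frame with no black border, brightness 0.8, and definition 0.6 would get an image feature score of about 0.7 under these placeholder weights; the M candidate frames are then whichever of the N sampled frames contain the target object.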
Matching criteria, e.g. proximity measures · CPC title
in form of a video summary, e.g. the video summary being a video sequence, a composite still image or having synthesized frames · CPC title
Blocking scenes or portions of the received content, e.g. censoring scenes · CPC title
involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream (arrangements characterised by components specially adapted for monitoring, identification or recognition of video in broadcast systems H04H60/59) · CPC title
involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations · CPC title