Computerized system and method for automatically generating high-quality digital content thumbnails from digital video

US9972360B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9972360-B2
Application number  US-201615250990-A
Country             US
Kind code           B2
Filing date         Aug 30, 2016
Priority date       Aug 30, 2016
Publication date    May 15, 2018
Grant date          May 15, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed are systems and methods for improving interactions with and between computers in content searching, generating, hosting and/or providing systems supported by or configured with personal computing devices, servers and/or platforms. The systems interact to identify and retrieve data within or across platforms, which can be used to improve the quality of data used in processing interactions between or among processors in such systems. The disclosed systems and methods automatically generate a thumbnail image from a frame of a video file, where the thumbnail image displays content of a selected frame determined to be high-quality and highly-relevant to the content of the video file. Frames of a video file are analyzed, and the frame that is the most contextually relevant to the video and of the highest visual quality is selected, where a thumbnail image is generated and displayed on a site or application over a network.

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising:
    identifying, via a computing device, a video, said video comprising a plurality of video frames each displaying an image;
    in response to identifying said video, automatically, via the computing device, performing frame filtering on the video by parsing, via the computing device, each of said video frames and identifying, based on said parsing, content of each image of each frame, said frame filtering further comprising determining, based on the content of each frame, a type of each frame, said frame filtering comprising identifying a first set of frames from said plurality of video frames based on said type of each frame;
    analyzing, via the computing device, said first set of frames by applying de-duplication software, said application comprising the computing device executing said de-duplication software, and based on said execution of the de-duplication software, determining duplicate frames within said first set of frames, said analysis further comprising, based on said duplicate frame determination, discarding said duplicate frames from said first set;
    computing, via the computing device, a stillness value for the remaining frames in the first set, said stillness value comprising an indication of motion energy present within content of the frames remaining in the first set;
    identifying, via the computing device, a second set of frames by extracting, via the computing device, keyframes from the remaining frames in said first set based on said stillness value, said extracted keyframes having a stillness value satisfying a threshold for motion energy, said extracted keyframes being a non-redundant subset of the video frames;
    determining, via the computing device, a relevance value for each frame in the second set, said relevance value determination comprising analyzing, via the computing device, aesthetic features of each frame in the second set and determining, based on said analysis, an aesthetic value for each frame, said determination further comprising clustering the frames in the second set based on a statistical gap between aesthetic values of the frames, and identifying a third frame set by selecting a frame from each cluster that has a highest aesthetic score;
    determining, via the computing device, a quality value of the frames in the third set based on a ranking of the aesthetic scores in the third set; and
    generating, via the computing device, a thumbnail image by selecting a frame in the third set having the highest aesthetic score, said thumbnail image comprising content of said selected frame.

2. The method of claim 1, further comprising:
    automatically displaying said generated thumbnail image on a page as a representation of the video.

3. The method of claim 1, wherein said frame filtering further comprises:
    identifying frames within said plurality that have a type associated with at least one of a low-quality frame or a transition frame; and
    generating said first set of frames while excluding said low-quality and transition frames, wherein said first set of frames comprises all of the video frames except those identified as low-quality or transition frames.

4. The method of claim 3, wherein said low-quality frames comprise content that is dark, blurry or uniform-colored.

5. The method of claim 4, wherein said dark content is determined by computing a relative luminance, said luminance computation comprising:
    $\mathrm{Luminance}(I_{rgb}) = 0.2126\,I_r + 0.7152\,I_g + 0.0722\,I_b$,
    wherein rgb refers to RGB color space, and wherein said computed luminance is subject to a thresholding analysis based on an empirically selected value.

6. The method of claim 4, wherein said blurry content is determined by computing a sharpness value, said sharpness computation comprising:
    $\mathrm{Sharpness}(I_{gray}) = \sqrt{(\Delta_x I_{gray})^2 + (\Delta_y I_{gray})^2}$  (Eq. 2).

7. The method of claim 4, wherein said uniform-colored content is determined by frame filtering steps comprising:
    computing a normalized intensity histogram for image content of a frame resulting in values of intensity;
    sorting said values in descending order;
    computing a cumulative distribution at top percentage bins; and
    thresholding said cumulative distribution based on an empirically selected value.

8. The method of claim 3, wherein said transition frames are determined based on said computing device executing software defined by a shot boundary detection algorithm which determines transition frames from its input.

9. The method of claim 1, wherein said second set of frames is further based on an analysis comprising:
    determining a set of shots from said first set of frames by applying software defined by a k-means algorithm on said first set of frames;
    clustering the frames from the first set based on said k-means determination; and
    analyzing said clustered frames and identifying a continuous block of frames within a single cluster;
    wherein said stillness value is determined in accordance with said continuous block of frames.

10. The method of claim 1, wherein said quality value determination is based on the stillness value of each frame in the third set.

11. The method of claim 1, wherein said quality value determination further comprises:
    extracting said aesthetic features of each frame in the third set; and
    applying software defined by a random forest regression model and determining a quality score for each frame, wherein said selected frame has a highest quality score.

12. The method of claim 1, further comprising:
    determining a context of the thumbnail image, said context being in accordance with said content of said selected frame;
    causing communication, over the network, of said context to an advertisement platform to obtain digital advertisement content associated with said context; and
    displaying a digital content item comprising said digital advertisement content in accordance with said thumbnail image.

13. A non-transitory computer-readable storage medium tangibly encoded with computer-executable instructions, that when executed by a processor associated with a computing device, performs a method comprising:
    identifying, via the computing device, a video, said video comprising a plurality of video frames each displaying an image;
    in response to identifying said video, automatically, via the computing device, performing frame filtering on the video by parsing, via the computing device, each of said video frames and identifying, based on said parsing, content of each image of each frame, said frame filtering further comprising determining, based on the content of each frame, a type of each frame, said frame filtering comprising identifying a first set of frames from said plurality of video frames based on said type of each frame;
    analyzing, via the computing device, said first set of frames by applying de-duplication software, said application comprising the computing device executing said de-duplication software, and based on said execution of the de-duplication software, determining duplicate frames within said first set of frames, said analysis further comprising, based on said duplicate frame determination, discarding said duplicate frames from said first set;
    computing, via the computing device, a stillness value for the remaining frames in the first set, said stillness value comprising an indication of motion energy present within content of the frames remaining in the first set;
    identifying, via the computing device, a second set of frames by extracting, via the computing device, keyframes from the remaining frames in said first set based on said stillness value, said extracted keyframes having a still
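
The low-quality checks in claims 4 to 7 (dark, blurry, uniform-colored) can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names, the per-frame averaging of luminance and sharpness, the forward-difference gradient, the 16-bin histogram, and all three threshold values are hypothetical, since the claims only say the thresholds are empirically selected.

```python
from math import sqrt

# Luminance weights quoted in claim 5.
R_W, G_W, B_W = 0.2126, 0.7152, 0.0722


def to_gray(rgb):
    """Convert rows of (r, g, b) floats in [0, 1] to relative luminance."""
    return [[R_W * r + G_W * g + B_W * b for r, g, b in row] for row in rgb]


def mean_luminance(gray):
    """Average luminance over the frame (averaging is an assumption)."""
    values = [v for row in gray for v in row]
    return sum(values) / len(values)


def mean_sharpness(gray):
    """Average gradient magnitude sqrt(dx^2 + dy^2), using forward differences."""
    height, width = len(gray), len(gray[0])
    total = 0.0
    for y in range(height - 1):
        for x in range(width - 1):
            dx = gray[y][x + 1] - gray[y][x]
            dy = gray[y + 1][x] - gray[y][x]
            total += sqrt(dx * dx + dy * dy)
    return total / ((height - 1) * (width - 1))


def top_bin_mass(gray, bins=16, top_fraction=0.25):
    """Share of pixels falling in the most populated histogram bins."""
    counts = [0] * bins
    n = 0
    for row in gray:
        for v in row:
            counts[min(int(v * bins), bins - 1)] += 1
            n += 1
    # Normalized histogram, sorted in descending order as in claim 7.
    hist = sorted((c / n for c in counts), reverse=True)
    k = max(1, int(bins * top_fraction))
    return sum(hist[:k])


def is_low_quality(rgb, dark_t=0.1, blur_t=0.02, uniform_t=0.95):
    """Flag a frame as dark, blurry, or uniform-colored (hypothetical thresholds)."""
    gray = to_gray(rgb)
    return (mean_luminance(gray) < dark_t
            or mean_sharpness(gray) < blur_t
            or top_bin_mass(gray) > uniform_t)
```

A frame is dropped if any one of the three tests fires; in practice the thresholds would need tuning on real video, and frames are assumed to be at least 2 by 2 pixels.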

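The stillness step of claim 1 can be illustrated in a similar way. The claim only states that the stillness value indicates motion energy and that keyframes must satisfy a threshold; the sketch below assumes, purely for illustration, that motion energy is the mean squared difference between neighbouring grayscale frames and that stillness is `1 / (1 + energy)`. The function names and the `min_stillness` value are hypothetical.

```python
def motion_energy(frame_a, frame_b):
    """Mean squared pixel difference between two equally sized grayscale frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += (b - a) ** 2
            count += 1
    return total / count


def stillness_values(frames):
    """Stillness per frame, from the motion energy to its previous and next frame."""
    diffs = [motion_energy(frames[i], frames[i + 1])
             for i in range(len(frames) - 1)]
    values = []
    for i in range(len(frames)):
        # diffs[i - 1] links frame i to its predecessor, diffs[i] to its successor.
        neighbours = diffs[max(i - 1, 0):i + 1]
        energy = sum(neighbours) / len(neighbours) if neighbours else 0.0
        values.append(1.0 / (1.0 + energy))
    return values


def select_keyframes(frames, min_stillness=0.99):
    """Indices of frames whose stillness meets the (assumed) threshold."""
    return [i for i, s in enumerate(stillness_values(frames))
            if s >= min_stillness]
```

With three nearly identical frames followed by an abrupt change, only the frames away from the change pass the threshold, which matches the intuition that thumbnails should come from stable moments rather than from motion or cuts.
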
Assignees

Oath Inc

Classifications

  • G06V20/47 (Primary)

    Detecting features for summarising video content

  • G11B27/34 (Primary)

    Indicating arrangements {(indicating means incorporated in magazine or cassette G11B23/046 and G11B23/0875; indicating measured values in general G01D)}

  • Evaluation of the quality of the acquired pattern

  • for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window

  • Video hosting of uploaded data from client

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9972360B2 cover?
Disclosed are systems and methods for improving interactions with and between computers in content searching, generating, hosting and/or providing systems supported by or configured with personal computing devices, servers and/or platforms. The systems interact to identify and retrieve data within or across platforms, which can be used to improve the quality of data used in processing interacti…
Who is the assignee on this patent?
Oath Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/47. Mapped technology areas include Physics.
When was this patent published?
Publication date May 15, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).