Hierarchical segmentation of screen captured, screencasted, or streamed video

US11893794B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11893794-B2
Application number: US-202217805080-A
Country: US
Kind code: B2
Filing date: Jun 2, 2022
Priority date: Sep 10, 2020
Publication date: Feb 6, 2024
Grant date: Feb 6, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments are directed to segmentation and hierarchical clustering of video. In an example implementation, a video is ingested to generate a multi-level hierarchical segmentation of the video. In some embodiments, the finest level identifies a smallest interaction unit of the video—semantically defined video segments of unequal duration called clip atoms. Clip atom boundaries are detected in various ways. For example, speech boundaries are detected from audio of the video, and scene boundaries are detected from video frames of the video. The detected boundaries are used to define the clip atoms, which are hierarchically clustered to form a multi-level hierarchical representation of the video. In some cases, the hierarchical segmentation identifies a static, pre-computed, hierarchical set of video segments, where each level of the hierarchical segmentation identifies a complete set (i.e., covering the entire range of the video) of disjoint (i.e., non-overlapping) video segments with a corresponding level of granularity.
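
The pipeline the abstract describes can be sketched roughly as follows. This is a minimal Python illustration, not the patented method: the function names, the sample boundary times, and the rule of merging the shortest adjacent pair are all assumptions made for illustration, since the abstract does not say how clip atoms are clustered. The sketch cuts the timeline at detected speech and scene boundaries to form clip atoms, merges adjacent atoms into coarser levels, and checks that every level is a complete, non-overlapping cover of the video.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def clip_atoms(duration: float, speech: List[float], scene: List[float]) -> List[Segment]:
    """Cut the timeline [0, duration] at every detected boundary."""
    cuts = sorted({t for t in speech + scene if 0.0 < t < duration})
    edges = [0.0] + cuts + [duration]
    return list(zip(edges[:-1], edges[1:]))


def merge_shortest_pair(level: List[Segment]) -> List[Segment]:
    """Merge the adjacent pair with the smallest combined duration.

    Assumption: this merge rule is a stand-in; the source does not specify one.
    """
    i = min(range(len(level) - 1), key=lambda k: level[k + 1][1] - level[k][0])
    return level[:i] + [(level[i][0], level[i + 1][1])] + level[i + 2:]


def build_hierarchy(atoms: List[Segment], sizes: List[int]) -> List[List[Segment]]:
    """Return one level per requested segment count, finest level first.

    `sizes` is expected to be decreasing, e.g. [3, 1].
    """
    levels = [atoms]
    current = atoms
    for target in sizes:
        while len(current) > target:
            current = merge_shortest_pair(current)
        levels.append(current)
    return levels


def is_partition(level: List[Segment], duration: float) -> bool:
    """True if segments are contiguous, non-overlapping, and span [0, duration]."""
    if not level or level[0][0] != 0.0 or level[-1][1] != duration:
        return False
    return all(a[1] == b[0] for a, b in zip(level, level[1:]))


# Hypothetical detector output, in seconds.
atoms = clip_atoms(60.0, speech=[10.0, 25.0, 40.0], scene=[25.0, 50.0])
levels = build_hierarchy(atoms, sizes=[3, 1])
```

Because each coarser level is produced only by merging neighbors from the level below it, every boundary at a coarse level is also a boundary at all finer levels, which is what makes the result a hierarchy rather than a set of unrelated segmentations.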

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: extracting, from a software usage log associated with a video, event boundaries of log events associated with screen capturing, screencasting, or livestreaming the video; generating a representation of a hierarchical segmentation of a video timeline of the video using the event boundaries extracted from the software usage log; and providing at least one level of the hierarchical segmentation of the video timeline for presentation.

2. The method of claim 1, the software usage log generated by creative software during the screen capturing or the screencasting of the video.

3. The method of claim 1, wherein the video is a tutorial for creative software.

4. The method of claim 1, the software usage log generated by a video game during screencasting of the video game.

5. The method of claim 1, wherein the event boundaries are extracted from a log of visual events detected from video frames of the video.

6. The method of claim 1, wherein the software usage log represents interactions between one or more users viewing the video and the video.

7. The method of claim 1, wherein the event boundaries are extracted from a chat stream of chat messages associated with the livestream of the video.

8. The method of claim 1, wherein generating the representation of the hierarchical segmentation of the video timeline comprises forming a level of the hierarchical segmentation by computing an optimal segmentation of the video timeline using a cost function that quantifies cut cost for a candidate segmentation based on different types of cut costs for different types of boundaries in the candidate segmentation.

9. One or more non-transitory computer-readable storage media containing instructions which, when executed by one or more processors, cause the one or more processors to perform operations comprising: extracting, from a software usage log associated with a video, event boundaries of log events associated with screen capturing, screencasting, or livestreaming the video; generating a representation of a hierarchical segmentation of a video timeline of the video using the event boundaries extracted from the software usage log; and providing at least one level of the hierarchical segmentation of the video timeline for presentation.

10. The one or more non-transitory computer-readable storage media of claim 9, the software usage log generated by creative software during the screen capturing or the screencasting of the video.

11. The one or more non-transitory computer-readable storage media of claim 9, wherein the video is a tutorial for creative software.

12. The one or more non-transitory computer-readable storage media of claim 9, the software usage log generated by a video game during the screencasting of the video game.

13. The one or more non-transitory computer-readable storage media of claim 9, wherein the event boundaries are extracted from a log of visual events detected from video frames of the video.

14. The one or more non-transitory computer-readable storage media of claim 9, wherein the software usage log represents interactions between one or more users viewing the video and the video.

15. The one or more non-transitory computer-readable storage media of claim 9, wherein the event boundaries are extracted from a chat stream of chat messages associated with the livestream of the video.

16. The one or more non-transitory computer-readable storage media of claim 9, wherein generating the representation of the hierarchical segmentation of the video timeline comprises forming a level of the hierarchical segmentation by computing an optimal segmentation of the video timeline using a cost function that quantifies cut cost for a candidate segmentation based on different types of cut costs for different types of boundaries in the candidate segmentation.

17. A computing system, comprising: one or more processors; and one or more non-transitory computer-readable storage media containing instructions which, when executed by the one or more processors, cause the one or more processors to perform operations comprising: extracting, from a software usage log associated with a video, event boundaries of log events associated with screen capturing, screencasting, or livestreaming the video; generating a representation of a hierarchical segmentation of a video timeline of the video using the event boundaries extracted from the software usage log; and providing at least one level of the hierarchical segmentation of the video timeline for presentation.

18. The computing system of claim 17, the software usage log generated by creative software during the screen capturing or the screencasting of the video.

19. The computing system of claim 17, wherein the event boundaries are extracted from a log of visual events detected from video frames of the video.

20. The computing system of claim 17, wherein the software usage log represents interactions between one or more users viewing the video and the video.
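
To make the claimed steps more concrete, here is a rough Python sketch of claims 1 and 8 together: pulling candidate boundaries out of a usage log, then choosing the cuts that minimize a cost function with a different cut cost per boundary type. Everything specific in it is an assumption for illustration only: the log format, the event names, the cost values in `CUT_COST`, the squared length penalty, and the use of dynamic programming are not taken from the claims, which state only that different boundary types carry different cut costs.

```python
from typing import Dict, List, Tuple

# Hypothetical log format: (timestamp_seconds, event_name) pairs.
LogEvent = Tuple[float, str]

# Hypothetical per-type cut costs: lower means a more natural place to cut.
CUT_COST: Dict[str, float] = {
    "tool_change": 0.0,
    "layer_select": 2.0,
    "brush_stroke": 5.0,
}


def extract_boundaries(log: List[LogEvent], duration: float) -> List[LogEvent]:
    """Keep in-range events of known types, sorted by time, one per timestamp."""
    seen: Dict[float, str] = {}
    for t, name in sorted(log):
        if 0.0 < t < duration and name in CUT_COST:
            # If two events share a timestamp, keep the cheaper cut.
            if t not in seen or CUT_COST[name] < CUT_COST[seen[t]]:
                seen[t] = name
    return sorted(seen.items())


def optimal_segmentation(
    boundaries: List[LogEvent], duration: float, target_len: float
) -> Tuple[float, List[float]]:
    """Dynamic program over candidate cuts.

    Total cost = sum of per-type cut costs for the chosen cuts
               + sum of (segment_length - target_len)^2 / target_len.
    Returns (total_cost, chosen_cut_times).
    """
    times = [0.0] + [t for t, _ in boundaries] + [duration]
    cuts = [0.0] + [CUT_COST[n] for _, n in boundaries] + [0.0]
    n = len(times)
    best = [float("inf")] * n
    prev = [-1] * n
    best[0] = 0.0
    for j in range(1, n):
        for i in range(j):
            length_penalty = ((times[j] - times[i]) - target_len) ** 2 / target_len
            cost = best[i] + length_penalty + cuts[j]
            if cost < best[j]:
                best[j], prev[j] = cost, i
    # Walk back from the end of the video to recover the chosen cuts.
    chosen: List[float] = []
    j = prev[n - 1]
    while j > 0:
        chosen.append(times[j])
        j = prev[j]
    return best[n - 1], chosen[::-1]


# Hypothetical usage log for a 60-second screencast.
log = [
    (12.0, "tool_change"),
    (20.0, "brush_stroke"),
    (31.0, "layer_select"),
    (45.0, "mouse_move"),   # unknown type, ignored
    (70.0, "tool_change"),  # past the end of the video, ignored
]
boundaries = extract_boundaries(log, duration=60.0)
cost, chosen = optimal_segmentation(boundaries, duration=60.0, target_len=30.0)
```

With a 30-second target length, the sketch picks the single cut at 31.0 seconds: the cut itself is not the cheapest type, but it yields two segments close to the target, so the combined cost is lowest. This produces one level only; how the levels of the hierarchy relate to one another is not covered by this sketch.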

Assignees

Adobe Inc

Inventors

Classifications

  • G06V20/49 (Primary)

    Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes · CPC title

  • Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram · CPC title

  • Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items (segmenting video sequences G06V20/49) · CPC title

  • Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames · CPC title

  • Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M9/10) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11893794B2 cover?
Embodiments are directed to segmentation and hierarchical clustering of video. In an example implementation, a video is ingested to generate a multi-level hierarchical segmentation of the video. In some embodiments, the finest level identifies a smallest interaction unit of the video—semantically defined video segments of unequal duration called clip atoms. Clip atom boundaries are detected in …
Who is the assignee on this patent?
Adobe Inc
What technology area does this patent fall under?
Primary CPC classification G06V20/49. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 6, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).