Method for generating video synopsis through scene understanding and system therefor

US11620335B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11620335-B2
Application number: US-202016995835-A
Country: US
Kind code: B2
Filing date: Aug 18, 2020
Priority date: Sep 17, 2019
Publication date: Apr 4, 2023
Grant date: Apr 4, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments relate to a method for generating a video synopsis including receiving a user query; performing an object based analysis of a source video; and generating a synopsis video in response to a video synopsis generation request from a user, and a system therefor. The video synopsis generated by the embodiments reflects the user's desired interaction.

First claim

Opening claim text (preview).

What is claimed is:

1. A system for generating a video synopsis, comprising: at least one processor; and a non-transitory memory storing instructions which, when executed by the at least one processor, cause the at least one processor to: detect at least one source object in a source video including at least one object; detect one or more motion of the source object in the source video; generate one or more source object tube including the source object on which the motion is detected; determine an interaction associated with the source object of the tube through an interaction determination model pre-learned to determine interactions associated with an object included in an input image, the interaction determination model comprising a convolution network; and generate one or more video synopsis based on the source object tube associated with the determined interaction.

2. The system for generating a video synopsis according to claim 1, wherein the instructions, when executed by the at least one processor, further cause the at least one processor to detect the source object through an object detection model, and the object detection model is pre-learned to extract one or more feature for detecting an object from an input image and determine a class corresponding to the object included in the input image.

3. The system for generating a video synopsis according to claim 2, wherein the object detection model is configured to detect the source object by extracting the feature for detecting the object from the input image and determining the class to which each pixel belongs.

4. The system for generating a video synopsis according to claim 3, wherein the object detection model includes: a first submodel to determine a region of interest (ROI) by detecting a position of the object in the input image; and a second submodel to mask the object included in the ROI.

5. The system for generating a video synopsis according to claim 1, wherein the instructions, when executed by the at least one processor, further cause the at least one processor to: extract a background from the source video.

6. The system for generating a video synopsis according to claim 5, wherein the instructions, when executed by the at least one processor, further cause the at least one processor to: extract the background through a background detection model that determines at least one class regarded as the background for each pixel.

7. The system for generating a video synopsis according to claim 5, wherein the instructions, when executed by the at least one processor, further cause the at least one processor to: extract the background by cutting a region occupied by the source object in the source video.

8. The system for generating a video synopsis according to claim 5, further comprising: a background database (DB) to store the extracted background of the source video.

9. The system for generating a video synopsis according to claim 1, wherein the instructions, when executed by the at least one processor, further cause the at least one processor to: compute tracking information of the source object by tracking a specific object in a subset of frames in which the specific source object is detected.

10. The system for generating a video synopsis according to claim 9, wherein tracking information of the source object includes at least one of whether moving or not, a velocity, a speed, and a direction.

11. The system for generating a video synopsis according to claim 1, wherein the instructions, when executed by the at least one processor, further cause the at least one processor to: generate the source object tube based on a subset of frames representing an activity of the source object, or a combination of the subset of frames representing the activity of the source object and a subset of frames representing the source object.

12. The system for generating a video synopsis according to claim 11, wherein the instructions, when executed by the at least one processor, further cause the at least one processor to: filter a region including the source object in the frame of the source video, and generate the source object tube in which at least part of background of the source video is removed.

13. The system for generating a video synopsis according to claim 1, wherein the instructions, when executed by the at least one processor, further cause the at least one processor to: when a image region including the source object is extracted as a result of the detection of the source object, generate the source object tube using the extracted image region instead of the filtering.

14. The system for generating a video synopsis according to claim 1, wherein the interaction determination model is learned to preset an associable interaction class with the object, wherein the said object is a subject of an action that triggers the interaction.

15. The system for generating a video synopsis according to claim 14, wherein the interaction determination model is further configured to: receive image having a size including a first object as the input image and extract a first feature, receive image having a size including a region including the first object and a different region as the input image and extract a second feature, and determine interaction associated with the first object based on the first feature and the second feature.

16. The system for generating a video synopsis according to claim 15, wherein the different region includes a region including a second object that is different from the first object, or a background.

17. The system for generating a video synopsis according to claim 15, wherein the first feature is a feature extracted to detect the source object.

18. The system for generating a video synopsis according to claim 1, wherein the interaction determination model is configured to determine a class of the interaction by detecting an activity of a specific object which is a subject of an action triggering the interaction and detecting a different element associated with the interaction.

19. The system for generating a video synopsis according to claim 18, wherein the interaction determination model includes: an activity detection network to detect the activity of the specific object in the input image; and an object detection network to detect an object that is different from the activity object by extracting a feature from the input image, the activity detection network is configured to extract the feature for determining a class of the activity appearing in the video, and the object detection network is configured to extract the feature for determining a class of the object appearing in the video.

20. The system for generating a video synopsis according to claim 19, wherein the feature for determining the class of the activity includes a pose feature, and the feature for determining the class of the object includes an appearance feature.

21. The system for generating a video synopsis according to claim 19, wherein the interaction determination model is further configured to: link a set of values computed by the activity detection network and a set of values computed by the object detection network to generate an interaction matrix, and determine the interaction associated with the specific object in the input image based on the activity and the object corresponding to a row and a column of an element having a highest value among elements of the interaction matrix.
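
Illustrative sketch (not part of the patent text). Claim 21 describes linking the values from the activity detection network with the values from the object detection network into an interaction matrix, then reading the interaction off the row and column of the highest element. The claim does not say how the two sets of values are linked; the sketch below assumes an outer product, with activities as rows and objects as columns. The function names, labels, and scores are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def interaction_matrix(
    activity_scores: Sequence[float], object_scores: Sequence[float]
) -> List[List[float]]:
    """Build a matrix with one row per activity and one column per object.

    Assumption: the two score sets are linked by an outer product.
    The claim only says the sets are "linked", so this is one possible reading.
    """
    return [[a * o for o in object_scores] for a in activity_scores]


def top_interaction(
    activity_labels: Sequence[str],
    activity_scores: Sequence[float],
    object_labels: Sequence[str],
    object_scores: Sequence[float],
) -> Tuple[str, str, float]:
    """Return the (activity, object, value) of the highest matrix element."""
    matrix = interaction_matrix(activity_scores, object_scores)
    # Scan every (row, column) pair and keep the one with the largest value.
    best_row, best_col = max(
        ((r, c) for r in range(len(matrix)) for c in range(len(matrix[r]))),
        key=lambda rc: matrix[rc[0]][rc[1]],
    )
    return (
        activity_labels[best_row],
        object_labels[best_col],
        matrix[best_row][best_col],
    )


# Hypothetical labels and scores, for illustration only.
activities = ["ride", "hold", "sit on"]
activity_scores = [0.7, 0.2, 0.1]
objects = ["bicycle", "cup", "bench"]
object_scores = [0.6, 0.1, 0.3]

activity, obj, score = top_interaction(
    activities, activity_scores, objects, object_scores
)
print(activity, obj)
```

With these made-up scores, the highest element sits in the "ride" row and the "bicycle" column, so the selected interaction is "ride bicycle". A real system would take the scores from trained networks rather than hand-written lists.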

Assignees

Korea Inst Sci & Tech

Inventors

Classifications

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]

  • Detecting features for summarising video content

  • Creating video summaries, e.g. movie trailer {(retrieval in video databases by using presentations in form of a video summary G06F16/739)}

  • Classification techniques

  • using classification, e.g. of video objects

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11620335B2 cover?
Embodiments relate to a method for generating a video synopsis including receiving a user query; performing an object based analysis of a source video; and generating a synopsis video in response to a video synopsis generation request from a user, and a system therefor. The video synopsis generated by the embodiments reflects the user's desired interaction.
Who is the assignee on this patent?
Korea Inst Sci & Tech
What technology area does this patent fall under?
Primary CPC classification H04N21/8549. Mapped technology areas include Electricity.
When was this patent published?
Publication date Apr 4, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).