System for automatic video reframing
US-11184558-B1 · Nov 23, 2021 · US
US2022207851A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022207851-A1 |
| Application number | US-202217698309-A |
| Country | US |
| Kind code | A1 |
| Filing date | Mar 18, 2022 |
| Priority date | Dec 28, 2020 |
| Publication date | Jun 30, 2022 |
| Grant date | — |
Abstract
A system and a method for an automatic video reconstruction to improve scene quality using a dynamic point of interest by finding a point or line of interest are provided. The method includes dividing a first video into a plurality of first frames; determining a first object of interest in the plurality of first frames; converting the plurality of first frames into a plurality of second frames based on the first object of interest; and reconstructing the first video into a second video based on the plurality of second frames.
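The four steps named in the abstract (divide, determine, convert, reconstruct) can be pictured with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the publication: frames are plain nested lists of grayscale values, the object of interest is approximated by the brightest pixel, and conversion is a clamped crop around it. The function names are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # assumed representation: rows of grayscale pixel values


def divide_into_frames(video: List[Frame]) -> List[Frame]:
    """Step 1: divide the first video into first frames (copied so the input is untouched)."""
    return [[row[:] for row in frame] for frame in video]


def determine_object_of_interest(frame: Frame) -> Tuple[int, int]:
    """Step 2 (stand-in): use the brightest pixel as the object of interest."""
    best_pos, best_val = (0, 0), frame[0][0]
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            if value > best_val:
                best_pos, best_val = (r, c), value
    return best_pos


def convert_frame(frame: Frame, center: Tuple[int, int], out_h: int, out_w: int) -> Frame:
    """Step 3 (stand-in): crop an out_h x out_w window around the object, clamped to the frame."""
    height, width = len(frame), len(frame[0])
    top = min(max(center[0] - out_h // 2, 0), height - out_h)
    left = min(max(center[1] - out_w // 2, 0), width - out_w)
    return [row[left:left + out_w] for row in frame[top:top + out_h]]


def reconstruct_video(video: List[Frame], out_h: int, out_w: int) -> List[Frame]:
    """Step 4: the second video is the sequence of converted (second) frames."""
    first_frames = divide_into_frames(video)
    return [
        convert_frame(f, determine_object_of_interest(f), out_h, out_w)
        for f in first_frames
    ]
```

A real system would replace the brightest-pixel heuristic with detection, tracking, and foreground/background classification as the dependent claims describe; the sketch only shows how the four stages hand data to one another.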
Claims
1. A method of automatically generating video reconstruction, the method comprising: dividing a first video into a plurality of first frames; determining a first object of interest in the plurality of first frames; converting the plurality of first frames into a plurality of second frames based on the first object of interest; and reconstructing the first video into a second video based on the plurality of second frames.
2. The method according to claim 1, wherein the dividing the first video comprises: dividing the first video into a plurality of scenes based on images included in the first video or a text externally input, and wherein the determining the first object of interest comprises: detecting a second object included in the plurality of scenes and tracking the second object; and classifying a foreground and a background in the plurality of scenes, and determining the second object as the first object of interest based on a result of the classifying.
3. The method according to claim 2, wherein the dividing the first video into the plurality of scenes comprises: detecting voices included in the plurality of first frames through automatic speech recognition (ASR), and converting the voices into text; dividing the images included in the plurality of first frames based on at least one of a color, a shape, or a gradation of each of the images; and generating a feature vector for each of the converted text and the divided images, and dividing the first video into the plurality of scenes based on the feature vector.
4. The method according to claim 1, wherein the determining the first object of interest comprises: determining the first object of interest based on an intent recognition and an entity recognition.
5. The method according to claim 1, wherein the converting the plurality of first frames comprises: extracting at least one of a point of interest or a line of interest for a third object included in a first frame of the plurality of first frames; and cutting the third object included in the first frame or reconstructing the first frame based on the at least one of the point of interest or the line of interest.
6. The method according to claim 5, wherein the reconstructing the first frame comprises: fitting a template to the first frame, the template including five points and three straight lines; and moving the template such that the point of interest or the line of interest is adjacent to or coincides with the five points or the three straight lines.
7. The method according to claim 1, wherein the converting the plurality of first frames comprises: removing a partial region of a first frame of the plurality of first frames; generating a second frame of the plurality of second frames by painting a missing area resulted from removal of the partial region; and arranging adjacent second frames by applying in-painting and flow estimation to the plurality of second frames.
8. A system for automatically generating video reconstruction, the system comprising: a display configured to output a first video, and output a second video in which the first video is reconstructed; and a processor configured to process data for the first video and reconstruct the second video, wherein the processor is further configured to divide the first video into a plurality of first frames, determine a first object of interest from the plurality of first frames, and divide the plurality of first frames into a plurality of second frames based on the first object of interest, and reconstruct the first video into the second video based on the plurality of second frames.
9. The system according to claim 8, wherein the processor is further configured to divide the first video into a plurality of scenes based on images included in the first video or a text externally input; detect a second object included in the plurality of scenes and tracking the second object; and classify a foreground and a background in the plurality of scenes, and determining the second object as the first object of interest based on a result of classification.
10. The system according to claim 9, wherein the processor is further configured to detect voices included in the plurality of first frames through automatic speech recognition (ASR), and converting the voices into text, divide the images included in the plurality of first frames based on at least one of a color, a shape, or a gradation of each of the images; and generate a feature vector for each of the converted text and the divided images, and divide the first video into the plurality of scenes based on the feature vector.
11. The system according to claim 8, wherein the processor is further configured to determine the first object of interest based on an intent recognition and an entity recognition.
12. The system according to claim 8, wherein the processor is further configured to extract at least one of a point of interest or a line of interest for a third object included in a first frame of the plurality of first frames; and cut the third object included in the first frame or reconstructing the first frame based on the at least one of the point of interest or the line of interest.
13. The system according to claim 12, wherein the processor is further configured to fit a template to the first frame, the template including five points and three straight lines; and move the template such that the point of interest or the line of interest is adjacent to or coincides with the five points or the three straight lines.
14. The system according to claim 8, wherein the processor is further configured to remove a partial region of a first frame of the plurality of first frames, generate a second frame of the plurality of second frames by painting a missing area resulted from removal of the partial region; and arrange adjacent second frames by applying in-painting and flow estimation to the plurality of second frames.
15. A computer program product comprising a non-transitory computer-readable medium storing instructions that, when executed by at least one hardware processor, cause the at least one hardware processor to perform operations comprising: dividing a first video into a plurality of first frames; determining an object of interest in the plurality of first frames; converting the plurality of first frames into a plurality of second frames based on the object of interest; and reconstructing the first video into a second video based on the plurality of second frames.
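Claims 6 and 13 recite fitting a template of five points and three straight lines to a frame and moving it until a point or line of interest is adjacent to or coincides with the template. The claim text does not say where those points and lines sit, so the following Python sketch assumes a layout purely for illustration: the frame centre plus the four rule-of-thirds intersections as the five points, and three vertical lines at one third, one half, and two thirds of the width. The names `Template`, `make_template`, and `move_to_point` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


@dataclass
class Template:
    points: List[Point]   # five anchor points (layout assumed, not from the claims)
    lines_x: List[float]  # three vertical lines, stored as x positions (assumed)


def make_template(width: float, height: float) -> Template:
    """Fit the assumed template to a frame of the given size."""
    xs = (width / 3, 2 * width / 3)
    ys = (height / 3, 2 * height / 3)
    points = [(width / 2, height / 2)] + [(x, y) for x in xs for y in ys]
    lines_x = [width / 3, width / 2, 2 * width / 3]
    return Template(points, lines_x)


def move_to_point(template: Template, poi: Point) -> Tuple[Template, Point]:
    """Translate the template so its nearest anchor point coincides with the point of interest."""
    nearest = min(
        template.points,
        key=lambda p: (p[0] - poi[0]) ** 2 + (p[1] - poi[1]) ** 2,
    )
    dx, dy = poi[0] - nearest[0], poi[1] - nearest[1]
    moved = Template(
        points=[(x + dx, y + dy) for x, y in template.points],
        lines_x=[x + dx for x in template.lines_x],
    )
    return moved, (dx, dy)
```

The returned shift `(dx, dy)` is what a cropping step could use to reposition the output window; choosing the nearest anchor keeps the move as small as possible, which is one plausible design choice rather than something the claims require. A line of interest could be handled the same way by snapping it to the closest entry of `lines_x`.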
Recognition using electronic means · CPC title
Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes · CPC title
Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames · CPC title
based on user input or interaction · CPC title
Surveillance or monitoring of activities, e.g. for recognising suspicious objects (recognising microscopic objects G06V20/69) · CPC title