System and method for automatic video reconstruction with dynamic point of interest

US2022207851A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2022207851-A1
Application number  US-202217698309-A
Country             US
Kind code           A1
Filing date         Mar 18, 2022
Priority date       Dec 28, 2020
Publication date    Jun 30, 2022
Grant date          (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and a method for an automatic video reconstruction to improve scene quality using a dynamic point of interest by finding a point or line of interest are provided. The method includes dividing a first video into a plurality of first frames; determining a first object of interest in the plurality of first frames; converting the plurality of first frames into a plurality of second frames based on the first object of interest; and reconstructing the first video into a second video based on the plurality of second frames.
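The four steps named in the abstract can be pictured with a toy sketch. This is only an illustration under stated assumptions, not the method disclosed in the patent: a "video" is assumed to be a list of small grayscale frames, the object of interest is approximated by a brightness threshold, and the conversion step is a plain crop. All function names here are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]          # grayscale frame as rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def divide_into_frames(video: List[Frame]) -> List[Frame]:
    """Step 1: the toy video is already a list of frames, so copy it."""
    return [[row[:] for row in frame] for frame in video]


def find_object_of_interest(frame: Frame, threshold: int = 128) -> Box:
    """Step 2 (assumed stand-in): bounding box of pixels brighter than threshold."""
    rows = [r for r, row in enumerate(frame) if any(p > threshold for p in row)]
    cols = [c for row in frame for c, p in enumerate(row) if p > threshold]
    if not rows:
        # No bright pixel found: fall back to the whole frame.
        return (0, 0, len(frame), len(frame[0]))
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)


def convert_frame(frame: Frame, box: Box) -> Frame:
    """Step 3 (assumed stand-in): crop the first frame to the object of interest."""
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]


def reconstruct_video(video: List[Frame]) -> List[Frame]:
    """Step 4: reassemble the converted (second) frames in their original order."""
    first_frames = divide_into_frames(video)
    return [convert_frame(f, find_object_of_interest(f)) for f in first_frames]


video = [
    [[0, 0, 0, 0],
     [0, 200, 210, 0],
     [0, 0, 0, 0]],
    [[0, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 220, 230]],
]
second_video = reconstruct_video(video)
print(second_video)  # [[[200, 210]], [[220, 230]]]
```

A real system would replace the threshold with object detection and tracking, and the crop with the template-based reframing and in-painting described in the claims.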

First claim

Opening claim text (preview).

1. A method of automatically generating video reconstruction, the method comprising: dividing a first video into a plurality of first frames; determining a first object of interest in the plurality of first frames; converting the plurality of first frames into a plurality of second frames based on the first object of interest; and reconstructing the first video into a second video based on the plurality of second frames.

2. The method according to claim 1, wherein the dividing the first video comprises: dividing the first video into a plurality of scenes based on images included in the first video or a text externally input, and wherein the determining the first object of interest comprises: detecting a second object included in the plurality of scenes and tracking the second object; and classifying a foreground and a background in the plurality of scenes, and determining the second object as the first object of interest based on a result of the classifying.

3. The method according to claim 2, wherein the dividing the first video into the plurality of scenes comprises: detecting voices included in the plurality of first frames through automatic speech recognition (ASR), and converting the voices into text; dividing the images included in the plurality of first frames based on at least one of a color, a shape, or a gradation of each of the images; and generating a feature vector for each of the converted text and the divided images, and dividing the first video into the plurality of scenes based on the feature vector.

4. The method according to claim 1, wherein the determining the first object of interest comprises: determining the first object of interest based on an intent recognition and an entity recognition.

5. The method according to claim 1, wherein the converting the plurality of first frames comprises: extracting at least one of a point of interest or a line of interest for a third object included in a first frame of the plurality of first frames; and cutting the third object included in the first frame or reconstructing the first frame based on the at least one of the point of interest or the line of interest.

6. The method according to claim 5, wherein the reconstructing the first frame comprises: fitting a template to the first frame, the template including five points and three straight lines; and moving the template such that the point of interest or the line of interest is adjacent to or coincides with the five points or the three straight lines.

7. The method according to claim 1, wherein the converting the plurality of first frames comprises: removing a partial region of a first frame of the plurality of first frames; generating a second frame of the plurality of second frames by painting a missing area resulted from removal of the partial region; and arranging adjacent second frames by applying in-painting and flow estimation to the plurality of second frames.

8. A system for automatically generating video reconstruction, the system comprising: a display configured to output a first video, and output a second video in which the first video is reconstructed; and a processor configured to process data for the first video and reconstruct the second video, wherein the processor is further configured to divide the first video into a plurality of first frames, determine a first object of interest from the plurality of first frames, and divide the plurality of first frames into a plurality of second frames based on the first object of interest, and reconstruct the first video into the second video based on the plurality of second frames.

9. The system according to claim 8, wherein the processor is further configured to divide the first video into a plurality of scenes based on images included in the first video or a text externally input; detect a second object included in the plurality of scenes and tracking the second object; and classify a foreground and a background in the plurality of scenes, and determining the second object as the first object of interest based on a result of classification.

10. The system according to claim 9, wherein the processor is further configured to detect voices included in the plurality of first frames through automatic speech recognition (ASR), and converting the voices into text, divide the images included in the plurality of first frames based on at least one of a color, a shape, or a gradation of each of the images; and generate a feature vector for each of the converted text and the divided images, and divide the first video into the plurality of scenes based on the feature vector.

11. The system according to claim 8, wherein the processor is further configured to determine the first object of interest based on an intent recognition and an entity recognition.

12. The system according to claim 8, wherein the processor is further configured to extract at least one of a point of interest or a line of interest for a third object included in a first frame of the plurality of first frames; and cut the third object included in the first frame or reconstructing the first frame based on the at least one of the point of interest or the line of interest.

13. The system according to claim 12, wherein the processor is further configured to fit a template to the first frame, the template including five points and three straight lines; and move the template such that the point of interest or the line of interest is adjacent to or coincides with the five points or the three straight lines.

14. The system according to claim 8, wherein the processor is further configured to remove a partial region of a first frame of the plurality of first frames, generate a second frame of the plurality of second frames by painting a missing area resulted from removal of the partial region; and arrange adjacent second frames by applying in-painting and flow estimation to the plurality of second frames.

15. A computer program product comprising a non-transitory computer-readable medium storing instructions that, when executed by at least one hardware processor, cause the at least one hardware processor to perform operations comprising: dividing a first video into a plurality of first frames; determining an object of interest in the plurality of first frames; converting the plurality of first frames into a plurality of second frames based on the object of interest; and reconstructing the first video into a second video based on the plurality of second frames.
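Claims 6 and 13 describe fitting a template of five points and three straight lines to a frame and moving it until a point or line of interest lands on the template. The claim text shown here does not say where those points and lines sit, so the sketch below assumes a layout purely for illustration: the frame centre plus the four rule-of-thirds intersections as the five points, and three horizontal lines at one third, one half, and two thirds of the frame height. The function names and the "nearest element" rule are likewise assumptions, not the patent's disclosed procedure.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


def make_template(width: float, height: float) -> Tuple[List[Point], List[float]]:
    """Assumed layout: five points and three horizontal lines (given as y values)."""
    xs = (width / 3, 2 * width / 3)
    ys = (height / 3, 2 * height / 3)
    points = [(width / 2, height / 2)] + [(x, y) for y in ys for x in xs]
    lines = [height / 3, height / 2, 2 * height / 3]
    return points, lines


def shift_to_nearest_point(poi: Point, points: List[Point]) -> Point:
    """Translation (dx, dy) that moves the nearest template point onto the POI."""
    nearest = min(points, key=lambda p: hypot(p[0] - poi[0], p[1] - poi[1]))
    return (poi[0] - nearest[0], poi[1] - nearest[1])


def shift_to_nearest_line(line_y: float, lines: List[float]) -> float:
    """Vertical shift that moves the nearest template line onto a horizontal
    line of interest located at height line_y."""
    nearest = min(lines, key=lambda y: abs(y - line_y))
    return line_y - nearest


points, lines = make_template(900, 600)
point_shift = shift_to_nearest_point((620, 180), points)
line_shift = shift_to_nearest_line(310, lines)
print(point_shift)  # (20.0, -20.0)
print(line_shift)   # 10.0
```

In this reading, the crop window travels with the template, so applying the computed shift reframes the shot so that the point of interest sits on a template point. Handling a line of interest that is not horizontal would need a different distance measure.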

Assignees

  • Samsung Electronics Co Ltd

Inventors

Classifications

  • Recognition using electronic means · CPC title

  • Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes · CPC title

  • Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames · CPC title

  • G06V10/235 · Primary

    based on user input or interaction · CPC title

  • Surveillance or monitoring of activities, e.g. for recognising suspicious objects (recognising microscopic objects G06V20/69) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022207851A1 cover?
A system and a method for an automatic video reconstruction to improve scene quality using a dynamic point of interest by finding a point or line of interest are provided. The method includes dividing a first video into a plurality of first frames; determining a first object of interest in the plurality of first frames; converting the plurality of first frames into a plurality of second frames …
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06V10/235. Mapped technology areas include Physics.
When was this patent published?
Published Jun 30, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).