Method and apparatus for processing video frame

US11375209B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11375209-B2
Application number  US-202016895430-A
Country             US
Kind code           B2
Filing date         Jun 8, 2020
Priority date       Dec 19, 2019
Publication date    Jun 28, 2022
Grant date          Jun 28, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure relate to a method and apparatus for processing a video frame. The method may include: acquiring a sequence of video frames of a video; ascertaining, in the sequence of the video frames, a previous frame, and ascertaining, in the sequence of the video frames, a subsequent frame corresponding to the ascertained previous frame based on acquired number of frames from the previous frame to the subsequent frame. An update step is performed as follows: acquiring object regions detected respectively in the ascertained previous frame and the ascertained subsequent frame, and confidence levels of the object regions; fusing a confidence level of a first object region and a confidence level of a second object region, and updating the confidence level of the second object region based on the fusion result; and updating the ascertained previous frame and the ascertained subsequent frame.
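
The frame-pairing loop described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the patented implementation: the function name `frame_pairs` and the parameter `step` are hypothetical, `step` is assumed to be the index distance from the previous frame to the subsequent frame, and clamping to the last frame is an assumption modeled on claim 3.

```python
from typing import Iterator, Tuple


def frame_pairs(num_frames: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield (previous, subsequent) frame indices, advancing by `step`.

    `step` is a hypothetical stand-in for the "acquired number of frames"
    in the abstract; its exact definition is an assumption of this sketch.
    """
    if num_frames < 2 or step < 1:
        return
    last = num_frames - 1
    prev = 0
    while prev < last:
        # Assumed behavior: if the step runs past the sequence,
        # use the last frame as the subsequent frame.
        nxt = min(prev + step, last)
        yield prev, nxt
        # The subsequent frame becomes the previous frame of the next round.
        prev = nxt


if __name__ == "__main__":
    for prev, nxt in frame_pairs(num_frames=10, step=4):
        print(prev, nxt)
```

Each yielded pair would be handed to the update step (detection, confidence fusion), which this sketch deliberately leaves out.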

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing a video frame, comprising: acquiring a sequence of video frames of a video; ascertaining, in the sequence of the video frames, a previous frame, and ascertaining, in the sequence of the video frames, a subsequent frame corresponding to the ascertained previous frame based on acquired number of frames from the previous frame to the subsequent frame, the acquired number representing the number of frames between the previous frame and the subsequent frame; and performing an update process, comprising: acquiring object regions detected respectively in the ascertained previous frame and the ascertained subsequent frame, and confidence levels of the object regions; ascertaining respectively a first object region and a second object region containing a same object from an object region of the ascertained previous frame and an object region of the ascertained subsequent frame; fusing a confidence level of the first object region and a confidence level of the second object region, and updating the confidence level of the second object region based on the fusion result, the fusion result being obtained by summing a product of a weight of the confidence level of the first object region and the confidence level of the first object region and a product of a weight of the confidence level of the second object region and the confidence level of the second object region; and updating the ascertained previous frame and the ascertained subsequent frame, the updated previous frame being the subsequent frame before the updating, wherein the method further comprises: performing the update process again, in response to the subsequent frame before the updating not being a last frame of the sequence of the video frames, and wherein after updating the confidence level of the second object region based on the fusion result, the update process further comprises: determining whether the same object is present in the object region detected in the ascertained subsequent frame and in the object region detected in the ascertained previous frame; and reducing, in response to determining that the same object is not present in the object region detected in the ascertained subsequent frame and in the object region detected in the ascertained previous frame, the number of frames between the ascertained previous frame and the ascertained subsequent frame, to update the ascertained subsequent frame to be a new subsequent frame according to the reduced number of frames, and performing the update process again by using the ascertained previous frame and the new subsequent frame.

2. The method according to claim 1, wherein the ascertaining respectively the first object region and the second object region containing the same object from the object region of the ascertained previous frame and the object region of the ascertained subsequent frame comprises: ascertaining objects contained in the first object region and an object contained in the second object region being the same object, in response to determining that a degree of overlap between the first object region detected in the ascertained previous frame and the second object region detected in the ascertained subsequent frame is greater than a preset threshold and types of the contained objects are consistent.

3. The method according to claim 1, wherein before the acquiring the object regions detected respectively in the ascertained previous frame and the ascertained subsequent frame, the update process further comprises: changing the ascertained subsequent frame to the last frame of the sequence of the video frames in response to the ascertained subsequent frame not being in a range of the sequence of the video frames.

4. The method according to claim 1, further comprising: ascertaining, from an object region having an updated confidence level in the ascertained subsequent frame, an object region having a confidence level less than a confidence level threshold; and ascertaining the object region having the confidence level less than the confidence level threshold as a non-object region.

5. The method according to claim 1, wherein after updating the confidence level of the second object region based on the fusion result, the update step further comprises: for the acquired video frames, performing, in response to determining that the same object is present in the object region detected in the ascertained subsequent frame and in the object region detected in the ascertained previous frame, linear interpolation processing on a video frame between the ascertained previous frame and the ascertained subsequent frame, to obtain an object region having the same object in the video frame between the ascertained previous frame and the ascertained subsequent frame.

6. The method according to claim 5, wherein after the updating the ascertained previous frame and the ascertained subsequent frame, the update process further comprises: determining whether the ascertained subsequent frame is the last frame of the sequence of the video frames; stopping performing the update process, in response to determining that the ascertained subsequent frame is the last frame of the sequence of the video frames; and performing the update process again, in response to determining that the ascertained subsequent frame is not the last frame of the sequence of the video frames.

7. The method according to claim 5, wherein before the determining whether the same object is present in the object region detected in the ascertained subsequent frame and in the object region detected in the ascertained previous frame, the update process further comprises: determining, in the update process, whether the ascertained previous frame and the ascertained subsequent frame are adjacent frames; determining whether the ascertained subsequent frame is the last frame of the sequence of the video frames, in response to determining that the ascertained previous frame and the ascertained subsequent frame are the adjacent frames; stopping performing the update process, in response to determining that the ascertained subsequent frame is the last frame of the sequence of the video frames; and performing the update process again, in response to determining that the ascertained subsequent frame is not the last frame of the sequence of the video frames.

8. An apparatus for processing a video frame, comprising: at least one processor; and a memory storing instructions, the instructions when executed by the at least one processor, causing the at least one processor to perform operations, the operations comprising: acquiring a sequence of video frames of a video; ascertaining, in the sequence of the video frames, a previous frame, and ascertain, in the sequence of the video frames, a subsequent frame corresponding to the ascertained previous frame based on acquired number of frames from the previous frame to the subsequent frame, the acquired number representing the number of frames between the previous frame and the subsequent frame; performing an update process, comprising: acquiring object regions detected respectively in the ascertained previous frame and the ascertained subsequent frame, and confidence levels of the object regions; ascertaining respectively a first object region and a second object region containing a same object from an object region of the ascertained previous frame and an object region of the ascertained subsequent frame; fusing a confidence level of the first object region and a confidence level of the second object region, and updating the confidence level of the second object region based on the fusion result, the fusion result being obtained by summing a product of a weight of the confidence
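
The per-region operations named in claims 1, 2, 4 and 5 can be illustrated with a short sketch. Everything here is an assumption made for illustration: the claims do not fix a box format, an overlap measure, weight values, or thresholds. The sketch uses axis-aligned boxes `(x1, y1, x2, y2)`, intersection-over-union as the "degree of overlap", equal weights of 0.5, and hypothetical function names.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union, used here as one possible overlap measure."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def same_object(box_a: Box, type_a: str, box_b: Box, type_b: str,
                overlap_threshold: float = 0.5) -> bool:
    """Claim 2 style test: overlap above a threshold and matching types."""
    return type_a == type_b and iou(box_a, box_b) > overlap_threshold


def fuse_confidence(conf_prev: float, conf_next: float,
                    w_prev: float = 0.5, w_next: float = 0.5) -> float:
    """Claim 1 style fusion: weighted sum of the two confidence levels.

    The weight values are assumed; the claim only requires a weighted sum.
    """
    return w_prev * conf_prev + w_next * conf_next


def is_object(conf: float, conf_threshold: float = 0.5) -> bool:
    """Claim 4 style filter: below the threshold means non-object region."""
    return conf >= conf_threshold


def interpolate_box(box_prev: Box, box_next: Box, t: float) -> Box:
    """Claim 5 style linear interpolation; t in [0, 1] is the relative
    position of the in-between frame (0 = previous, 1 = subsequent)."""
    return tuple(p + (n - p) * t for p, n in zip(box_prev, box_next))


if __name__ == "__main__":
    prev_box, next_box = (0.0, 0.0, 2.0, 2.0), (0.5, 0.0, 2.5, 2.0)
    if same_object(prev_box, "car", next_box, "car"):
        fused = fuse_confidence(0.4, 0.8)
        print(is_object(fused), interpolate_box(prev_box, next_box, 0.5))
```

Under these assumptions, a weak detection in the subsequent frame is pulled up by a strong matching detection in the previous frame, and vice versa, which is the smoothing effect the weighted sum is meant to provide.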

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

  • Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs · CPC title

  • H04N19/172 · Primary

    the region being a picture, frame or field · CPC title

  • Video; Image sequence · CPC title

  • G06T7/246 · Primary

    using feature-based methods, e.g. the tracking of corners or segments · CPC title

  • involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream (arrangements characterised by components specially adapted for monitoring, identification or recognition of video in broadcast systems H04H60/59) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11375209B2 cover?
Embodiments of the present disclosure relate to a method and apparatus for processing a video frame. The method may include: acquiring a sequence of video frames of a video; ascertaining, in the sequence of the video frames, a previous frame, and ascertaining, in the sequence of the video frames, a subsequent frame corresponding to the ascertained previous frame based on acquired number of fram…
Who is the assignee on this patent?
Beijing Baidu Netcom Sci & Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification H04N19/172. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Jun 28, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).