Method and apparatus for determining region of interest

US11659181B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11659181-B2
Application number  US-202016892910-A
Country             US
Kind code           B2
Filing date         Jun 4, 2020
Priority date       Dec 19, 2019
Publication date    May 23, 2023
Grant date          May 23, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure relate to a method and apparatus for processing a video. The method may include: acquiring object regions obtained by performing object detection on a target video frame, a type of an object in each of the object regions being a preset type; determining, for an object region in the acquired object regions, in response to determining that the object region satisfies a preset condition, that the object region is a non-ROI; using an object region other than the non-ROI in the object regions of the target video frame as a ROI; and acquiring a quantization parameter change corresponding to each ROI, and encoding the target video frame based on the quantization parameter change.
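The flow in the abstract can be illustrated with a minimal sketch. It assumes the "preset condition" is the area-ratio test described in claim 2 (a region covering more than a preset share of the frame is a non-ROI). The `Region` class, the `QP_DELTA_BY_TYPE` table, its values, and the 0.5 threshold are hypothetical choices for illustration, not taken from the patent. Detection and encoding are left out; the sketch only shows the ROI filtering and the per-ROI quantization parameter change lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A detected object region (hypothetical structure for illustration)."""
    kind: str  # preset object type, e.g. "head", "body", "text"
    x: int
    y: int
    w: int
    h: int

    def ratio(self, frame_w: int, frame_h: int) -> float:
        """Fraction of the frame area covered by this region."""
        return (self.w * self.h) / (frame_w * frame_h)


# Assumed per-type QP changes. In common video codecs a lower QP means
# finer quantization, so negative values here raise quality inside the ROI.
QP_DELTA_BY_TYPE = {"head": -6, "body": -4, "text": -3}


def split_rois(regions, frame_w, frame_h, preset_ratio=0.5):
    """Return (rois, non_rois) using the area-ratio test as the preset condition."""
    rois, non_rois = [], []
    for region in regions:
        if region.ratio(frame_w, frame_h) > preset_ratio:
            non_rois.append(region)  # too large to be worth extra bits
        else:
            rois.append(region)
    return rois, non_rois


def qp_changes(rois):
    """Pair each ROI with the QP change assumed for its object type."""
    return [(roi, QP_DELTA_BY_TYPE[roi.kind]) for roi in rois]


if __name__ == "__main__":
    detections = [
        Region("head", 900, 100, 100, 100),
        Region("body", 100, 40, 1600, 1000),  # covers most of a 1920x1080 frame
    ]
    rois, non_rois = split_rois(detections, 1920, 1080)
    for roi, delta in qp_changes(rois):
        print(roi.kind, delta)
```

An encoder integration would then apply each change to the blocks that overlap the corresponding ROI; how that is done depends on the encoder and is not shown here.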

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing a video, the method comprising: acquiring object regions obtained by performing object detection on a target video frame, a type of an object in each of the object regions being a preset type; determining, for an object region in the acquired object regions, in response to determining that the object region satisfies a preset condition, that the object region is a non-ROI (region of interest); using object regions other than the non-ROI in the object regions of the target video frame as ROIs; and acquiring a quantization parameter change corresponding to each of the ROIs, and encoding the target video frame based on the quantization parameter change corresponding to each of the ROIs; wherein acquiring the quantization parameter change corresponding to each of the ROIs, comprises: determining, for each of the ROIs in the target video frame, the quantization parameter change corresponding to each of the ROIs based on the type of the object in each of the ROIs, the type of the object in each of the ROIs being one of at least one preset type, by: acquiring a maximum value of a sum of increase ratios of code rates of types of ROIs in the target video frame, wherein the maximum value of the sum of the increase ratios of the code rates is obtained relative to types of ROIs obtained by encoding based on a specified quantization parameter of an encoder; acquiring a preset constant corresponding to the type of the object in each of the ROIs, wherein different preset types correspond to different preset constants, and the quantization parameter change is determined based on the preset constant and a constant coefficient; and determining a quantization parameter change corresponding to each type of ROI in the target video frame based on the maximum value of the sum of the increase ratios of the code rates, the preset constant corresponding to each type of ROI, and a ratio of each type of ROI to the target video frame.

2. The method according to claim 1, wherein the determining, for the object region in the acquired object regions, in response to determining that the object region satisfies the preset condition, that the object region is the non-ROI, comprises: determining, for the object region in the acquired object regions, whether a ratio of the object region to the target video frame is greater than a preset ratio; and determining that the object region is the non-ROI in response to determining that the ratio of the object region to the target video frame is greater than the preset ratio.

3. The method according to claim 2, wherein the determining that the object region is the non-ROI in response to determining that the ratio of the object region to the target video frame is greater than the preset ratio, comprises: determining, in response to determining that the ratio of the object region to the target video frame is greater than the preset ratio and the type of the object in the object region is a preset type with a highest priority, that the object regions in the target video frame are non-ROIs, a priority being used to represent a priority for determining each type of object region as a ROI.

4. The method according to claim 1, wherein, for object regions of a same type, an object region having a larger area has a higher priority, and a priority is used to represent a priority for determining each type of object region as a ROI.

5. The method according to claim 1, wherein, for object regions respectively containing a head and a body of a same person, in response to determining that a ratio of the object region containing the head to the target video frame is greater than a specified threshold, the object region containing the head has a priority higher than a priority of the object region containing the body, or in response to determining that the ratio of the object region containing the head to the target video frame is not greater than the specified threshold, the object region containing the body has a priority higher than a priority of the object region containing the head, both types of the head and the body are preset types.

6. The method according to claim 1, wherein the determining, for the object region in the acquired object regions, in response to determining that the object region satisfies the preset condition, that the object region is the non-ROI, comprises: matching two object regions respectively containing a head and a body of a same person in the object regions as an associated region group, based on a positional relationship between the object regions and types of objects in the object regions; and determining, for a first associated region group, in response to determining that a ratio of a head in an object region of a head type in the first associated region group to the target video frame exceeds a specified threshold, that another object region of the first associated region group is the non-ROI, or in response to determining that the ratio of the head in the object region of the head type to the target video frame does not exceed the specified threshold, that the object region of the head type is the non-ROI.

7. The method according to claim 1, wherein the determining, for the object region in the acquired object regions, in response to determining that the object region satisfies the preset condition, that the object region is the non-ROI, comprises: determining, for the object region in the acquired object regions, a priority of the object region based on a type of the object region; and determining, in response to determining that the object region satisfies the preset condition based on the priority of the object region, that the object region is the non-ROI.

8. The method according to claim 7, wherein the determining, in response to determining that the object region satisfies the preset condition based on the priority of the object region, that the object region is the non-ROI, comprises: selecting, for the object regions, a preset number of the object regions according to priorities of the object regions in a descending order; and using an object region other than the selected object regions as the non-ROI, in response to determining that ratios of the selected object regions to the target video frame do not exceed a preset ratio.

9. The method according to claim 1, wherein both a head type and a body type belong to a human body category; and the determining, for the object region in the acquired object regions, in response to determining that the object region satisfies the preset condition, that the object region is the non-ROI, comprises: acquiring, in response to the target video frame containing an object region of the human body category and an object region of a text type, a target quantization parameter change corresponding to each object region of the human body category in the target video frame, wherein the target quantization parameter change is determined based on a preset constant and a specified constant coefficient, and different preset types correspond to different preset constants; predicting, for each object region of the human body category in the target video frame, based on the target quantization parameter change corresponding to the object region and a ratio of the object region to the target video frame, an increase ratio of the object region relative to an object region obtained by encoding based on a specified quantization parameter of an encoder, and determining a sum of increase ratios of code rates of each object region of the human body category; and determining, in response to the sum of the increase ratios of the code rates of each object region of the human body category exceeding a preset increase
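The head and body rule in claim 6 can be sketched as follows. The claim only says that regions are matched "based on a positional relationship" and object types, so the matching rule used here (a head belongs to the first unused body whose box contains the head's center) is an assumption, as are the `Box` class, the function names, and the 0.05 threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A detected object region (hypothetical structure for illustration)."""
    kind: str  # "head" or "body" in this sketch
    x: float
    y: float
    w: float
    h: float

    @property
    def area(self) -> float:
        return self.w * self.h


def contains_center(outer: Box, inner: Box) -> bool:
    """True if the center of `inner` lies inside `outer`."""
    cx = inner.x + inner.w / 2
    cy = inner.y + inner.h / 2
    return (outer.x <= cx <= outer.x + outer.w
            and outer.y <= cy <= outer.y + outer.h)


def match_head_body(boxes):
    """Group each head with one body (assumed positional rule, see lead-in)."""
    bodies = [b for b in boxes if b.kind == "body"]
    used = set()
    groups = []
    for head in (b for b in boxes if b.kind == "head"):
        for i, body in enumerate(bodies):
            if i not in used and contains_center(body, head):
                used.add(i)
                groups.append((head, body))
                break
    return groups


def non_rois_from_groups(groups, frame_area, head_threshold=0.05):
    """Per group, mark the body as non-ROI if the head is large, else the head."""
    non_rois = []
    for head, body in groups:
        if head.area / frame_area > head_threshold:
            non_rois.append(body)  # close-up: keep the head as the ROI
        else:
            non_rois.append(head)  # distant person: keep the body as the ROI
    return non_rois
```

For a frame with one close-up person and one distant person, this marks the first person's body and the second person's head as non-ROIs, so each person contributes a single ROI to the encoder.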

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

  • the classifiers operating on different input data, e.g. multi-modal recognition

  • by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition

  • characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding

  • Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

  • characterised by the element, parameter or selection affected or controlled by the adaptive coding

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11659181B2 cover?
It covers a method and apparatus for processing a video. Object regions of preset types are detected in a target video frame, any region that satisfies a preset condition is treated as a non-ROI, the remaining regions are used as ROIs, and the frame is encoded using a quantization parameter change acquired for each ROI.
Who is the assignee on this patent?
Beijing Baidu Netcom Sci & Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification H04N19/124. Mapped technology areas include Electricity.
When was this patent published?
Publication date: May 23, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).