Video image processing method, apparatus, and device, and storage medium

US2024020811A1 · US · A1

Patent metadata
Publication number: US-2024020811-A1
Application number: US-202318476301-A
Country: US
Kind code: A1
Filing date: Sep 27, 2023
Priority date: Mar 31, 2021
Publication date: Jan 18, 2024
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided is a video image processing method, apparatus, and device, and a storage medium. The method includes: obtaining a first target-frame video image, the first target-frame video image being one of to-be-processed multi-frame video images; performing target detection on the first target-frame video image to determine at least one target object in the first target-frame video image; determining at least one first to-be-processed target object of the at least one target object based on a predetermined classification rule of a to-be-processed target object; and replacing, in the first target-frame video image, the at least one first to-be-processed target object with a predetermined target substitute to obtain a second target-frame video image. Data volume of the predetermined target substitute is smaller than data volume of the at least one first to-be-processed target object. Using the method, the data volume of a video is reduced without affecting the actual output effect.
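
The replacement step in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the frame is a hypothetical grayscale grid of random pixel values, the bounding box stands in for the output of target detection and classification, the substitute is assumed to be a flat fill, and zlib-compressed size is used only as a rough proxy for "data volume" (the abstract does not say how data volume is measured). All function and variable names are illustrative.

```python
import random
import zlib


def make_frame(width, height, seed=0):
    """Build a hypothetical grayscale frame filled with noisy pixel values."""
    rng = random.Random(seed)
    return [[rng.randrange(256) for _ in range(width)] for _ in range(height)]


def data_volume(frame):
    """Proxy for data volume: byte length of the zlib-compressed pixels."""
    raw = bytes(pixel for row in frame for pixel in row)
    return len(zlib.compress(raw))


def replace_with_substitute(frame, box, fill=128):
    """Return a copy of the frame with box (x0, y0, x1, y1) set to one flat value.

    The flat fill is an assumed stand-in for the "predetermined target
    substitute"; the input frame is left unmodified.
    """
    x0, y0, x1, y1 = box
    out = [row[:] for row in frame]
    for y in range(y0, y1):
        for x in range(x0, x1):
            out[y][x] = fill
    return out


first_frame = make_frame(64, 64)
# Assumed result of detection + classification: one object to be replaced.
to_be_processed_box = (8, 8, 56, 56)
second_frame = replace_with_substitute(first_frame, to_be_processed_box)

# Compare the proxy data volume before and after replacement.
print(data_volume(first_frame), data_volume(second_frame))
```

In this toy setting the second frame compresses to fewer bytes because the detailed region has been replaced by a uniform one; a real system would measure data volume after video encoding rather than with zlib.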

First claim

Opening claim text (preview).

What is claimed is:

1. A video image processing method, comprising: obtaining a first target-frame video image, the first target-frame video image being one of to-be-processed multi-frame video images; performing target detection on the first target-frame video image to determine at least one target object in the first target-frame video image; determining at least one first to-be-processed target object from the at least one target object based on a predetermined classification rule for a to-be-processed target object; and replacing, in the first target-frame video image, the at least one first to-be-processed target object with a predetermined target substitute to obtain a second target-frame video image, wherein data volume of the predetermined target substitute is smaller than data volume of the at least one first to-be-processed target object.

2. The method according to claim 1, wherein said performing the target detection on the first target-frame video image to determine the at least one target object in the first target-frame video image comprises: inputting the first target-frame video image into a target detection model for target detection, to obtain a first target detection result, the first target detection result comprising the at least one target object in the first target-frame video image.

3. The method according to claim 2, wherein: the first target detection result further comprises type information and first position information of each of the at least one target object; and said determining the at least one first to-be-processed target object from the at least one target object based on the predetermined classification rule for the to-be-processed target object comprises: determining a first influence factor corresponding to each of the at least one target object based on the first position information and the type information of each of the at least one target object; and determining, from the at least one target object, a target object corresponding to a first influence factor that satisfies a first predetermined condition as the at least one first to-be-processed target object.

4. The method according to claim 3, wherein: the first target detection result further comprises first physical attribute information of the at least one first to-be-processed target object; and said replacing, in the first target-frame video image, the at least one first to-be-processed target object with the predetermined target substitute to obtain the second target-frame video image comprises: performing, in the first target-frame video image, semantic segmentation on the at least one first to-be-processed target object based on the first position information of the at least one first to-be-processed target object, to obtain a segmentation region corresponding to the at least one first to-be-processed target object; determining the predetermined target substitute corresponding to the at least one first to-be-processed target object based on the type information and the first physical attribute information of the at least one first to-be-processed target object; replacing, in the corresponding segmentation region, the at least one first to-be-processed target object with the corresponding predetermined target substitute to obtain a replaced first target-frame video image; and smoothing, in the replaced first target-frame video image, an edge contour of the corresponding segmentation region to obtain the second target-frame video image.

5. The method according to claim 3, wherein: when the at least one first to-be-processed target object comprises a plurality of first to-be-processed target objects, the first target detection result further comprises first physical attribute information of the plurality of first to-be-processed target objects; and said replacing, in the corresponding segmentation region, the at least one first to-be-processed target object with the corresponding predetermined target substitute to obtain the second target-frame video image comprises: performing, in the first target-frame video image, instance segmentation on the plurality of first to-be-processed target objects based on the first position information of the plurality of first to-be-processed target objects, to obtain a plurality of segmentation regions corresponding to the plurality of first to-be-processed target objects; determining, based on the type information and the first physical attribute information of the plurality of first to-be-processed target objects, a plurality of predetermined target substitutes corresponding to the plurality of first to-be-processed target objects, respectively; replacing, in the corresponding plurality of segmentation regions, the plurality of first to-be-processed target objects with the corresponding plurality of predetermined target substitutes respectively, to obtain a replaced first target-frame video image; and smoothing, in the replaced first target-frame video image, edge contours of the corresponding plurality of segmentation regions, to obtain the second target-frame video image.

6. The method according to claim 3, wherein when the first predetermined condition comprises a second predetermined condition, the method further comprises, subsequent to said determining, from the at least one target object, the target object corresponding to the first influence factor that satisfies the first predetermined condition as the at least one first to-be-processed target object: determining, from the at least one first to-be-processed target object, a first to-be-processed target object corresponding to a first influence factor that satisfies the second predetermined condition as a second to-be-processed target object, the method further comprises, subsequent to said replacing, in the first target-frame video image, the at least one first to-be-processed target object with the predetermined target substitute to obtain the second target-frame video image: obtaining a next-frame video image of the first target-frame video image; inputting the next-frame video image into the target detection model for the target detection, to obtain a second target detection result, wherein when the second target detection result comprises the second to-be-processed target object, the second target detection result further comprises second position information of the second to-be-processed target object; determining a second influence factor of the second to-be-processed target object based on the type information and the second position information of the second to-be-processed target object; determining whether the second influence factor satisfies the first predetermined condition; and replacing, in response to determining that the second influence factor does not satisfy the first predetermined condition, a predetermined target substitute corresponding to the second to-be-processed target object with the second to-be-processed target object.

7. A video image processing device, comprising: a processor; and a memory having at least one instruction or program stored thereon, wherein the at least one instruction or program is loaded and executed by the processor to implement operations comprising: obtaining a first target-frame video image, the first target-frame video image being one of to-be-processed multi-frame video images; performing target detection on the first target-frame video image to determine at least one target object in the first target-frame video image; determining at least one first to-be-processed target object from the at least one target object based on a predetermined classification rule for a to-be-processed target object; and replacing, in the first target-frame video image, the at least
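
The selection logic of claims 3 and 6 can be sketched as follows. The claim text shown here does not define the influence factor or either predetermined condition, so everything numeric below is an assumption made for illustration: the per-type weights, the formula (type weight divided by one plus the scaled distance to a reference point), the threshold for the first condition (a low factor means the object may be replaced), and the reading of the second condition as a "borderline" band just under that threshold. The names `Detection`, `influence_factor`, `select_first`, `select_second`, and `objects_to_restore` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical per-type weights; the claims give no numeric values.
TYPE_WEIGHT = {"pedestrian": 1.0, "vehicle": 0.8, "tree": 0.1}


@dataclass
class Detection:
    object_id: str    # identity assumed to be tracked across frames
    type_info: str    # "type information" from the detection result
    position: tuple   # (x, y) centre in pixels, the "position information"


def influence_factor(det, reference=(320, 480)):
    """Assumed formula: type weight / (1 + distance to a reference point / 100)."""
    dx = det.position[0] - reference[0]
    dy = det.position[1] - reference[1]
    distance = (dx * dx + dy * dy) ** 0.5
    return TYPE_WEIGHT.get(det.type_info, 0.5) / (1.0 + distance / 100.0)


def satisfies_first_condition(factor, threshold=0.2):
    """Assumed first predetermined condition: influence is low enough to replace."""
    return factor < threshold


def select_first(detections):
    """Claim 3: objects whose first influence factor satisfies the first condition."""
    return [d for d in detections if satisfies_first_condition(influence_factor(d))]


def select_second(first_selected, lower=0.1):
    """Claim 6: assumed second condition, a borderline band within the first."""
    return [d for d in first_selected if influence_factor(d) >= lower]


def objects_to_restore(watched_ids, next_detections):
    """Claim 6: watched objects whose second influence factor fails the first condition."""
    return [d.object_id for d in next_detections
            if d.object_id in watched_ids
            and not satisfies_first_condition(influence_factor(d))]


frame_1 = [Detection("a", "tree", (320, 100)),
           Detection("b", "pedestrian", (330, 470)),
           Detection("c", "vehicle", (20, 100))]
replaced = select_first(frame_1)                    # objects swapped for substitutes
watched = {d.object_id for d in select_second(replaced)}

# In the next frame the vehicle has moved close to the reference point.
frame_2 = [Detection("c", "vehicle", (300, 400))]
print([d.object_id for d in replaced], sorted(watched),
      objects_to_restore(watched, frame_2))
```

Tracking only the borderline objects, rather than every replaced object, keeps the per-frame recheck small; whether the patent intends the second condition this way is not stated in the claim text above.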

Assignees

Geely Holding Group Co Ltd, Geely Automobile Res Institute Ningbo Co Ltd

Inventors

Classifications

  • involving operations for analysing video streams, e.g. detecting features or characteristics (television picture signal circuitry for scene change detection H04N5/147; filtering for image enhancement G06T5/00; methods or arrangements for recognising scenes G06V20/00; arrangements characterised by components specially adapted for monitoring, identification or recognition of video in broadcast systems H04H60/59) · CPC title

  • G06T5/50 · Primary

    using two or more images, e.g. averaging or subtraction · CPC title

  • G06V10/764 · Primary

    using classification, e.g. of video objects · CPC title

  • Determining position or orientation of objects or cameras (camera calibration G06T7/80) · CPC title

  • Edge-based segmentation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024020811A1 cover?
Provided is a video image processing method, apparatus, and device, and a storage medium. The method includes: obtaining a first target-frame video image, the first target-frame video image being one of to-be-processed multi-frame video images; performing target detection on the first target-frame video image to determine at least one target object in the first target-frame video image; determi…
Who is the assignee on this patent?
Geely Holding Group Co Ltd, Geely Automobile Res Institute Ningbo Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06T5/50. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 18, 2024 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).