Image processing method and apparatus, and storage medium

US11450080B2 · US · B2

Patent metadata
Publication number: US-11450080-B2
Application number: US-202017088558-A
Country: US
Kind code: B2
Filing date: Nov 3, 2020
Priority date: Nov 19, 2018
Publication date: Sep 20, 2022
Grant date: Sep 20, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An image processing method and apparatus, and a storage medium are provided. The method includes: detecting a target object in a current video frame of a target video stream, to obtain a current detection region for the target object; adjusting the current detection region according to a historic detection region corresponding to the target object in a historic video frame of the target video stream, to obtain a determined current detection region; performing key point positioning on the target object based on the determined current detection region, to obtain a first set of key points; and performing stabilization on locations of the key points in the first set according to locations of key points in a second set corresponding to the target object in the historic video frame, to obtain current locations of a set of key points of the target object in the current video frame.
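The detection-region adjustment mentioned in the abstract is spelled out in claims 2 and 3 as an intersection-over-union (IoU) comparison against the historic region. The following is a minimal sketch of that rule, not the patented implementation: the box layout `(x_min, y_min, x_max, y_max)`, the function names, and the default threshold of 0.5 are all assumptions made for illustration, since the patent only speaks of a "target threshold".

```python
from typing import Sequence, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pick_current_region(candidates: Sequence[Box], historic: Box) -> Box:
    """Claim 3: choose the candidate with the largest IoU with the historic region."""
    return max(candidates, key=lambda box: iou(box, historic))


def adjust_region(current: Box, historic: Box, threshold: float = 0.5) -> Box:
    """Claim 2: reuse the historic region when the overlap exceeds the threshold.

    The default threshold is a hypothetical value, not taken from the patent.
    """
    return historic if iou(historic, current) > threshold else current
```

Reusing the historic region when the overlap is high keeps the crop passed to the key point stage fixed while the object barely moves, which is presumably why the claim prefers it over the freshly detected box.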

First claim

Opening claim text (preview).

What is claimed is:

1. An image processing method, applied to a terminal device, and comprising: detecting a target object in a current video frame of a target video stream, to obtain a current detection region for the target object; obtaining a historic detection region corresponding to the target object in a historic video frame of the target video stream; adjusting the current detection region according to the historic detection region, to obtain a determined current detection region; performing key point positioning on the target object based on the determined current detection region, to obtain a first set of key points; obtaining a second set of key points corresponding to the target object in the historic video frame of the target video stream; and performing stabilization on locations of the key points in the first set according to locations of the key points in the second set, to obtain current locations of a set of key points of the target object in the current video frame, including: determining locations of first target object key points that are to be stabilized from the first set of key points; determining locations of second target object key points corresponding to a part indicated by the first target object key points from the second set of key points; performing weighting on the determined locations of all the second target object key points and the corresponding locations of the first target object key points, to obtain a weighted sum; determining a target coefficient by using a frame rate of the target video stream; and performing smoothing on the locations of the first target object key points according to the weighted sum and the target coefficient, to obtain stabilized locations of the first target object key points.

2. The method according to claim 1, wherein the adjusting the current detection region according to the historic detection region, to obtain a determined current detection region comprises: determining an intersection over union between the historic detection region and the current detection region; using the historic detection region as the determined current detection region when the intersection over union is greater than a target threshold; and using the current detection region as the determined current detection region when the intersection over union is less than or equal to the target threshold.

3. The method according to claim 1, wherein the detecting a target object in a current video frame of a target video stream, to obtain a current detection region for the target object comprises: detecting the current video frame, to obtain a plurality of first candidate detection regions; and determining a first candidate detection region having a maximum intersection over union with the historic detection region from the plurality of first candidate detection regions as the current detection region.

4. The method according to claim 1, wherein before the detecting a target object in a current video frame of a target video stream, to obtain a current detection region for the target object, the method further comprises: detecting a first video frame of the target video stream, to obtain a plurality of second candidate detection regions; using a second candidate detection region having a maximum confidence level in the plurality of second candidate detection regions as a detection region corresponding to the first video frame, and then using the detection region corresponding to the first video frame as a historic detection region of another video frame in the target video stream.

5. The method according to claim 1, wherein the performing key point positioning on the target object based on the determined current detection region, to obtain a first set of key points comprises: performing, when the target object in the current video frame is partially located in the determined current detection region, expanding on the determined current detection region by centering around a center of the determined current detection region, to obtain a target detection region; and obtaining the first set of key points according to a target image comprising the target object in the target detection region.

6. The method according to claim 5, wherein the obtaining the first set of key points according to a target image comprising the target object in the target detection region comprises: processing the target image to obtain a plurality of groups of confidence levels of the first set of key points, each group of confidence levels being used for predicting a location of one object key point in the first set of key points; constructing a target matrix by using the each group of confidence levels; determining first target coordinates according to a row and a column of a maximum confidence level in the each group of confidence levels in the corresponding target matrix; and determining the location of the one object key point in the first set of key points according to the first target coordinates.

7. The method according to claim 6, wherein the determining the location of the one object key point in the first set of key points according to the first target coordinates comprises: determining second target coordinates according to a row and a column of a second maximum confidence level in the each group of confidence levels in the target matrix; offsetting the first target coordinates toward the second target coordinates by a target distance; and determining, according to first target coordinates that are offset by the target distance, a location of the one object key point corresponding to the target matrix on the target object.

8. The method according to claim 1, further comprising: recognizing a part of the target object from the current video frame according to the current locations of the set of key points of the target object; performing adjustment on the recognized part of the target object; and displaying an image of the target object after the adjustment.

9. An image processing apparatus, comprising: a memory and a processor, the memory being configured to store a computer program; and the processor being configured to run the computer program, to perform the following actions: detecting a target object in a current video frame of a target video stream, to obtain a current detection region for the target object; obtaining a historic detection region corresponding to the target object in a historic video frame of the target video stream; adjusting the current detection region according to the historic detection region, to obtain a determined current detection region; performing key point positioning on the target object based on the determined current detection region, to obtain a first set of key points; obtaining a second set of key points corresponding to the target object in the historic video frame of the target video stream; and performing stabilization on locations of the key points in the first set according to locations of the key points in the second set, to obtain current locations of a set of key points of the target object in the current video frame, including: determining locations of first target object key points that are to be stabilized from the first set of key points; determining locations of second target object key points corresponding to a part indicated by the first target object key points from the second set of key points; performing weighting on the determined locations of all the second target object key points and the corresponding locations of the first target object key points, to obtain a weighted sum; determining a target coefficient by using a frame rate of the target video stream; and performing smoothing on the locations
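The stabilization step of claim 1 (weighting current and historic key point locations, then smoothing with a coefficient derived from the frame rate) might be sketched as below. The claim text shown here does not give the weights or the formula for the target coefficient, so everything concrete in this sketch is an assumption: the exponential decay by frame age, the hypothetical `time_constant_s` parameter, and the normalization by the weight total are illustrative choices only.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # assumed (x, y) pixel location of one key point


def target_coefficient(frame_rate: float, time_constant_s: float = 0.1) -> float:
    """Assumed mapping from frame rate to a smoothing coefficient.

    Scaling by frame rate keeps the smoothing window roughly constant in
    seconds; time_constant_s is a hypothetical tuning value.
    """
    return max(frame_rate * time_constant_s, 1e-6)


def stabilize_point(current: Point, history: Sequence[Point], frame_rate: float) -> Point:
    """Smooth one key point using its locations in earlier frames.

    history[0] is the most recent historic frame. Weights decay with frame
    age as exp(-age / coefficient), which is an assumed weighting scheme.
    """
    coeff = target_coefficient(frame_rate)
    points = [current] + list(history)
    weights = [math.exp(-age / coeff) for age in range(len(points))]
    # Weighted sum of the current location and all historic locations.
    sum_x = sum(w * p[0] for w, p in zip(weights, points))
    sum_y = sum(w * p[1] for w, p in zip(weights, points))
    total = sum(weights)
    return (sum_x / total, sum_y / total)
```

Under these assumptions a higher frame rate gives older frames more weight, because the same wall-clock interval spans more frames. With no history the function returns the current location unchanged.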

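Claims 6 and 7 describe decoding one key point from a group of confidence levels: arrange them in a matrix, take the cell with the maximum confidence, and shift it toward the cell with the second-highest confidence by a target distance. A minimal sketch of that decoding follows. The row-major layout, the `(x, y) = (column, row)` convention, and the default offset of 0.25 cells are assumptions; the patent only refers to a "target distance".

```python
import math
from typing import Sequence, Tuple


def decode_keypoint(
    confidences: Sequence[float], rows: int, cols: int, offset: float = 0.25
) -> Tuple[float, float]:
    """Return an (x, y) location in matrix-cell units for one key point.

    confidences is assumed to be a flat, row-major group of rows * cols values.
    offset is a hypothetical target distance, measured in cells.
    """
    if len(confidences) != rows * cols:
        raise ValueError("confidences must contain rows * cols values")
    # Build the target matrix from the flat group of confidence levels.
    matrix = [list(confidences[r * cols:(r + 1) * cols]) for r in range(rows)]
    # Rank all cells from highest to lowest confidence.
    ranked = sorted(
        ((matrix[r][c], r, c) for r in range(rows) for c in range(cols)),
        key=lambda cell: cell[0],
        reverse=True,
    )
    _, r1, c1 = ranked[0]
    if len(ranked) < 2:
        return (float(c1), float(r1))
    _, r2, c2 = ranked[1]
    # Shift the peak toward the runner-up by a fixed distance.
    dx, dy = c2 - c1, r2 - r1
    norm = math.hypot(dx, dy)
    return (c1 + offset * dx / norm, r1 + offset * dy / norm)
```

The shift toward the second peak gives a sub-cell estimate, which helps when the matrix is coarser than the image. Mapping the result back to image pixels would additionally need the detection region's origin and scale, which this sketch leaves out.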
Assignees

Inventors

Classifications

  • G06V40/23 · Primary

    Recognition of whole body movements, e.g. for sport training · CPC title

  • relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking · CPC title

  • using neural networks · CPC title

  • using classification, e.g. of video objects · CPC title

  • G06V10/22 · Primary

    by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11450080B2 cover?
An image processing method and apparatus, and a storage medium are provided. The method includes: detecting a target object in a current video frame of a target video stream, to obtain a current detection region for the target object; adjusting the current detection region according to a historic detection region corresponding to the target object in a historic video frame of the target video s…
Who is the assignee on this patent?
Tencent Tech Shenzhen Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06V40/23. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 20, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).