Method and apparatus for processing video image and computer readable medium

US10776970B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10776970-B2
Application number: US-201916709551-A
Country: US
Kind code: B2
Filing date: Dec 10, 2019
Priority date: Aug 19, 2016
Publication date: Sep 15, 2020
Grant date: Sep 15, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present application provide a method and an apparatus for processing a video image. The method includes: obtaining a video image to be processed and a business object to be displayed, wherein the video image comprises a background area and a foreground area comprising a target object non-overlapping with the background area; determining the background area of the video image; performing an action detection on the target object in the foreground area to obtain action detection data; determining a display position of the business object in the video image according to the action detection data; and drawing, according to the display position, the business object in the background area of the video image by means of computer graphics.
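The final step of the abstract, drawing the business object only in the background area, can be illustrated with a small sketch. This is not the patented implementation: the function name `draw_in_background`, the boolean foreground mask, and the nested-list frame representation are all assumptions made for illustration, and the patent does not say how the background area is determined or how drawing is performed.

```python
def draw_in_background(frame, foreground_mask, overlay, top, left):
    """Return a copy of `frame` with `overlay` drawn at (top, left).

    Hypothetical helper: pixels marked True in `foreground_mask` belong to
    the target object and are never overwritten, so the overlay only
    appears in the background area.
    """
    out = [row[:] for row in frame]  # copy so the input frame is untouched
    height, width = len(out), len(out[0])
    for dy, overlay_row in enumerate(overlay):
        for dx, pixel in enumerate(overlay_row):
            y, x = top + dy, left + dx
            inside = 0 <= y < height and 0 <= x < width
            if inside and not foreground_mask[y][x]:
                out[y][x] = pixel
    return out


# Toy 3x4 single-channel frame; the two centre pixels of row 1 are foreground.
frame = [[0] * 4 for _ in range(3)]
mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, False, False, False],
]
overlay = [[9, 9], [9, 9]]  # stand-in for the business object

result = draw_in_background(frame, mask, overlay, top=0, left=1)
```

In this toy case the overlay covers rows 0 and 1, but only the row 0 pixels are written because the row 1 pixels under it are marked as foreground.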

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing a video image, comprising: obtaining a video image to be processed and a business object to be displayed, wherein the video image comprises a background area and a foreground area comprising a target object non-overlapping with the background area; determining the background area of the video image; performing an action detection on the target object in the foreground area to obtain action detection data; determining a display position of the business object in the video image according to the action detection data; and drawing, according to the display position, the business object in the background area of the video image by means of computer graphics, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image according to the action detection data and preset action data, wherein the action detection data comprises data of an action performed by the target object, and the preset action data comprises data of a preset action of the target object.

2. The method according to claim 1, wherein the action comprises at least one of an action of head, an action of hand, or an action of body.

3. The method according to claim 1, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image using a pre-trained convolutional neural network model and the action detection data.

4. The method according to claim 1, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image according to a type of the business object and the action detection data.

5. The method according to claim 4, wherein the determining the display position of the business object in the video image according to a type of the business object and the action detection data comprises: obtaining a plurality of display positions of the business object in the video image according to the action detection data and the type of the business object; and selecting at least one display position from the plurality of display positions as the final display position of the business object in the video image.

6. The method according to claim 1, wherein the determining the display position of the business object in the video image according to preset action data and the action detection data comprises: determining whether the action detection data matches the preset action data; and in response to determining that the action detection data matches the preset action data, obtaining a target display position corresponding to the preset action data as the display position of the business object in the video image.

7. The method according to claim 6, wherein the determining whether the action detection data matches the preset action data comprises: determining a plurality of matching degrees between the action detection data and a plurality of pieces of preset action data; obtaining the maximum matching degree of the plurality of matching degrees; and in response to the maximum matching degree being greater than a preset matching threshold, determining that a piece of preset data having the maximum matching degree matches the action detection data.

8. An apparatus for processing a video image, the apparatus comprising: a processor; and a memory storing instructions to cause the processor to perform operations, the operations comprising: obtaining a video image to be processed and a business object to be displayed, wherein the video image comprises a background area and a foreground area comprising a target object non-overlapping with the background area; determining the background area of the video image; performing an action detection on the target object in the foreground area to obtain action detection data; determining a display position of the business object in the video image according to the action detection data; and drawing, according to the display position; the business object in the background area of the video image by means of computer graphics, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image according to the action detection data and preset action data, wherein the action detection data comprises data of an action performed by the target object, and the preset action data comprises data of a preset action of the target object.

9. The apparatus according to claim 8, wherein the action comprises at least one of an action of head, an action of hand, or an action of body.

10. The apparatus according to claim 8, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image using a pre-trained convolutional neural network model and the action detection data.

11. The apparatus according to claim 8, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image according to a type of the business object and the action detection data.

12. The apparatus according to claim 11, wherein the determining the display position of the business object in the video image according to a type of the business object and the action detection data comprises: obtaining a plurality of display positions of the business object in the video image according to the action detection data and the type of the business object; and selecting at least one display position from the plurality of display positions as the final display position of the business object in the video image.

13. The apparatus according to claim 8, wherein the determining the display position of the business object in the video image according preset action data and the action detection data comprises: determining whether the action detection data matches the preset action data; and in response to determining that the action detection data matches the preset action data, obtaining a target display position corresponding to the preset action data as the display position of the business object in the video image.

14. The apparatus according to claim 13, wherein the determining whether the action detection data matches the preset action data comprises: determining a plurality of matching degrees between the action detection data and a plurality of pieces of preset action data; obtaining the maximum matching degree of the plurality of matching degrees; and in response to the maximum matching degree being greater than a preset matching threshold, determining that a piece of preset data having the maximum matching degree matches the action detection data.

15. A non-transitory computer readable medium, storing a computer program thereon, the program, when executed by a processor, causes the processor to perform operations, the operations comprising: obtaining a video image to be processed and a business object to be displayed, wherein the video image comprises a background area and a foreground area comprising a target object non-overlapping with t
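The matching logic of claims 6 and 7 (score the detected action against several preset actions, take the maximum matching degree, and accept it only above a preset threshold) can be sketched as follows. The claims do not define how a matching degree is computed, so cosine similarity over feature vectors is an assumption here, as are the names `cosine_similarity` and `select_display_position`, the preset entries, and the threshold value of 0.8.

```python
import math


def cosine_similarity(a, b):
    """Assumed matching degree: cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_display_position(detected, presets, threshold):
    """Return the position of the best-matching preset, or None.

    Mirrors the claim 7 steps: compute one matching degree per preset,
    take the maximum, and accept it only if it exceeds the threshold.
    """
    if not presets:
        return None
    degrees = [cosine_similarity(detected, p["features"]) for p in presets]
    best_index = max(range(len(presets)), key=lambda i: degrees[i])
    if degrees[best_index] > threshold:
        return presets[best_index]["position"]
    return None


# Hypothetical presets: each pairs action features with a target position.
presets = [
    {"name": "wave", "features": [1.0, 0.0], "position": (10, 20)},
    {"name": "nod", "features": [0.0, 1.0], "position": (200, 40)},
]

matched = select_display_position([0.9, 0.1], presets, threshold=0.8)
unmatched = select_display_position([1.0, 1.0], presets, threshold=0.8)
```

Here `matched` is the position tied to the "wave" preset, while `unmatched` is `None` because the ambiguous input scores about 0.71 against both presets, below the threshold. Returning `None` for the no-match case is a design choice of this sketch; the claims only describe what happens when a match is found.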

Assignees

Beijing Sensetime Tech Development Co Ltd

Inventors

Classifications

  • Activation functions · CPC title

  • Combinations of networks · CPC title

  • Learning methods · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10776970B2 cover?
Embodiments of the present application provide a method and an apparatus for processing a video image. The method includes: obtaining a video image to be processed and a business object to be displayed, wherein the video image comprises a background area and a foreground area comprising a target object non-overlapping with the background area; determining the background area of the video image;…
Who is the assignee on this patent?
Beijing Sensetime Tech Development Co Ltd
What technology area does this patent fall under?
Primary CPC classification H04N21/23418. Mapped technology areas include Electricity.
When was this patent published?
Publication date Sep 15, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).