Who is the assignee on this patent?

Beijing Sensetime Tech Development Co Ltd

What technology area does this patent fall under?

Primary CPC classification H04N21/23418. Mapped technology areas include Electricity.

When was this patent published?

Publication date Sep 15, 2020 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and apparatus for processing video image and computer readable medium

US10776970B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10776970-B2
Application number	US-201916709551-A
Country	US
Kind code	B2
Filing date	Dec 10, 2019
Priority date	Aug 19, 2016
Publication date	Sep 15, 2020
Grant date	Sep 15, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present application provide a method and an apparatus for processing a video image. The method includes: obtaining a video image to be processed and a business object to be displayed, wherein the video image comprises a background area and a foreground area comprising a target object non-overlapping with the background area; determining the background area of the video image; performing an action detection on the target object in the foreground area to obtain action detection data; determining a display position of the business object in the video image according to the action detection data; and drawing, according to the display position, the business object in the background area of the video image by means of computer graphics.
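The final step of the abstract, drawing the business object only in the background area, can be illustrated with a small sketch. This is not the patent's implementation: the list-of-lists image representation, the boolean foreground mask, and the function name `draw_in_background` are all assumptions made for illustration, and the background determination and action detection steps are taken as already done.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, row-major (assumed representation)
Mask = List[List[bool]]  # True where a pixel belongs to the foreground


def draw_in_background(frame: Image, foreground: Mask, overlay: Image,
                       top: int, left: int) -> Image:
    """Paste `overlay` at (top, left), skipping every foreground pixel."""
    out = [row[:] for row in frame]  # copy so the input frame is untouched
    for dy, overlay_row in enumerate(overlay):
        for dx, value in enumerate(overlay_row):
            y, x = top + dy, left + dx
            inside = 0 <= y < len(out) and 0 <= x < len(out[0])
            # Only background pixels inside the frame receive the overlay.
            if inside and not foreground[y][x]:
                out[y][x] = value
    return out


# Hypothetical 3x4 frame whose target object occupies the lower middle.
frame = [[0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
foreground = [[False, False, False, False],
              [False, True, True, False],
              [False, True, True, False]]
overlay = [[9, 9],
           [9, 9]]

result = draw_in_background(frame, foreground, overlay, top=0, left=1)
```

In this sketch the overlay partly overlaps the target object, so only its top row is drawn; the foreground pixels keep their original values.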

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing a video image, comprising: obtaining a video image to be processed and a business object to be displayed, wherein the video image comprises a background area and a foreground area comprising a target object non-overlapping with the background area; determining the background area of the video image; performing an action detection on the target object in the foreground area to obtain action detection data; determining a display position of the business object in the video image according to the action detection data; and drawing, according to the display position, the business object in the background area of the video image by means of computer graphics, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image according to the action detection data and preset action data, wherein the action detection data comprises data of an action performed by the target object, and the preset action data comprises data of a preset action of the target object.

2. The method according to claim 1, wherein the action comprises at least one of an action of head, an action of hand, or an action of body.

3. The method according to claim 1, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image using a pre-trained convolutional neural network model and the action detection data.

4. The method according to claim 1, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image according to a type of the business object and the action detection data.

5. The method according to claim 4, wherein the determining the display position of the business object in the video image according to a type of the business object and the action detection data comprises: obtaining a plurality of display positions of the business object in the video image according to the action detection data and the type of the business object; and selecting at least one display position from the plurality of display positions as the final display position of the business object in the video image.

6. The method according to claim 1, wherein the determining the display position of the business object in the video image according to preset action data and the action detection data comprises: determining whether the action detection data matches the preset action data; and in response to determining that the action detection data matches the preset action data, obtaining a target display position corresponding to the preset action data as the display position of the business object in the video image.

7. The method according to claim 6, wherein the determining whether the action detection data matches the preset action data comprises: determining a plurality of matching degrees between the action detection data and a plurality of pieces of preset action data; obtaining the maximum matching degree of the plurality of matching degrees; and in response to the maximum matching degree being greater than a preset matching threshold, determining that a piece of preset data having the maximum matching degree matches the action detection data.

8. An apparatus for processing a video image, the apparatus comprising: a processor; and a memory storing instructions to cause the processor to perform operations, the operations comprising: obtaining a video image to be processed and a business object to be displayed, wherein the video image comprises a background area and a foreground area comprising a target object non-overlapping with the background area; determining the background area of the video image; performing an action detection on the target object in the foreground area to obtain action detection data; determining a display position of the business object in the video image according to the action detection data; and drawing, according to the display position; the business object in the background area of the video image by means of computer graphics, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image according to the action detection data and preset action data, wherein the action detection data comprises data of an action performed by the target object, and the preset action data comprises data of a preset action of the target object.

9. The apparatus according to claim 8, wherein the action comprises at least one of an action of head, an action of hand, or an action of body.

10. The apparatus according to claim 8, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image using a pre-trained convolutional neural network model and the action detection data.

11. The apparatus according to claim 8, wherein the determining a display position of the business object in the video image according to the action detection data comprises: determining the display position of the business object in the video image according to a type of the business object and the action detection data.

12. The apparatus according to claim 11, wherein the determining the display position of the business object in the video image according to a type of the business object and the action detection data comprises: obtaining a plurality of display positions of the business object in the video image according to the action detection data and the type of the business object; and selecting at least one display position from the plurality of display positions as the final display position of the business object in the video image.

13. The apparatus according to claim 8, wherein the determining the display position of the business object in the video image according preset action data and the action detection data comprises: determining whether the action detection data matches the preset action data; and in response to determining that the action detection data matches the preset action data, obtaining a target display position corresponding to the preset action data as the display position of the business object in the video image.

14. The apparatus according to claim 13, wherein the determining whether the action detection data matches the preset action data comprises: determining a plurality of matching degrees between the action detection data and a plurality of pieces of preset action data; obtaining the maximum matching degree of the plurality of matching degrees; and in response to the maximum matching degree being greater than a preset matching threshold, determining that a piece of preset data having the maximum matching degree matches the action detection data.

15. A non-transitory computer readable medium, storing a computer program thereon, the program, when executed by a processor, causes the processor to perform operations, the operations comprising: obtaining a video image to be processed and a business object to be displayed, wherein the video image comprises a background area and a foreground area comprising a target object non-overlapping with t
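The matching logic recited in claims 6 and 7 (compute a matching degree against each piece of preset action data, take the maximum, accept it only above a preset threshold, then return the display position tied to that preset) can be sketched as follows. The claims do not say how a matching degree is computed; cosine similarity over feature vectors is an assumption here, as are the names `matching_degree`, `select_display_position`, `PRESETS`, and the example vectors and coordinates.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Position = Tuple[int, int]
Preset = Tuple[Sequence[float], Position]  # (preset action data, target position)


def matching_degree(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, used here as an assumed matching degree."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_display_position(detected: Sequence[float],
                            presets: Dict[str, Preset],
                            threshold: float) -> Optional[Position]:
    """Return the position of the best-matching preset, or None if no match."""
    if not presets:
        return None
    # One matching degree per piece of preset action data.
    degrees = {name: matching_degree(detected, features)
               for name, (features, _) in presets.items()}
    best = max(degrees, key=degrees.get)
    # The best preset counts as a match only if it exceeds the threshold.
    if degrees[best] > threshold:
        return presets[best][1]
    return None


# Hypothetical presets: two actions, each tied to a display position.
PRESETS = {
    "wave": ([1.0, 0.0], (10, 20)),
    "nod": ([0.0, 1.0], (30, 40)),
}

position = select_display_position([0.9, 0.1], PRESETS, threshold=0.8)
no_match = select_display_position([1.0, 1.0], PRESETS, threshold=0.8)
```

Here the first detection is closest to the hypothetical "wave" preset and clears the threshold, so its position is returned; the second detection is equally far from both presets and falls below the threshold, so no position is selected.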

Assignees

Beijing Sensetime Tech Development Co Ltd

Inventors

Classifications

G06N3/048
Activation functions · CPC title
G06N3/045
Combinations of networks · CPC title
G06N3/08
Learning methods · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/09
Supervised learning · CPC title

Patent family

Related publications grouped by family.

View patent family 61197301

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10776970B2 cover?: Embodiments of the present application provide a method and an apparatus for processing a video image. The method includes: obtaining a video image to be processed and a business object to be displayed, wherein the video image comprises a background area and a foreground area comprising a target object non-overlapping with the background area; determining the background area of the video image;…


Related patents

Iterative recognition-guided thresholding and data extraction

System and method for appearance search

System and method for training object classifier by machine learning

Systems and Methods for Associating an Image with a Business Venue by using Visually-Relevant and Business-Aware Semantics

Iterative recognition-guided thresholding and data extraction

Dynamic product placement in media content
