Who is the assignee on this patent?

Beijing Sensetime Tech Development Co Ltd

What technology area does this patent fall under?

Primary CPC classification H04N21/23418. Mapped technology areas include Electricity.

When was this patent published?

Publication date Mar 3, 2020 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and apparatus for processing video image and electronic device

US10580179B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10580179-B2
Application number	US-201715845802-A
Country	US
Kind code	B2
Filing date	Dec 18, 2017
Priority date	Aug 19, 2016
Publication date	Mar 3, 2020
Grant date	Mar 3, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present application provide a method and an apparatus for processing a video image and an electronic device, relating to the field of artificial intelligent technologies, wherein the method includes: obtaining a video image to be processed and a business object to be displayed; determining a background area of the video image; and drawing the business object in the background area of the video image by means of computer graphics. The embodiments of the present application may realize display of the business object in the background area of the video image, so that the business object may be prevented from blocking a foreground area favorably, the normal video viewing experience of an audience is not influenced, the dislike of the audience is not easy to be aroused, and an expected display effect of the business object may be achieved favorably.
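The drawing step described in the abstract can be illustrated with a minimal sketch. It assumes the background area is already available as a boolean mask (the patent obtains it from a neural network, which is not modelled here) and represents images as plain nested lists. The function name `draw_in_background` and all sample data are hypothetical and do not come from the patent.

```python
from typing import List

Image = List[List[int]]   # one integer per pixel, for illustration only
Mask = List[List[bool]]   # True marks a background pixel (assumed given)


def draw_in_background(frame: Image, background: Mask, obj: Image,
                       top: int, left: int) -> Image:
    """Return a copy of `frame` with `obj` drawn only on background pixels."""
    out = [row[:] for row in frame]  # leave the input frame untouched
    for dy, obj_row in enumerate(obj):
        for dx, value in enumerate(obj_row):
            y, x = top + dy, left + dx
            # Skip object pixels that fall outside the frame.
            if not (0 <= y < len(out) and 0 <= x < len(out[y])):
                continue
            # Draw only where the mask marks background, so the
            # foreground is never covered by the business object.
            if background[y][x]:
                out[y][x] = value
    return out


# Hypothetical 3x4 frame: the block of 5s stands in for the foreground.
frame = [[0, 0, 0, 0],
         [0, 5, 5, 0],
         [0, 5, 5, 0]]
background = [[v == 0 for v in row] for row in frame]
sticker = [[9, 9],
           [9, 9]]
result = draw_in_background(frame, background, sticker, top=0, left=0)
```

In this example the sticker pixel that lands on the foreground block is skipped, while the other three are drawn over background pixels.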

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing a video image, comprising: obtaining a video image to be processed and a business object to be displayed, the video image comprising a background area and a foreground area comprising a target object non-overlapping with the background area; determining the background area of the video image; determining a display position of the business object in the video image; determining whether the business object overlaps with the target object in the foreground area based on the display position; and in response to determining that the business object overlaps with the target object, drawing a portion of the business object overlapping with the background area of the video image by means of computer graphics, wherein the entire target object is presented without presenting each portion of the business object overlapping with the target object.

2. The method according to claim 1, wherein the video image comprises a plurality of display layers, and the drawing the portion of the business object overlapping with the background area of the video image by means of computer graphics comprises: placing a display layer of the business object below a display layer of the foreground area.

3. The method according to claim 1, wherein the drawing the portion of the business object overlapping with the background area of the video image by means of computer graphics comprises: drawing the portion of the business object overlapping with the background area by means of computer graphics to enable the business object to cover raw content of the background area.

4. The method according to claim 1, wherein the determining the background area of the video image comprises: determining the background area of the video image by a pre-trained first convolutional neural network model.

5. The method according to claim 4, wherein the determining the background area of the video image by a pre-trained first convolutional neural network model comprises: obtaining a first image feature vector of the video image using the first convolutional neural network model; performing convolution processing on the first image feature vector using the first convolutional neural network model to obtain a convolution result of the first image feature vector; performing amplification processing on the convolution result of the first image feature vector; and determining the background area based on the amplified convolution result of the first image feature vector.

6. The method according to claim 4, wherein pre-training of the first convolutional neural network model comprises: obtaining a first feature vector of a first sample image using the first convolutional neural network model, wherein the first sample image is a sample image containing foreground tag information and background tag information; performing convolution processing on the first feature vector using the first convolutional neural network model to obtain a convolution result of the first feature vector; performing amplification processing on the convolution result of the first feature vector; judging whether the amplified convolution result of the first feature vector meets a convolution convergence condition; if the amplified convolution result of the first feature vector meets the convolution convergence condition, completing training of the first convolutional neural network model; and if the amplified convolution result of the first feature vector does not meet the convolution convergence condition, adjusting a network parameter of the first convolutional neural network model based on the amplified convolution result of the first feature vector, and performing iterative training on the first convolutional neural network model according to the adjusted network parameter of the first convolutional neural network model, until the convolution result of the first feature vector after the iterative training meets the convolution convergence condition.

7. The method according to claim 1, wherein the determining the display position of the business object in the video image comprises: performing an action detection on the target object in the foreground area to obtain action detection data; and determining the display position of the business object in the video image according to the action detection data of the target object in the foreground area.

8. The method according to claim 7, wherein the determining the display position of the business object in the video image according to the action detection data of the target object in the foreground area comprises: determining the display position of the business object in the video image by a pre-trained second convolutional neural network model according to the action detection data of the target object in the foreground area.

9. The method according to claim 8, wherein pre-training of the second convolutional neural network model comprises: obtaining a second feature vector of a second sample image using the second convolutional neural network model, wherein the second feature vector contains position information and/or confidence information of the business object in the second sample image, as well as a target object feature vector of a target object in the second sample image; performing convolution processing on the second feature vector using the second convolutional neural network model to obtain a convolution result of the second feature vector; judging whether position information and/or confidence information of a corresponding business object in the convolution result of the second feature vector meet a business object convergence condition, and judging whether the target object feature vector in the convolution result of the second feature vector meets a target object convergence condition; if the both are met, completing training of the second convolutional neural network model; and otherwise, adjusting a network parameter of the second convolutional neural network model and performing iterative training on the second convolutional neural network model according to the adjusted the network parameter of the second convolutional neural network model, until the position information and/or the confidence information of the business object after the iterative training as well as the target object feature vector after the iterative training meet a corresponding convergence condition.

10. The method according to claim 7, wherein the determining the display position of the business object in the video image according to the action detection data of the target object in the foreground area comprises: determining the display position of the business object in the video image according to action detection data of the target object in the foreground area and according to a type of the business object.

11. The method according to claim 7, wherein the determining the display position of the business object in the video image according to the action detection data of the target object in the foreground area comprises: judging whether the action detection data of the target object in the foreground area matches preset action data; and if the action detection data of the target object in the foreground area matches the preset action data, obtaining, from a corresponding relationship between pre-stored action data and display position, a target display position corresponding to the preset action data as the display position of the business object in the video image.

12. An apparatus for processing a video image, comprising: a processor; and instructions to cause the processor to perform operations, the operations comprising:
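As a rough illustration of how the position lookup of claim 11 and the overlap check of claim 1 might fit together, the sketch below uses a plain dictionary for the pre-stored correspondence between action data and display position, and sets of pixel coordinates for the target object. The action labels, positions, function names, and the simplification that every non-target pixel counts as background are all assumptions made for this example; the patent does not specify them.

```python
from typing import Dict, Optional, Set, Tuple

Position = Tuple[int, int]   # (top, left) of the business object
Cell = Tuple[int, int]       # (row, col) pixel coordinate

# Hypothetical pre-stored correspondence between action data and position.
ACTION_TO_POSITION: Dict[str, Position] = {
    "wave": (0, 2),
    "nod": (2, 0),
}


def lookup_display_position(detected_action: str) -> Optional[Position]:
    """Return the stored position if the action matches a preset action."""
    return ACTION_TO_POSITION.get(detected_action)


def object_cells(position: Position, height: int, width: int) -> Set[Cell]:
    """All pixel coordinates covered by the object at `position`."""
    top, left = position
    return {(top + dy, left + dx)
            for dy in range(height) for dx in range(width)}


def split_by_overlap(position: Position, height: int, width: int,
                     target: Set[Cell]) -> Tuple[Set[Cell], Set[Cell]]:
    """Split the object's pixels into (drawn, hidden).

    Pixels that overlap the target object are hidden so the whole target
    stays visible; the remaining pixels are treated as background here.
    """
    cells = object_cells(position, height, width)
    hidden = cells & target
    return cells - hidden, hidden


# Hypothetical target object occupying two pixels in column 3.
target = {(0, 3), (1, 3)}
position = lookup_display_position("wave")
drawn, hidden = split_by_overlap(position, 2, 2, target)
```

A non-empty `hidden` set corresponds to the case where the business object overlaps the target object, in which only the `drawn` pixels would be rendered.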

Assignees

Beijing Sensetime Tech Development Co Ltd

Inventors

Classifications

CPC code	CPC title
H04N21/4666	using neural networks, e.g. processing the feedback provided by the user
H04N21/44008	involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream (arrangements characterised by components specially adapted for monitoring, identification or recognition of video in broadcast systems H04H60/59)
H04N21/44016	involving splicing one content stream with another content stream, e.g. for substituting a video clip
H04N21/812	involving advertisement data (advertising per se G06Q30/02)
G11B27/036	Insert-editing

Patent family

Related publications grouped by family.

View patent family 61197301

External sources

Related patents

Iterative recognition-guided thresholding and data extraction

System and method for training object classifier by machine learning

Systems and Methods for Associating an Image with a Business Venue by using Visually-Relevant and Business-Aware Semantics

Iterative recognition-guided thresholding and data extraction

Dynamic product placement in media content
