Video data processing method and apparatus, device, and medium

US12094209B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12094209-B2
Application number	US-202217951621-A
Country	US
Kind code	B2
Filing date	Sep 23, 2022
Priority date	Dec 2, 2020
Publication date	Sep 17, 2024
Grant date	Sep 17, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the disclosure provide a data processing method and apparatus, a device, and a medium. The method includes: performing video analysis on video data of a target video to obtain a plurality of video segments; determining a video template associated with a target user from a video template database based on a user portrait of the target user, and obtaining at least one predetermined template segment and a template tag sequence in the video template; screening at least one video segment matching the template attribute tag of the at least one template segment; splicing the at least one matched video segment according to a position of a template attribute tag of each template segment in the template tag sequence as a video material segment of the target video; and pushing the video data and the video material segment to an application client corresponding to the target user.
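The screening and splicing steps in the abstract can be illustrated with a minimal sketch. The `Segment` type, the tag strings, and the "first unused match" policy are assumptions made for illustration; the abstract only states that matched segments are spliced according to the position of each template attribute tag in the template tag sequence.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    """A video segment with one attribute tag (hypothetical structure)."""
    tag: str      # segment attribute tag, e.g. "close-up"
    start: float  # start time in seconds
    end: float    # end time in seconds


def splice_by_template(segments: Sequence[Segment],
                       template_tags: Sequence[str]) -> List[Segment]:
    """Order matched segments by position in the template tag sequence.

    Assumed policy: each template position takes the first segment with
    the same tag that has not been used yet; positions without a match
    are skipped.
    """
    used = set()
    spliced: List[Segment] = []
    for tag in template_tags:
        for i, seg in enumerate(segments):
            if i not in used and seg.tag == tag:
                used.add(i)
                spliced.append(seg)
                break
    return spliced


if __name__ == "__main__":
    segments = [
        Segment("scene", 0, 4),
        Segment("close-up", 4, 9),
        Segment("character:A", 9, 15),
        Segment("close-up", 15, 20),
    ]
    template = ["close-up", "character:A", "close-up", "long-shot"]
    for seg in splice_by_template(segments, template):
        print(seg.tag, seg.start, seg.end)
```

In this sketch the output order follows the template, not the source video, which is the point of the template tag sequence. How ties between several matching segments are resolved is not stated in the abstract, so the policy here is only one plausible choice.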

First claim

Opening claim text (preview).

What is claimed is:

1. A video data processing method, performed by a computer device, the method comprising:
obtaining video data of a target video requested by a target user, and performing video analysis on the video data to obtain a plurality of video segments, the video analysis comprising storyboard processing and attribute analysis based on a plurality of preset segment attribute tags, and each video segment in the plurality of video segments corresponding to one segment attribute tag and one storyboard segment;
determining a video template associated with the target user from a video template database based on a user portrait of the target user, and obtaining at least one template segment and a template tag sequence in the video template, the template tag sequence being based on a template attribute tag of the at least one template segment;
screening at least one video segment matching the template attribute tag of the at least one template segment from the plurality of video segments based on the template attribute tag of the at least one template segment and segment attribute tags corresponding to the plurality of video segments;
splicing the at least one matched video segment according to a position of a template attribute tag, of each template segment in the at least one template segment, in the template tag sequence, as a video material segment of the target video; and
pushing the video data and the video material segment to an application client corresponding to the target user, to be output,
wherein the performing the video analysis comprises:
performing the storyboard processing on a video sequence corresponding to the video data through a video partitioning component, to obtain a plurality of storyboard segments associated with the video sequence;
inputting the plurality of storyboard segments into a network recognition model, and performing the attribute analysis on the plurality of storyboard segments through the network recognition model based on the plurality of preset segment attribute tags, to obtain segment attribute tags corresponding to the plurality of storyboard segments; and
determining the plurality of storyboard segments comprising the segment attribute tags as the plurality of video segments of the video data,
wherein the performing the storyboard processing comprises:
determining a first video frame serving as a cluster centroid in the video sequence through the video partitioning component, and generating storyboard cluster information of a storyboard cluster to which the first video frame belongs;
determining video frames other than the first video frame in the video sequence as second video frames, sequentially obtaining each second video frame in the second video frames based on a pooling mechanism, and determining an image similarity between each second video frame and the first video frame; and
determining a storyboard cluster to which each video frame in the video sequence belongs based on a result of the image similarity, and forming each video frame in the video sequence into the plurality of storyboard segments based on the storyboard cluster information of the storyboard cluster to which each video frame in the video sequence belongs,
and wherein the determining the storyboard cluster to which each video frame in the video sequence belongs comprises:
based on the image similarity between the first video frame and a second video frame being greater than or equal to a clustering threshold, allocating the second video frame whose image similarity is greater than or equal to the clustering threshold to the storyboard cluster to which the first video frame belongs; and
based on the image similarity between the first video frame and a second video frame being less than the clustering threshold, updating the first video frame using the second video frame whose image similarity is less than the clustering threshold, generating storyboard cluster information of a storyboard cluster to which the updated first video frame belongs, and sequentially performing image similarity matching between the updated first video frame and second video frames that were not previously matched until image similarity matching is performed on each video frame in the video sequence, to obtain storyboard cluster information of a storyboard cluster to which each second video frame in the video sequence belongs.

2. The method according to claim 1, further comprising, prior to the obtaining the video data of the target video:
extracting, in response to receiving a video playing request for the target video from the application client, a video identifier of the target video from the video playing request; and
obtaining the video data of the target video by searching a video service database based on the video identifier.

3. The method according to claim 1, wherein the network recognition model comprises a first network model comprising a first attribute tag extraction function, a second network model comprising a second attribute tag extraction function, and a third network model comprising a third attribute tag extraction function; and the inputting the plurality of storyboard segments into the network recognition model and the performing the attribute analysis comprises:
inputting the plurality of storyboard segments into the first network model, performing long shot and close shot analysis on each storyboard segment in the plurality of storyboard segments through the first network model to obtain long shot and close shot tags of the plurality of storyboard segments, using the long shot and close shot tags of the plurality of storyboard segments as a first attribute tag outputted by the first network model, and using storyboard segments comprising the first attribute tag as storyboard segments of a first type;
inputting the storyboard segments of the first type into the second network model, and performing face detection on each storyboard segment in the storyboard segments of the first type through the second network model to obtain a face detection result;
using, based on the face detection result indicating that a face of a target character exists in the storyboard segments of the first type, storyboard segments corresponding to the face of the target character existing in the storyboard segments of the first type as storyboard segments of a second type, determining a character tag to which the target character in the storyboard segments of the second type belongs through the second network model, and determining the character tag to which the target character belongs as a second attribute tag of the storyboard segments of the second type, wherein the target character is one or more characters in the target video;
determining storyboard segments other than the storyboard segments of the second type in the storyboard segments of the first type as storyboard segments of a third type, inputting the storyboard segments of the third type into the third network model, and performing scene detection on each storyboard segment in the storyboard segments of the first type through the third network model to obtain a third attribute tag of the storyboard segments of the third type; and
determining a segment attribute tag corresponding to each storyboard segment in the plurality of storyboard segments according to the first attribute tag of the storyboard segments of the first type, the second attribute tag of the storyboard segments of the second type, and the third attribute tag of the storyboard segments of the third type.

4. The method according to claim 1, wherein the determining the video template and the obtaining the at least one template segment and the template tag sequence in the video template comprises:
obtaining a behavior log table of the target user, and extracting behavior data information associated with the target user from…
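The storyboard clustering recited in claim 1 can be sketched as a single sequential pass. This is a hedged illustration, not the patented implementation: the function name, the representation of a frame as a single brightness value, and the toy similarity function are all assumptions, and the "pooling mechanism" used to fetch frames is not modeled.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def storyboard_clusters(frames: Sequence[Frame],
                        similarity: Callable[[Frame, Frame], float],
                        threshold: float) -> List[List[int]]:
    """Group frame indices into storyboard clusters in one pass.

    The first frame starts as the cluster centroid. Each later frame is
    compared with the current centroid: at or above the threshold it
    joins the centroid's cluster; below the threshold it becomes the new
    centroid and opens a new cluster.
    """
    if not frames:
        return []
    centroid = frames[0]
    clusters: List[List[int]] = [[0]]
    for idx in range(1, len(frames)):
        if similarity(centroid, frames[idx]) >= threshold:
            clusters[-1].append(idx)       # same storyboard cluster
        else:
            centroid = frames[idx]         # update the "first video frame"
            clusters.append([idx])         # new storyboard cluster
    return clusters


def toy_similarity(a: float, b: float) -> float:
    """Assumed stand-in for image similarity: frames are brightness values."""
    return 1.0 - abs(a - b)


if __name__ == "__main__":
    frames = [0.10, 0.12, 0.11, 0.80, 0.82, 0.30]
    print(storyboard_clusters(frames, toy_similarity, threshold=0.9))
    # prints [[0, 1, 2], [3, 4], [5]]
```

Because frames are visited in order and compared only with the current centroid, each cluster is a contiguous run of frames, which is what allows the clusters to be read directly as storyboard segments. A real system would replace `toy_similarity` with an actual image similarity measure; the claim does not name one.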

Assignees

Tencent Tech Shenzhen Co Ltd

Inventors

Guo Hui

Classifications

G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
G06V10/763: Non-hierarchical techniques, e.g. based on statistics of modelling distributions
G06V10/761: Proximity, similarity or dissimilarity measures
H04N21/485: End-user interface for client configuration
H04N21/4826: using recommendation lists, e.g. of programmes or channels sorted out according to their score

Patent family

Related publications grouped by family.

View patent family 75047852

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12094209B2 cover?: Embodiments of the disclosure provide a data processing method and apparatus, a device, and a medium. The method includes: performing video analysis on video data of a target video to obtain a plurality of video segments; determining a video template associated with a target user from a video template database based on a user portrait of the target user, and obtaining at least one predetermined…
Who is the assignee on this patent?: Tencent Tech Shenzhen Co Ltd
What technology area does this patent fall under?: Primary CPC classification G06V20/49. Mapped technology areas include Physics.
When was this patent published?: Publication date Sep 17, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).