Method, apparatus, device and storage medium for video generation

US12524940B2 · US · B2

Patent metadata
Publication number: US-12524940-B2
Application number: US-202418622479-A
Country: US
Kind code: B2
Filing date: Mar 29, 2024
Priority date: Apr 23, 2023
Publication date: Jan 13, 2026
Grant date: Jan 13, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosure provides a method, an apparatus, a device and a storage medium for video generation. The method comprises: obtaining first text information used to describe a video effect requirement; obtaining at least one multimedia material; and generating a target video based on the first text information and the at least one multimedia material. The at least one multimedia material is presented in the target video. A video effect of the target video meets the video effect requirement described in the first text information. The target video is used to present a combination of at least one video segment. The at least one video segment is formed respectively based on respective video-image materials in the at least one multimedia material. The respective video-image materials comprise a video material and/or an image material.
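The data flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Material`, `Segment`, and `generate_target_video`, the `kind` labels, and the fixed three-second image duration are all hypothetical. The sketch shows only the structural point that each video or image material yields one segment, while every supplied material remains associated with the target video.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Material:
    name: str
    kind: str              # "video", "image", or "audio" (hypothetical labels)
    duration: float = 0.0  # seconds; only meaningful for video and audio


@dataclass
class Segment:
    source: str      # name of the material the segment was formed from
    duration: float  # seconds


IMAGE_SECONDS = 3.0  # assumed on-screen time for a still image


def generate_target_video(text: str, materials: List[Material]) -> dict:
    """Form one segment per video-image material and bundle the result."""
    segments = []
    for m in materials:
        if m.kind == "video":
            segments.append(Segment(m.name, m.duration))
        elif m.kind == "image":
            segments.append(Segment(m.name, IMAGE_SECONDS))
        # Other kinds (for example audio) form no segment of their own.
    if not segments:
        raise ValueError("at least one video or image material is required")
    return {
        "requirement": text,                      # the first text information
        "segments": segments,                     # combination of segments
        "materials": [m.name for m in materials]  # all materials presented
    }
```

A real system would also have to interpret the text requirement; here it is simply carried along so that the three inputs and outputs named in the abstract stay visible.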

First claim

Opening claim text (preview).

We claim:

1. A method of video generation, comprising: displaying an interface configured to display a plurality of multimedia materials and to enable a selection of at least one multimedia material from the plurality of multimedia materials; in response to selecting at least two multimedia materials, displaying an input box on the same interface via which the at least two multimedia materials are selected, wherein the input box is configured to receive text information indicating a requirement to be met by a target video to be generated; extracting a feature label from the requirement indicated by first text information received via the input box; comparing the feature label of the first text information with templates in a template library and determining a first video editing template that matches the feature label of the first text information; extracting feature labels from the at least two multimedia materials; comparing the feature labels of the at least two multimedia materials with the templates in the template library and determining a second video editing template that matches the feature labels of the at least two multimedia materials; displaying the first video editing template and the second video editing template on a preview page; generating a new video editing template based on the first video editing template and the second video editing template; and generating the target video based on the new video editing template, the first text information received via the input box, and the at least two multimedia materials, wherein the target video is a new video, wherein the target video meets the requirement indicated by the first text information, the target video presents a combination of at least one video segment, the at least one video segment is formed respectively based on respective video-image materials in the at least two multimedia materials, and the respective video-image materials comprise at least one of video materials or image materials.

2. The method of claim 1, wherein the generating the target video comprises: generating a video editing draft based on the first text information and the at least two multimedia materials, wherein the video editing draft comprises the at least two multimedia materials and editing information, the editing information is used to indicate an editing operation for the at least two multimedia materials, and the editing operation is at least used to edit the respective video-image materials in the at least two multimedia materials respectively into the at least one video segment, a video editing effect and/or the at least one multimedia material corresponding to the editing operation meet the video effect requirement described in the first text information; and generating the target video based on the video editing draft.

3. The method of claim 2, wherein the generating a video editing draft based on the first text information and the at least two multimedia materials comprises: determining at least one video editing template based on the first text information and the at least two multimedia materials, wherein the editing effect of the at least one video editing template meets the video effect requirement described in the first text information; and applying an editing operation indicated by a target video editing template from the at least one video editing template to the at least two multimedia materials, to generate the video editing draft.

4. The method of claim 3, wherein after the determining at least one video editing template based on the first text information and the at least two multimedia materials, and the method further comprises: selecting a third video editing template from the at least one video editing template and presenting the third video editing template on a preview page for a video editing effect, so that the preview page is used to preview the video effect obtained by importing the at least one multimedia material into the third video editing template, and the preview page being configured with an update recommendation control; and in response to a trigger operation for the update recommendation control, selecting a fourth video editing template in the at least one video editing template, and replacing the third video editing template presented on the preview page with the fourth video editing template, so that the preview page is used to preview the video effect obtained by importing the at least one multimedia material into the fourth video editing template.

5. The method of claim 3, wherein after the determining at least one video editing template based on the first text information and the two multimedia materials, the method further comprises: displaying, on a preview page, a fifth video editing template in the at least one video editing template; obtaining adjusted text information, in response to a text adjustment operation on the preview page for the first text information; determining a second video editing template set based on the adjusted text information and the at least one multimedia material; and replacing the fifth video editing template displayed on the preview page with a sixth video editing template in the second video editing template set.

6. The method of claim 5, wherein before the determining a second video editing template set based on the adjusted text information and the at least one multimedia material, the method further comprises: receiving a material adjustment operation for the at least one multimedia material to obtain an adjusted multimedia material; and wherein the determining a second video editing template set based on the adjusted text information and the at least one multimedia material comprises determining the second video editing template set based on the adjusted text information and the adjusted multimedia material.

7. The method of claim 1, wherein obtaining the at least two multimedia materials comprises: determining a matching first multimedia material from a user material set based on an analysis result for the first text information; or generating a second multimedia material based on the analysis result for the first text information.

8. The method of claim 1, wherein the method further comprises: displaying the input box in response to an importing operation for the at least two multimedia materials.

9. The method of claim 1, wherein the method further comprises: displaying at least one video label, wherein the video label is configured to characterize the requirement of the target video to be generated; and obtaining the first text information based on an operation of adding a target video label in the at least one video label to the input box.

10. A non-transitory computer-readable storage medium, wherein the computer-readable storage medium stores an instruction that, when executed on a terminal device, causes the terminal device to implement operations comprising: displaying an interface configured to display a plurality of multimedia materials and to enable a selection of at least one multimedia material from the plurality of multimedia materials; in response to selecting at least two multimedia materials, displaying an input box on the same interface via which the at least two multimedia materials are selected, wherein the input box is configured to receive text information indicating a requirement to be met by a target video to be generated; extracting a feature label from the requirement indicated by first text information received via the input box; comparing the feature label of the first text information with templates in a template library and determining a first video editing template that matches the f
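The template-selection steps of claim 1 (extract feature labels, compare them with a template library, then derive a new template from the two matches) can be illustrated with a toy sketch. The claims do not say how labels are extracted, how a match is scored, or how two templates are combined, so everything below is an assumption made for illustration: the `Template` class, the `VOCABULARY` set, keyword-based label extraction, overlap-count scoring, and merging by concatenating de-duplicated operations are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Template:
    name: str
    labels: FrozenSet[str]       # feature labels the template is tagged with
    operations: Tuple[str, ...]  # ordered editing operations (hypothetical)


# Assumed label vocabulary; a real system would likely use a learned model.
VOCABULARY = {"travel", "upbeat", "beach", "food", "calm"}


def extract_text_labels(text: str) -> Set[str]:
    """Toy extraction: keep the words that appear in the vocabulary."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return words & VOCABULARY


def best_match(labels: Set[str], library: List[Template]) -> Template:
    """Return the template sharing the most labels (first one wins ties)."""
    return max(library, key=lambda t: len(t.labels & labels))


def merge_templates(first: Template, second: Template) -> Template:
    """Build a new template: union of labels, operations de-duplicated in order."""
    ops = tuple(dict.fromkeys(first.operations + second.operations))
    return Template(f"{first.name}+{second.name}",
                    first.labels | second.labels, ops)
```

In this sketch, labels for the multimedia materials would be supplied as a plain set (for example from tags a user or classifier attached to each file) and passed to `best_match` in the same way as the text labels; the merged result then stands in for the "new video editing template" of the claim.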

Assignees

Beijing Zitiao Network Technology Co Ltd

Inventors

Classifications

  • G11B27/031 (Primary)

    Electronic editing of digitised analogue information signals, e.g. audio or video signals · CPC title

  • Indicating arrangements {(indicating means incorporated in magazine or cassette G11B23/046 and G11B23/0875; indicating measured values in general G01D)} · CPC title

  • Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs · CPC title

  • Processing of audio elementary streams · CPC title

  • Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects {; Cameras specially adapted for the electronic generation of special effects} · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12524940B2 cover?
The disclosure provides a method, an apparatus, a device and a storage medium for video generation. The method comprises: obtaining first text information used to describe a video effect requirement; obtaining at least one multimedia material; and generating a target video based on the first text information and the at least one multimedia material. The at least one multimedia material is prese…
Who is the assignee on this patent?
Beijing Zitiao Network Technology Co Ltd
What technology area does this patent fall under?
Primary CPC classification G11B27/031. Mapped technology areas include Physics.
When was this patent published?
Published Jan 13, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).