US2026051096A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2026051096-A1 |
| Application number | US-202519299718-A |
| Country | US |
| Kind code | A1 |
| Filing date | Aug 14, 2025 |
| Priority date | Aug 15, 2024 |
| Publication date | Feb 19, 2026 |
| Grant date | — |
Official abstract text for this publication.
Embodiments of the present disclosure provide a video editing method, apparatus, device, medium and program product. The method includes: inputting an original video into a script generation model that is pre-trained; generating, by the script generation model, a video feature sequence according to video frames in the original video, mapping the video feature sequence to a text feature space of the script generation model, and obtaining a video mapping feature sequence; generating, by the script generation model, a second video script of the original video based on the video mapping feature sequence, wherein the second video script includes a timestamp; adding the second video script to the original video according to the timestamp to obtain a target video.
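The feature-mapping step in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the visual encoder averages chunks of pixel intensities and that the adapter is a single linear projection; the names `encode_frame`, `adapt`, `weights`, and `bias` are hypothetical and do not come from the publication, which does not specify either component at this level.

```python
from typing import List

Vector = List[float]


def encode_frame(pixels: Vector, dim: int) -> Vector:
    """Hypothetical visual encoder: compress one frame to `dim` values
    by averaging equal-sized chunks of its pixel intensities."""
    chunk = max(1, len(pixels) // dim)
    return [sum(pixels[i * chunk:(i + 1) * chunk]) / chunk for i in range(dim)]


def adapt(features: List[Vector], weights: List[Vector], bias: Vector) -> List[Vector]:
    """Hypothetical adapter: a linear map from video feature space to
    text feature space, applied to every vector in the sequence."""
    return [
        [sum(w * x for w, x in zip(row, feat)) + b for row, b in zip(weights, bias)]
        for feat in features
    ]


# Assumed toy input: one "frame" of eight pixel intensities.
frames = [[0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]]
video_features = [encode_frame(f, dim=4) for f in frames]

# Assumed 2x4 projection into a two-dimensional "text" feature space.
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
bias = [0.5, -1.0]
mapped = adapt(video_features, weights, bias)
print(mapped)
```

In a real system the encoder and adapter would be learned networks; the sketch only shows the data flow from frames to a video feature sequence to a video mapping feature sequence.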
Opening claim text (preview).
1. A video editing method, comprising: inputting an original video into a script generation model that is pre-trained, wherein the script generation model is trained based on video samples, and first video scripts of the video samples satisfy a preset selection condition; generating, by the script generation model, a video feature sequence according to video frames in the original video, mapping the video feature sequence to a text feature space of the script generation model, and obtaining a video mapping feature sequence; generating, by the script generation model, a second video script of the original video based on the video mapping feature sequence, wherein the second video script comprises a timestamp; adding the second video script to the original video according to the timestamp to obtain a target video.
2. The method according to claim 1, wherein the generating, by the script generation model, a second video script of the original video based on the video mapping feature sequence comprises: inputting prompt information corresponding to the original video into the script generation model; generating, by the script generation model, the second video script of the original video based on the prompt information and the video mapping feature sequence.
3. The method according to claim 2, wherein the script generation model comprises a script generation module, and parameters of the script generation module are updated during training process of the script generation model; the generating, by the script generation model, the second video script of the original video based on the prompt information and the video mapping feature sequence comprises: generating, by the script generation module, the second video script of the original video based on the video mapping feature sequence under constraint of the prompt information, wherein the prompt information comprises script attribute prompt information and/or script content prompt information.
4. The method according to claim 2, wherein a training method of the script generation model comprises: obtaining the video samples, wherein the first video scripts of the video samples satisfy the preset selection condition; determining a script sample according to audio information of video frames in the video samples, and generating a prompt information sample based on attribute information and content information of the script sample; training a script generation model to be trained based on the video frames in the video samples, the prompt information sample, and the script sample, so that the script generation model to be trained learns a mapping relationship among the video frames in the video samples, the prompt information sample, and the script sample.
5. The method according to claim 4, wherein the training a script generation model to be trained based on the video frames in the video samples, the prompt information sample, and the script sample comprises: inputting the video frames in the video samples and corresponding prompt information samples into the script generation model to be trained, and obtaining a predicted script output by the script generation model to be trained; calculating a loss value between the predicted script and the script sample; in response to the loss value not satisfying a model training end condition, adopting a backpropagation method to adjust model parameters of the script generation model to be trained.
6. The method according to claim 1, wherein the script generation model comprises a visual encoder and an adapter, and parameters of the adapter are updated during training of the script generation model; the generating, by the script generation model, a video feature sequence according to video frames in the original video, mapping the video feature sequence to a text feature space of the script generation model, and obtaining a video mapping feature sequence comprises: compressing the video frames in the original video by the visual encoder to obtain a feature vector, and generating the video feature sequence according to the feature vector corresponding to the video frames; mapping the video feature sequence to the text feature space by the adapter to obtain the video mapping feature sequence.
7. The method according to claim 1, wherein the adding the second video script to the original video according to the timestamp to obtain a target video comprises: determining a video frame corresponding to the second video script according to a timestamp corresponding to the second video script, adding the second video script to the video frame corresponding to the second video script, and obtaining the target video.
8. An electronic device, comprising: one or more processors; a storage apparatus, configured to store one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement a video editing method, and the method comprises: inputting an original video into a script generation model that is pre-trained, wherein the script generation model is trained based on video samples, and first video scripts of the video samples satisfy a preset selection condition; generating, by the script generation model, a video feature sequence according to video frames in the original video, mapping the video feature sequence to a text feature space of the script generation model, and obtaining a video mapping feature sequence; generating, by the script generation model, a second video script of the original video based on the video mapping feature sequence, wherein the second video script comprises a timestamp; adding the second video script to the original video according to the timestamp to obtain a target video.
9. The electronic device according to claim 8, wherein the generating, by the script generation model, a second video script of the original video based on the video mapping feature sequence comprises: inputting prompt information corresponding to the original video into the script generation model; generating, by the script generation model, the second video script of the original video based on the prompt information and the video mapping feature sequence.
10. The electronic device according to claim 9, wherein the script generation model comprises a script generation module, and parameters of the script generation module are updated during training process of the script generation model; the generating, by the script generation model, the second video script of the original video based on the prompt information and the video mapping feature sequence comprises: generating, by the script generation module, the second video script of the original video based on the video mapping feature sequence under constraint of the prompt information, wherein the prompt information comprises script attribute prompt information and/or script content prompt information.
11. The electronic device according to claim 9, wherein a training method of the script generation model comprises: obtaining the video samples, wherein the first video scripts of the video samples satisfy the preset selection condition; determining a script sample according to audio information of video frames in the video samples, and generating a prompt information sample based on attribute information and content information of the script sample; training a script generation model to be trained based on the video frames in the video samples, the prompt information sample, and the script sample, so that the script generation model to be trained learns a mapping relationship among the video frames in the video samples, the prompt
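The timestamp-based step recited in claim 7 (finding the video frames that correspond to each script segment and attaching the script text to them) could look roughly like the sketch below. The `ScriptSegment` type, the start/end representation of a timestamp, and the fixed frame rate are assumptions made for illustration; the claims do not fix a timestamp format or an overlay mechanism.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScriptSegment:
    """Hypothetical script segment: text plus a start/end timestamp in seconds."""
    start_s: float
    end_s: float
    text: str


def frames_for_segment(seg: ScriptSegment, fps: float, n_frames: int) -> range:
    """Return the frame indices covered by the segment, clamped to the video length."""
    first = max(0, int(seg.start_s * fps))
    last = min(n_frames, math.ceil(seg.end_s * fps))
    return range(first, last)


def attach_script(segments: List[ScriptSegment], fps: float, n_frames: int) -> Dict[int, str]:
    """Map each covered frame index to the script text to be shown on that frame."""
    overlay: Dict[int, str] = {}
    for seg in segments:
        for idx in frames_for_segment(seg, fps, n_frames):
            overlay[idx] = seg.text
    return overlay


# Assumed toy video: 10 frames at 2 frames per second.
segments = [ScriptSegment(0.0, 1.0, "Hello"), ScriptSegment(2.5, 4.0, "World")]
overlay = attach_script(segments, fps=2.0, n_frames=10)
print(overlay)
```

Frames without an entry in `overlay` carry no script text. Rendering the text onto pixel data is left out, since it depends on the video library in use.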
Creating or editing images; Combining images with text · CPC title
Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods · CPC title
Learning methods · CPC title
Generation or processing of descriptive data, e.g. content descriptors {(systems specially adapted for using meta-information in broadcast systems H04H60/73)} · CPC title
Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream · CPC title