Immersive media content presentation and interactive 360° video communication
US-2024323337-A1 · Sep 26, 2024 · US
US2026089366A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2026089366-A1 |
| Application number | US-202418901944-A |
| Country | US |
| Kind code | A1 |
| Filing date | Sep 30, 2024 |
| Priority date | Sep 26, 2024 |
| Publication date | Mar 26, 2026 |
| Grant date | — |
Official abstract text for this publication.
A method for generating a new video from a source video includes determining that the source video is associated with one or more components, and identifying a source starting segment within the source video at least in part by selecting a segment identification model, from among a plurality of candidate segment identification models, based at least in part on the segment identification model being configured to operate upon at least one of the one or more components. The method also includes identifying the source starting segment by using the selected segment identification model to process at least a portion of the source video. The method also includes generating the new video using one or more portions of the source video, wherein generating the new video includes generating an initial segment of the new video based on the source starting segment.
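The flow in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the names `Segment`, `SegmentModel`, `select_model`, and `generate_new_video`, the dictionary layout of the video, and the first-match priority order are all assumptions made for illustration. A candidate model is treated as eligible when it operates on at least one component present in the source video, and the initial segment of the new video runs from the identified starting segment to the end of the source.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Segment:
    start_s: float  # start time in seconds
    end_s: float    # end time in seconds


@dataclass(frozen=True)
class SegmentModel:
    # Hypothetical wrapper around a segment identification model.
    name: str
    operates_on: frozenset          # components the model is configured for
    identify: Callable[[dict], Segment]


def select_model(components: set, candidates: list) -> SegmentModel:
    """Return the first candidate that operates on at least one component
    of the source video. The list order as a priority is an assumption."""
    for model in candidates:
        if model.operates_on & components:
            return model
    raise ValueError("no candidate model supports the video's components")


def generate_new_video(video: dict, candidates: list) -> list:
    """Return the new video as a list of source segments (a stand-in for
    real media editing). The initial segment starts at the identified
    source starting segment and continues to the end of the source."""
    model = select_model(set(video["components"]), candidates)
    starting = model.identify(video)
    return [Segment(starting.start_s, video["duration_s"])]
```

For example, a source video with no speech component would skip a transcript-based model and fall through to a model that operates on audio and video frames.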
Opening claim text (preview).
What is claimed is:
1. A method for generating a new video from a source video, the method comprising: determining, by one or more processors, that the source video is associated with one or more components; identifying, by the one or more processors, a source starting segment within the source video, at least in part by: selecting a segment identification model, from among a plurality of candidate segment identification models, based at least in part on the segment identification model being configured to operate upon at least one of the one or more components; and identifying the source starting segment by using the selected segment identification model to process at least a portion of the source video; and generating, by the one or more processors, the new video using one or more portions of the source video, wherein generating the new video includes generating an initial segment of the new video based on the source starting segment.
2. The method of claim 1, wherein: determining that the source video is associated with the one or more components includes determining that the source video is associated with a speech component; selecting the segment identification model includes selecting a first machine learning model, the first machine learning model including a large language model; and identifying the source starting segment includes applying a prompt, and a transcript of at least a portion of the speech component, to the first machine learning model.
3. The method of claim 2, wherein identifying the source starting segment includes outputting, by the first machine learning model, an indication of text corresponding to the source starting segment.
4. The method of claim 1, wherein: selecting the segment identification model includes selecting a first machine learning model; and identifying the source starting segment includes applying at least a portion of audio, and video frames, of the source video to the first machine learning model.
5. The method of claim 4, wherein identifying the source starting segment includes outputting, by the first machine learning model, an indication of a source starting audio segment or a source starting video segment.
6. The method of claim 1, wherein: selecting the segment identification model includes selecting a first machine learning model; and identifying the source starting segment includes applying a predetermined portion of the source video to the first machine learning model, the predetermined portion being entirely within a time window that is between a last 20 seconds of the source video and a last 5 seconds of the source video.
7. The method of claim 1, wherein generating the new video includes causing the initial segment of the new video to begin at the source starting segment and continue until an end of the source video with a same sequence as the source video.
8. The method of claim 1, wherein generating the new video includes: shifting a start of the source starting segment to a point corresponding to a boundary between adjacent words; and causing the initial segment of the new video to begin at the shifted start of the source video.
9. The method of claim 1, wherein generating the new video includes: shifting a start of the source starting segment to a point corresponding to a boundary between adjacent scenes; and causing the initial segment of the new video to begin at the shifted start of the source video.
10. The method of claim 1, wherein the plurality of candidate segment identification models includes a plurality of machine learning models, and wherein identifying the source starting segment within the source video includes: for each machine learning model of the plurality of machine learning models, identifying a respective candidate starting segment by applying at least a portion of (i) the source video, or (ii) a transcript of a speech component of the source video, to the machine learning model, and predicting, using an additional machine learning model, a respective performance metric associated with the respective candidate starting segment; and identifying the source starting segment based on the respective performance metrics for the plurality of machine learning models.
11. A system comprising: one or more processors; and one or more memories storing instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising: determining that a source video is associated with one or more components; identifying a source starting segment within the source video, at least in part by (i) selecting a segment identification model, from among a plurality of candidate segment identification models, based at least in part on the segment identification model being configured to operate upon at least one of the one or more components, and (ii) identifying the source starting segment by using the selected segment identification model to process at least a portion of the source video; and generating a new video using one or more portions of the source video, wherein generating the new video includes generating an initial segment of the new video based on the source starting segment.
12. The system of claim 11, wherein identifying the source starting segment includes: determining that the source video is associated with the one or more components includes determining that the source video is associated with a speech component; selecting the segment identification model includes selecting a first machine learning model, the first machine learning model including a large language model; and identifying the source starting segment includes applying a prompt, and a transcript of at least a portion of the speech component, to the first machine learning model.
13. The system of claim 12, wherein identifying the source starting segment includes outputting, by the first machine learning model, an indication of text corresponding to the source starting segment.
14. The system of claim 11, wherein: selecting the segment identification model includes selecting a first machine learning model; and identifying the source starting segment includes applying at least a portion of audio and video frames of the source video to the first machine learning model.
15. The system of claim 14, wherein identifying the source starting segment includes outputting, by the first machine learning model, an indication of a source starting audio segment or a source starting video segment.
16. The system of claim 11, wherein: selecting the segment identification model includes selecting a first machine learning model; and identifying the source starting segment includes applying a predetermined portion of the source video to the first machine learning model, the predetermined portion being entirely within a time window that is between a last 20 seconds of the source video and a last 5 seconds of the source video.
17. The system of claim 11, wherein generating the new video includes causing the initial segment of the new video to begin at the source starting segment and continue until an end of the source video with a same sequence as the source video.
18. The system of claim 11, wherein generating the new video includes: shifting a start of the source starting segment to a point corresponding to a boundary between adjacent words; and causing the initial segment of the new video to begin at the shifted start of the source video.
19.
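The time window of claims 6 and 16 and the word-boundary shift of claims 8 and 18 can be sketched as simple timestamp arithmetic. This is one possible reading under stated assumptions, not the claimed implementation: the function names `tail_window` and `snap_to_word_boundary` are hypothetical, word boundaries are assumed to be available as a sorted list of word start times (for example from a transcript with timestamps), and shifting backward to the nearest earlier boundary is an assumption, since the claims do not state a direction.

```python
from bisect import bisect_right


def tail_window(duration_s, from_end_start_s=20.0, from_end_stop_s=5.0):
    """Return (start, end) in seconds of the window spanning from the last
    20 seconds to the last 5 seconds of a video, clamped at zero for
    videos shorter than the window."""
    start = max(0.0, duration_s - from_end_start_s)
    end = max(start, duration_s - from_end_stop_s)
    return start, end


def snap_to_word_boundary(start_s, word_starts):
    """Shift start_s back to the latest word start at or before it.
    word_starts must be sorted ascending. If no word starts at or before
    start_s, the time is returned unchanged."""
    i = bisect_right(word_starts, start_s)
    return word_starts[i - 1] if i > 0 else start_s
```

With a 60-second source, `tail_window(60.0)` gives the span from 40 s to 55 s, and a model-proposed start of 41.3 s would snap to a word beginning at 40.9 s so that the new video does not open mid-word. The same snapping function could be reused for the scene boundaries of claim 9 by passing scene start times instead.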
involving splicing one content stream with another content stream, e.g. for substituting a video clip · CPC title
by decomposing the content in the time domain, e.g. in time segments · CPC title
involving special video data, e.g. 3D video · CPC title