Systems and methods for generating dynamic virtual representations of an object or event
US-2024420395-A1 · Dec 19, 2024 · US
US2026099968A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2026099968-A1 |
| Application number | US-202418909025-A |
| Country | US |
| Kind code | A1 |
| Filing date | Oct 8, 2024 |
| Priority date | Oct 8, 2024 |
| Publication date | Apr 9, 2026 |
| Grant date | — |
Official abstract text for this publication.
A computer-implemented method is disclosed. The method includes: obtaining a composite image depicting an identifiable foreground object; performing image segmentation to isolate a foreground object from the composite image, the image segmentation yielding a segmented composite image; determining object attributes of the foreground object based on image analysis of the segmented composite image; obtaining at least one prompt for a text-to-image model based on the object attributes of the foreground object, the at least one prompt defining an intended replacement background for the composite image; providing, to the text-to-image model, instructions to generate a replacement background image for the composite image based on the at least one prompt; receiving, from the text-to-image model, a replacement background image for the composite image; and generating a new composite image, the generating including compositing portions of the composite image corresponding to the foreground object with the replacement background image.
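The final step of the abstract, compositing the foreground portions of the original image with the replacement background, can be illustrated with a mask-based blend. The following is a minimal sketch and not the method disclosed in the publication: it assumes the segmentation step has already produced a per-pixel mask (1.0 for foreground, 0.0 for background) and that the text-to-image model has already returned a background of the same size. The names `composite`, `Pixel`, `Image`, and `Mask` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical in-memory image types, used only for this illustration.
Pixel = Tuple[int, int, int]      # (R, G, B), each 0..255
Image = List[List[Pixel]]         # rows of pixels
Mask = List[List[float]]          # 1.0 = foreground, 0.0 = background


def composite(original: Image, mask: Mask, background: Image) -> Image:
    """Blend the masked foreground of `original` over `background`.

    Assumption: all three inputs have identical dimensions. Fractional
    mask values produce a soft edge between foreground and background.
    """
    if not (len(original) == len(mask) == len(background)):
        raise ValueError("image, mask, and background must have the same height")

    result: Image = []
    for src_row, mask_row, bg_row in zip(original, mask, background):
        if not (len(src_row) == len(mask_row) == len(bg_row)):
            raise ValueError("image, mask, and background must have the same width")
        out_row: List[Pixel] = []
        for src, alpha, bg in zip(src_row, mask_row, bg_row):
            # Per-channel linear blend: alpha * foreground + (1 - alpha) * background.
            blended = tuple(
                round(alpha * s + (1.0 - alpha) * b) for s, b in zip(src, bg)
            )
            out_row.append(blended)
        result.append(out_row)
    return result


if __name__ == "__main__":
    original = [[(255, 0, 0), (10, 10, 10)]]   # left pixel belongs to the object
    mask = [[1.0, 0.0]]
    background = [[(0, 0, 255), (0, 255, 0)]]
    print(composite(original, mask, background))
```

In this sketch the left pixel is kept from the original image and the right pixel is taken from the replacement background. A production system would more likely operate on array or image-library types, but the blend rule would be the same.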
Opening claim text (preview).
1. A computer-implemented method, comprising: obtaining a composite image depicting an identifiable foreground object; performing image segmentation to isolate a foreground object from the composite image, the image segmentation yielding a segmented composite image; determining object attributes of the foreground object based on image analysis of the segmented composite image; obtaining at least one prompt for a text-to-image model based on the object attributes of the foreground object, the at least one prompt defining an intended replacement background for the composite image; providing, to the text-to-image model, instructions to generate a replacement background image for the composite image based on the at least one prompt; receiving, from the text-to-image model, a replacement background image for the composite image; and generating a new composite image, the generating including compositing portions of the composite image corresponding to the foreground object with the replacement background image.

2. The method of claim 1, wherein obtaining the at least one prompt for the text-to-image model comprises: determining current text input in an input field of a graphical user interface; providing, to a large language model (LLM), instructions to generate predictive text data based on the object attributes of the foreground object and the current text input in the input field; and presenting the predictive text data via the graphical user interface.

3. The method of claim 2, wherein the predictive text data comprises a next word suggestion.

4. The method of claim 2, further comprising: receiving, via the graphical user interface, user selection of first text from the predictive text data; and combining the current text input in the input field with the selected first text to obtain the at least one prompts.

5. The method of claim 2, wherein the predictive text data comprises a plurality of word suggestions and wherein each of the plurality of word suggestions is presented as a selectable option via the graphical user interface.

6. The method of claim 1, wherein determining the object attributes of the foreground object comprises providing at least a portion of the segmented composite image to a language model that is trained to output attribute labels of input images.

7. The method of claim 6, wherein the language model is fine-tuned using a dataset of images depicting first objects and attribute labels associated with the first objects.

8. The method of claim 2, further comprising: computing input embeddings using the object attributes of the foreground object and the current text input; and determining a ranking of outputs of the LLM based on a similarity measure between embeddings of predictive text candidates and the input embeddings.

9. The method of claim 1, further comprising extracting the foreground object from the segmented composite image, wherein the new composite image is generated based on combining the extracted foreground object with the replacement background image.

10. The method of claim 1, wherein obtaining the at least one prompt for the text-to-image model comprises providing, to an LLM, instructions to generate one or more candidate prompts based on the object attributes of the foreground object, and wherein providing instructions to the text-to-image model to generate the replacement background image comprises: receiving, from the LLM, the generated candidate prompts; and for each generated candidate prompt, providing, to the text-to-image model, instructions to generate a candidate background image corresponding to the candidate prompt.

11. The method of claim 10, further comprising presenting, via a graphical user interface, the candidate background images corresponding to the one or more candidate prompts as selectable options, wherein the replacement background image comprises a selection of one of the candidate background images.

12. The method of claim 10, further comprising: receiving, via the graphical user interface, a selection of one of the candidate background images and user input of modifications to the candidate prompt associated with the selected candidate background image; providing, to the text-to-image model, instructions to generate a modified candidate background image based on the modified candidate prompt.

13. The method of claim 1, wherein performing the image segmentation comprises identifying a class of the foreground object using a classifier.

14. A computing system, comprising: a processor; and a memory coupled to the processor, the memory storing computer-executable instructions that, when executed by the processor, configure the processor to: obtain a composite image depicting an identifiable foreground object; perform image segmentation to isolate a foreground object from the composite image, the image segmentation yielding a segmented composite image; determine object attributes of the foreground object based on image analysis of the segmented composite image; obtain at least one prompt for a text-to-image model based on the object attributes of the foreground object, the at least one prompt defining an intended replacement background for the composite image; provide, to the text-to-image model, instructions to generate a replacement background image for the composite image based on the at least one prompt; receive, from the text-to-image model, a replacement background image for the composite image; and generate a new composite image, the generating including compositing portions of the composite image corresponding to the foreground object with the replacement background image.

15. The computing system of claim 14, wherein the instructions, when executed, further configure the processor to: determine current text input in an input field of a graphical user interface; provide, to a large language model (LLM), instructions to generate predictive text data based on the object attributes of the foreground object and the current text input in the input field; and present the predictive text data via the graphical user interface.

16. The computing system of claim 15, wherein the instructions, when executed, further configure the processor to: receive, via the graphical user interface, user selection of first text from the predictive text data; and combine the current text input in the input field with the selected first text to obtain the at least one prompts.

17. The computing system of claim 15, wherein the instructions, when executed, further configure the processor to: compute input embeddings using the object attributes of the foreground object and the current text input; and determine a ranking of outputs of the LLM based on a similarity measure between embeddings of predictive text candidates and the input embeddings.

18. The computing system of claim 14, wherein obtaining the at least one prompt for the text-to-image model comprises providing, to an LLM, instructions to generate one or more candidate prompts based on the object attributes of the foreground object, and wherein providing instructions to the text-to-image model to generate the replacement background image comprises: receiving, from the LLM, the generated candidate prompts; and for each generated candidate prompt, providing, to the text-to-image model, instructions to generate a candidate background image corresponding to the candidate prompt.

19. The computing system of claim 18, wherein the instructio
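Claims 8 and 17 recite ranking LLM outputs by a similarity measure between embeddings of predictive text candidates and the input embeddings. The sketch below shows one way such a ranking could look. It is an illustration under stated assumptions, not the claimed implementation: the claims do not name a specific measure, so cosine similarity is an assumption here, the embeddings are taken as already computed vectors, and the names `cosine_similarity` and `rank_candidates` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(
    input_embedding: Sequence[float],
    candidate_embeddings: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (candidate text, score) pairs, most similar to the input first.

    Assumption: `input_embedding` was computed from the object attributes
    and the current text input, and each candidate already has an embedding
    from the same (unspecified) embedding model.
    """
    scored = [
        (text, cosine_similarity(input_embedding, emb))
        for text, emb in candidate_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real embeddings.
    query = [1.0, 0.0]
    candidates = {
        "forest": [0.0, 1.0],
        "beach": [1.0, 0.0],
        "dune": [1.0, 1.0],
    }
    for text, score in rank_candidates(query, candidates):
        print(f"{text}: {score:.3f}")
```

With the toy vectors above, the candidate pointing in the same direction as the query ranks first and the orthogonal one ranks last. The top-ranked entries could then be presented as the selectable word suggestions described in claim 5.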
using classification, e.g. of video objects · CPC title
involving graphical user interfaces [GUIs] · CPC title
Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title
Image combination · CPC title
Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion · CPC title