US12586262B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-12586262-B1 |
| Application number | US-202418628324-A |
| Country | US |
| Kind code | B1 |
| Filing date | Apr 5, 2024 |
| Priority date | Apr 5, 2024 |
| Publication date | Mar 24, 2026 |
| Grant date | Mar 24, 2026 |
Official abstract text for this publication.
In example embodiments, a visualization application uses geographically relevant style images as guidance to automatically generate realistic 2D renders of a 3D infrastructure model. The application generates a synthetic 2D render of the 3D infrastructure model and retrieves a set of style images that correspond to a geographic position associated with the model. The synthetic 2D render, the set of style images and, optionally, one or more user-provided text guidance phrases and/or mask images are applied to a realistic 2D render generator of the application. The realistic 2D render generator performs image translation (guided by the optional text guidance phrases and/or mask images) to adjust the visual appearance of the infrastructure in the synthetic 2D render based on the visual appearance of the set of style images and to generate realistic context based on what appears in the set of style images, thereby producing a realistic 2D render.
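The retrieval step in the abstract (style images that correspond to a geographic position) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the patent does not describe how the database is queried, so the in-memory `StyleImage` records, the `retrieve_style_images` function, the 50 km radius, and the use of great-circle (haversine) distance are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class StyleImage:
    """Hypothetical record: an image reference tagged with where it was taken."""
    uri: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def retrieve_style_images(db, lat, lon, radius_km=50.0, limit=4):
    """Return up to `limit` images within `radius_km` of the model, nearest first.

    Assumption: `db` is a plain iterable of StyleImage records standing in for
    the online image database mentioned in the claims.
    """
    scored = [(haversine_km(lat, lon, img.lat, img.lon), img) for img in db]
    nearby = [(d, img) for d, img in scored if d <= radius_km]
    nearby.sort(key=lambda pair: pair[0])  # sort by distance only
    return [img for _, img in nearby[:limit]]


if __name__ == "__main__":
    db = [
        StyleImage("versailles", 48.8049, 2.1204),
        StyleImage("new_york", 40.7128, -74.0060),
        StyleImage("paris", 48.8566, 2.3522),
    ]
    for img in retrieve_style_images(db, 48.86, 2.35):
        print(img.uri)
```

A real system would more likely query a spatial index or a remote service than scan a list; the linear scan is only meant to make the "nearest style images" idea concrete.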
Opening claim text (preview).
What is claimed is:
1. A method for generating a realistic two-dimensional (2D) render of a three-dimensional (3D) infrastructure model, comprising: receiving, by a visualization application executing on one or more computing devices, a request to generate the realistic 2D render of the 3D infrastructure model; determining, by the visualization application, a viewpoint in 3D space of the 3D infrastructure model; generating, by the visualization application, a synthetic 2D render of the 3D infrastructure model from the viewpoint; determining, by the visualization application, a geographic position for the 3D infrastructure model; retrieving, by the visualization application, a set of style images corresponding to the geographic position that show terrain, vegetation, and/or structures; performing, by the visualization application, image translation to adjust visual appearance of infrastructure in the synthetic 2D render based on visual appearance of the set of style images and to generate realistic context based on what appears in the set of style images to produce the realistic 2D render; and outputting, by the visualization application, the realistic 2D render.
2. The method of claim 1, wherein the retrieving retrieves the set of style images from an online image database that includes images over a plurality of geographic regions.
3. The method of claim 1, further comprising: receiving selection of a set of user-provided style images, wherein the performing image translation also adjusts visual appearance of infrastructure in the synthetic 2D render and generates realistic context based on the set of user-provided style images.
4. The method of claim 1, further comprising: receiving one or more user-provided text guidance phrases, wherein the performing image translation also adjusts visual appearance of infrastructure in the synthetic 2D render and generates realistic context based on the one or more user-provided text guidance phrases.
5. The method of claim 1, further comprising: receiving, by the visualization application, selection of one or more mask images, wherein the performing image translation also adjusts visual appearance of infrastructure in the synthetic 2D render and generates realistic context based on the one or more mask images.
6. The method of claim 1, wherein the determining the viewpoint further comprises: receiving a selection of the viewpoint from a user.
7. The method of claim 1, wherein the determining the viewpoint further comprises: generating the viewpoint using a random viewpoint-selection algorithm or a rule-based viewpoint-selection algorithm, and wherein the method further comprises: repeating at least the generating the viewpoint, the generating the synthetic 2D render, and the performing image translation until a stopping condition is met.
8. The method of claim 1, wherein the image translation is performed by a trained reverse diffusion machine learning (ML) model and a trained denoising ML model.
9. The method of claim 8, wherein the performing image translation further comprises: applying noise to each pixel of the synthetic 2D render to produce an initial noisy image; applying the initial noisy image and the set of style images to the trained reverse diffusion ML model to produce a less noisy image; and applying the less noisy image to the trained denoising ML model to produce the realistic 2D render.
10. The method of claim 1, further comprising: generating a location map that relates pixels in the synthetic 2D render to elements from which they were generated in the 3D infrastructure model; determining realistic materials and/or textures for one or more elements of the 3D infrastructure model visible from the viewpoint based on the realistic 2D render and the location map; and updating the 3D infrastructure model to add missing or replace initial materials and/or textures of the one or more elements with the realistic materials and/or textures.
11. A non-transitory computing device readable medium having instructions stored thereon, the instructions when executed by one or more computing devices operable to: receive a request to generate a realistic two-dimensional (2D) render of a three-dimensional (3D) infrastructure model; determine a viewpoint in 3D space of the 3D infrastructure model; generate a synthetic 2D render of the 3D infrastructure model from the viewpoint; obtain a set of style images that show terrain, vegetation, and/or structures; perform image translation to adjust visual appearance of infrastructure in the synthetic 2D render based on visual appearance of the set of style images and to generate realistic context based on what appears in the set of style images to produce the realistic 2D render; and output the realistic 2D render.
12. The non-transitory electronic-device readable medium of claim 11, wherein the instructions to obtain the set of style images comprise instructions that when executed are operable to: retrieve the set of style images from an online image database that includes images over a plurality of geographic regions based on a geographic position for the 3D infrastructure model.
13. The non-transitory electronic-device readable medium of claim 11, wherein the instructions to obtain the set of style images comprise instructions that when executed are operable to: receive a selection of user-provided style images.
14. The non-transitory electronic-device readable medium of claim 11, wherein the instructions when executed are further operable to: receive one or more user-provided text guidance phrases, wherein the instructions operable to perform image translation are operable to adjust visual appearance of infrastructure in the synthetic 2D render and generate realistic context based on the one or more user-provided text guidance phrases.
15. The non-transitory electronic-device readable medium of claim 11, wherein the instructions to determine the viewpoint comprise instructions that when executed are operable to: receive a selection of the viewpoint from a user.
16. The non-transitory electronic-device readable medium of claim 11, wherein the instructions to determine the viewpoint comprise instructions that when executed are operable to: generate the viewpoint using a random viewpoint-selection algorithm or a rule-based viewpoint-selection algorithm.
17. The non-transitory electronic-device readable medium of claim 11, wherein the instructions to perform image translation comprise instructions that when executed are operable to: apply noise to each pixel of the synthetic 2D render to produce an initial noisy image; apply the initial noisy image and the set of style images to a trained reverse diffusion machine learning (ML) model to produce a less noisy image; and apply the less noisy image to a trained denoising ML model to produce the realistic 2D render.
18. A method for assigning materials and/or textures to elements of a three-dimensional (3D) infrastructure model, comprising: receiving, by an application executing on one or more computing devices, a request to determine realistic materials and/or textures of the 3D infrastructure model; determining, by the application, a viewpoint in 3D space of the 3D infrastructure model; generating, by the application, a synthetic 2D render of the 3D infrastructure model from the viewpoint; generating, by the application, a location map that relates pixels in the synthetic 2D render to elements fro
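Claims 10 and 18 rely on a location map that relates pixels in the synthetic 2D render to the model elements that produced them. The sketch below is one possible reading, not the patented method: it assumes the location map is an integer array with one element ID per pixel (0 meaning background), and it reduces "determining realistic materials and/or textures" to averaging the realistic render's color over each element's pixels. The function name `mean_color_per_element` and the ID convention are hypothetical.

```python
import numpy as np


def mean_color_per_element(realistic_render, location_map, background_id=0):
    """Average the realistic render's color over the pixels of each element.

    realistic_render: float array of shape (H, W, 3).
    location_map:     int array of shape (H, W); each entry is the ID of the
                      3D model element that produced that pixel
                      (assumption: `background_id` marks pixels with no element).
    Returns a dict mapping element ID -> mean RGB color as a length-3 array.
    """
    if realistic_render.shape[:2] != location_map.shape:
        raise ValueError("render and location map must share height and width")

    colors = {}
    for element_id in np.unique(location_map):
        if element_id == background_id:
            continue
        mask = location_map == element_id          # pixels belonging to this element
        colors[int(element_id)] = realistic_render[mask].mean(axis=0)
    return colors


if __name__ == "__main__":
    render = np.array(
        [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
         [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]]
    )
    location_map = np.array([[1, 1],
                             [2, 0]])
    for element_id, color in mean_color_per_element(render, location_map).items():
        print(element_id, color)
```

Because the map is generated from the synthetic render and the image translation keeps the same pixel grid, the same indices can be used to read colors from the realistic render. A fuller implementation would extract texture patches rather than a single color per element.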
using feature-based methods · CPC title
Earth observation · CPC title
Interactive image processing based on input by user · CPC title
Training; Learning · CPC title
Denoising; Smoothing · CPC title