Semantic image synthesis for generating substantially photorealistic images using neural networks

US2020242774A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2020242774-A1
Application number	US-201916721852-A
Country	US
Kind code	A1
Filing date	Dec 19, 2019
Priority date	Jan 25, 2019
Publication date	Jul 30, 2020
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A user can create a basic semantic layout that includes two or more regions identified by the user, each region being associated with a semantic label indicating a type of object(s) to be rendered in that region. The semantic layout can be provided as input to an image synthesis network. The network can be a trained machine learning network, such as a generative adversarial network (GAN), that includes a conditional, spatially-adaptive normalization layer for propagating semantic information from the semantic layout to other layers of the network. The synthesis can involve both normalization and de-normalization, where each region of the layout can utilize different normalization parameter values. An image is inferred from the network, and rendered for display to the user. The user can change labels or regions in order to cause a new or updated image to be generated.
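The abstract's idea that "each region of the layout can utilize different normalization parameter values" can be illustrated with a small NumPy sketch. This is a sketch under stated assumptions, not the patented implementation: the function name `spatially_adaptive_norm`, the per-label lookup tables `gamma_table` and `beta_table`, and the use of per-channel statistics over a single feature map are all hypothetical choices made for illustration. The page does not state how the network derives its modulation parameters from the layout.

```python
import numpy as np

def spatially_adaptive_norm(x, layout, gamma_table, beta_table, eps=1e-5):
    """Normalize activations, then de-normalize with per-pixel parameters.

    x           : (C, H, W) activations of one feature map
    layout      : (H, W) integer semantic label per pixel
    gamma_table : (num_labels, C) scale per label and channel (assumed lookup)
    beta_table  : (num_labels, C) shift per label and channel (assumed lookup)
    """
    # Normalization step: zero mean, unit variance per channel.
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)

    # De-normalization step: look up scale and shift from each pixel's label,
    # so different regions of the layout get different parameter values.
    gamma = gamma_table[layout].transpose(2, 0, 1)  # (H, W, C) -> (C, H, W)
    beta = beta_table[layout].transpose(2, 0, 1)
    return gamma * x_hat + beta

# Toy example: left half of the layout is label 0, right half is label 1.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4, 4))
layout = np.zeros((4, 4), dtype=int)
layout[:, 2:] = 1
gamma_table = np.array([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
beta_table = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]])
out = spatially_adaptive_norm(x, layout, gamma_table, beta_table)
```

In this toy setup the label-0 region passes through with plain normalization, while the label-1 region is scaled by 2 and shifted by 0.5, which is how the semantic label ends up influencing the activations at each location.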

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-readable medium having stored thereon a set of instructions which, if performed by one or more processors, cause the one or more processors to at least: receive one or more semantic inputs; and generate one or more substantially photorealistic images based, at least in part, on the one more semantic inputs using one or more neural networks.
2. The computer-readable medium of claim 1, wherein the one or more semantic inputs include at least one region boundary with a semantic label indicating a type of image content to be generated within the at least one region boundary.
3. The computer-readable medium of claim 2, wherein the instructions if performed further cause the one or more processors to: generate a semantic layout including the at least one region boundary, wherein the semantic label is modifiable to cause a different type of content to be generated within the region boundary.
4. The computer-readable medium of claim 3, wherein the instructions if performed further cause the one or more processors to: generate the type of image content within the region boundary using at least one generative adversarial network (GAN) including a generator and a discriminator.
5. The computer-readable medium of claim 4, wherein the GAN has at least one spatially-adaptive normalization layer configured to propagate semantic information throughout other layers of the one or more neural networks.
6. The computer-readable medium of claim 5, wherein the instructions if performed further cause the one or more processors to: modulate, by the at least one spatially-adaptive normalization layer, a set of activations through a spatially-adaptive transformation in order to propagate the semantic information throughout the other layers of the one or more neural networks.
7. A system comprising: one or more processors to receive one or more semantic inputs and to generate one or more substantially photorealistic images based, at least in part, on the one or more semantic inputs using one or more neural networks.
8. The system of claim 7, wherein the one or more semantic inputs include at least one region boundary with a semantic label indicating a type of image content to be generated within the region boundary.
9. The system of claim 8, wherein the one or more processors are further to generate a semantic layout including the at least one region boundary, wherein the semantic label is modifiable to cause a different type of content to be generated within the region boundary.
10. The system of claim 9, wherein the one or more processors are further to generate the type of image content within the region boundary using at least one generative adversarial network (GAN) including a generator and a discriminator.
11. The system of claim 10, wherein the GAN has at least one spatially-adaptive normalization layer configured to propagate semantic information throughout other layers of the one or more neural networks.
12. The system of claim 11, wherein the one or more processors are further to modulate, by the spatially-adaptive normalization layer, a set of activations through a spatially-adaptive transformation in order to propagate the semantic information throughout the other layers of the one or more neural networks.
13. A machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least: receive one or more drawing inputs; and generate one or more substantially photorealistic images based, at least in part, on the one more drawing inputs using one or more neural networks.
14. The machine-readable medium of claim 13, wherein the one or more drawing inputs include at least one region boundary with a semantic label indicating a type of image content to be generated within the region boundary.
15. The machine-readable medium of claim 14, wherein the instructions if performed further cause the one or more processors to: generate a semantic layout including the at least one region boundary, wherein the semantic label is modifiable to cause a different type of content to be generated within the region boundary.
16. The machine-readable medium of claim 15, wherein the instructions if performed further cause the one or more processors to: generate the type of image content within the region boundary using at least one generative adversarial network (GAN) including a generator and a discriminator.
17. The machine-readable medium of claim 16, wherein the GAN has at least one spatially-adaptive normalization layer configured to propagate semantic information throughout other layers of the one or more neural networks.
18. The machine-readable medium of claim 17, wherein the instructions if performed further cause the one or more processors to: modulate, by the spatially-adaptive normalization layer, a set of activations through a spatially-adaptive transformation in order to propagate the semantic information throughout the other layers of the one or more neural networks.
19. A system comprising: one or more processors to receive one or more drawing inputs and to generate one or more substantially photorealistic images based, at least in part, on the one or more drawing inputs using one or more neural networks.
20. The system of claim 19, wherein the one or more drawing inputs include at least one region boundary with a semantic label indicating a type of image content to be generated within the region boundary.
21. The system of claim 20, wherein the one or more processors are further to generate a semantic layout including the at least one region boundary, wherein the semantic label is modifiable to cause a different type of content to be generated within the region boundary.
22. The system of claim 21, wherein the one or more processors are further to generate the type of image content within the region boundary using at least one generative adversarial network (GAN) including a generator and a discriminator.
23. The system of claim 22, wherein the GAN has at least one spatially-adaptive normalization layer configured to propagate semantic information throughout other layers of the one or more neural networks.
24. The system of claim 23, wherein the one or more processors are further to modulate, by the spatially-adaptive normalization layer, a set of activations through a spatially-adaptive transformation in order to propagate the semantic information throughout the other layers of the one or more neural networks.
25. A machine-readable medium having stored thereon a set of instructions, which performed by one or more processors, cause the one or more processors to at least: receive one or more image inputs; and generate one or more substantially photorealistic images based, at least in part, on the one or more image inputs using one or more neural networks.
26. The machine-readable medium of claim 25, wherein the one or more image inputs define at least one region boundary with a semantic label indicating a type of image content to be generated within the region boundary.
27. The machine-readable medium of claim 26, wherein the instructions if performed further cause the one or more processors to: generate a semantic layout including the at least one region boundary, w
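
Several of the claims above describe a semantic layout whose region labels are modifiable so that different content is generated within the same region boundary. A minimal sketch of one way such a layout could be represented follows. The label names in `LABELS`, the helper names `one_hot_layout` and `relabel_region`, and the one-hot encoding itself are assumptions made for illustration; the claim text shown here does not specify a data format.

```python
import numpy as np

# Hypothetical label set; the patent page does not list specific labels.
LABELS = ["sky", "grass", "water"]

def one_hot_layout(layout, num_labels):
    """Encode an (H, W) integer label map as a (num_labels, H, W) one-hot tensor."""
    return np.eye(num_labels, dtype=np.float32)[layout].transpose(2, 0, 1)

def relabel_region(layout, mask, new_label):
    """Return a copy of the layout with the masked region assigned a new label."""
    updated = layout.copy()
    updated[mask] = new_label
    return updated

# Top half "sky", bottom half "grass".
layout = np.zeros((4, 4), dtype=int)
layout[2:, :] = LABELS.index("grass")

# Change the bottom region from "grass" to "water" without moving its boundary.
mask = layout == LABELS.index("grass")
updated = relabel_region(layout, mask, LABELS.index("water"))
encoded = one_hot_layout(updated, len(LABELS))
```

In a setup like this, the region boundary (the mask) stays fixed while the label changes, and the re-encoded layout would be what gets passed to the synthesis network to produce an updated image.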

Assignees

Nvidia Corp

Inventors

Classifications

G06N3/045
Combinations of networks · CPC title
G06T11/10 (Primary)
Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/0455
Auto-encoder networks; Encoder-decoder networks · CPC title
G06N3/0985
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title

Patent family

Related publications grouped by family.

View patent family 68944239

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020242774A1 cover?: A user can create a basic semantic layout that includes two or more regions identified by the user, each region being associated with a semantic label indicating a type of object(s) to be rendered in that region. The semantic layout can be provided as input to an image synthesis network. The network can be a trained machine learning network, such as a generative adversarial network (GAN), that …
Who is the assignee on this patent?: Nvidia Corp
What technology area does this patent fall under?: Primary CPC classification G06T11/10. Mapped technology areas include Physics.
When was this patent published?: Publication date Jul 30, 2020 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).