What technology area does this patent fall under?

Primary CPC classification G06T3/02. Mapped technology areas include Physics.

When was this patent published?

Publication date Nov 4, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Context-aware synthesis and placement of object instances

US12462453B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12462453-B2
Application number	US-202217585449-A
Country	US
Kind code	B2
Filing date	Jan 26, 2022
Priority date	Sep 4, 2018
Publication date	Nov 4, 2025
Grant date	Nov 4, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One embodiment of a method includes applying a first generator model to a semantic representation of an image to generate an affine transformation, where the affine transformation represents a bounding box associated with at least one region within the image. The method further includes applying a second generator model to the affine transformation and the semantic representation to generate a shape of an object. The method further includes inserting the object into the image based on the bounding box and the shape.
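The abstract says the affine transformation "represents a bounding box". One common way to read this, offered here only as an illustrative sketch and not as the patent's implementation, is that the transformation maps a unit square onto the target region of the image. The function names and the parameterization (scale, rotation, translation) below are assumptions made for the example.

```python
import math

def affine_matrix(sx, sy, theta, tx, ty):
    """Build a 2x3 affine matrix: scale by (sx, sy), rotate by theta, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        [c * sx, -s * sy, tx],
        [s * sx,  c * sy, ty],
    ]

def apply_affine(m, point):
    """Apply the 2x3 affine matrix m to an (x, y) point."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

def bounding_box(m):
    """Map the unit square's corners through m and return the
    axis-aligned box (x_min, y_min, x_max, y_max) that encloses them."""
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    xs, ys = zip(*(apply_affine(m, c) for c in corners))
    return (min(xs), min(ys), max(xs), max(ys))

# Hypothetical values: a 40x80 pixel box whose top-left corner is at (100, 50).
m = affine_matrix(sx=40.0, sy=80.0, theta=0.0, tx=100.0, ty=50.0)
box = bounding_box(m)
print(box)  # (100.0, 50.0, 140.0, 130.0)
```

With a nonzero rotation the mapped square is no longer axis-aligned, so `bounding_box` returns the enclosing axis-aligned box rather than the rotated region itself.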

First claim

Opening claim text (preview).

What is claimed is:
1. One or more processors, comprising: circuitry to use one or more neural networks to: identify a set of locations in an image into which to insert an object based, at least in part, on a segmentation map and a location of one or more other objects within the image; identify a shape of the object based, at least in part, on the set of locations; and insert the object into at least one of the set of locations based, at least in part, on the identified shape.
2. The one or more processors of claim 1, wherein the circuitry is further to: use the one or more neural networks to generate a transformation corresponding to a bounding box associated with at least one of the set of locations; and insert the object into the image based, at least in part, on the transformation.
3. The one or more processors of claim 2, wherein the circuitry is further to apply the transformation to the identified shape.
4. The one or more processors of claim 1, wherein the segmentation map comprises a first representation of the image classifying one or more pixels in the image.
5. The one or more processors of claim 1, wherein the circuitry is further to: calculate a region of pixels in the image based on the identified shape.
6. The one or more processors of claim 1, wherein the one or more neural networks include one or more generator models.
7. A system, comprising: one or more computers having one or more processors to use one or more neural networks to: identify a set of locations in an image into which to insert an object based, at least in part, on a segmentation map and a location of one or more other objects within the image; identify a shape of the object based, at least in part, on the set of locations; and insert the object into at least one of the set of locations based, at least in part, on the identified shape.
8. The system of claim 7, wherein the one or more processors are further to: obtain the segmentation map comprising a representation of the image; process the representation using a first neural network of the one or more neural networks to generate a transformation corresponding to the at least one of the set of locations; and process the transformation and the representation using a second neural network of the one or more neural networks to insert the object.
9. The system of claim 8, wherein the representation indicates classifications of pixels of the image.
10. The system of claim 7, wherein inserting the object is further based, at least in part, on a size of the object.
11. The system of claim 7, wherein the one or more processors are further to identify the at least one of the set of locations through one or more affine transformations, and wherein the one or more affine transformations include at least one of: a translation, scaling, and rotation.
12. The system of claim 7, wherein the one or more neural networks are to insert the object based, at least in part, on one or more semantic contexts of the one or more other objects.
13. A machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to use one or more neural networks to: identify a set of locations in an image into which to insert an object based, at least in part, on a segmentation map and a location of one or more other objects within the image; identify a shape of the object based, at least in part, on the set of locations; and insert the object into at least one of the set of locations based, at least in part, on the identified shape.
14. The machine-readable medium of claim 13, wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to: use the one or more neural networks to process the segmentation map to generate a transformation corresponding to the at least one of the set of locations; and process the transformation using the one or more neural networks to identify the shape of the object.
15. The machine-readable medium of claim 14, wherein the segmentation map comprises labels of pixels of the image.
16. The machine-readable medium of claim 13, wherein identifying the set of locations is further based, at least in part, on a random vector.
17. The machine-readable medium of claim 13, wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to insert the object by at least updating a representation of the image to indicate a label for the object.
18. The machine-readable medium of claim 13, wherein the one or more neural networks include a variational autoencoder (VAE).
19. One or more processors, comprising: circuitry to train one or more neural networks to; identify a set of locations in an image into which to insert an object based, at least in part, on a segmentation map and a location of one or more other objects within the image; identify a shape of the object based, at least in part, on the set of locations; and insert the object into at least one of the set of locations based, at least in part, on the identified shape.
20. The one or more processors of claim 19, wherein the circuitry is further to: cause the one or more neural networks to predict a transformation representing a bounding box corresponding to the at least one of the set of locations; and train the one or more neural networks based, at least in part, on the predicted transformation.
21. The one or more processors of claim 20, wherein the circuitry is further to update parameters of the one or more neural networks using one or more supervised loss functions based on the predicted transformation and ground truth data.
22. The one or more processors of claim 19, wherein the one or more neural networks include one or more discriminator models.
23. The one or more processors of claim 19, wherein the circuitry is further to train the one or more neural networks using one or more adversarial loss functions.
24. The one or more processors of claim 19, wherein the image is a training image for an autonomous driving system.
25. A method, comprising: training one or more neural networks to; identify a set of locations in an image into which to insert an object based, at least in part, on a segmentation map and a location of one or more other objects within the image; identify a shape of the object based, at least in part, on the set of locations; and insert the object into at least one of the set of locations based, at least in part, on the identified shape.
26. The method of claim 25, further comprising: processing the identified shape using one or more discriminators; and training the one or more neural networks based, at least in part, on outputs from the one or more discriminators.
27. The method of claim 26, further comprising processing the identified shape using the one or more discriminators by at least causing the one or more discriminators to classify the identified shape as real or fake.
28. The method of claim 25, further comprising training the one or more neural networks using one or more unsupervised loss functions.
29. The method of claim 25, wher

Assignees

Nvidia Corp

Inventors

Classifications

G06T3/02 · Primary
Affine transformations (for image registration G06T3/147; for image mosaicing G06T3/4038) · CPC title
G06F18/217
Validation; Performance evaluation; Active pattern learning techniques · CPC title
G06F18/24
Classification techniques · CPC title
G06T2210/12
Bounding box · CPC title
G06T2207/20084
Artificial neural networks [ANN] · CPC title

Patent family

Related publications grouped by family.

View patent family 69526529

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12462453B2 cover?: One embodiment of a method includes applying a first generator model to a semantic representation of an image to generate an affine transformation, where the affine transformation represents a bounding box associated with at least one region within the image. The method further includes applying a second generator model to the affine transformation and the semantic representation to generate a …
Who is the assignee on this patent?: Nvidia Corp