Context-aware synthesis and placement of object instances

US12462453B2 · US · B2

Patent metadata
Publication number: US-12462453-B2
Application number: US-202217585449-A
Country: US
Kind code: B2
Filing date: Jan 26, 2022
Priority date: Sep 4, 2018
Publication date: Nov 4, 2025
Grant date: Nov 4, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One embodiment of a method includes applying a first generator model to a semantic representation of an image to generate an affine transformation, where the affine transformation represents a bounding box associated with at least one region within the image. The method further includes applying a second generator model to the affine transformation and the semantic representation to generate a shape of an object. The method further includes inserting the object into the image based on the bounding box and the shape.
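The last two steps of the abstract (a bounding box expressed as an affine transformation, then insertion based on the box and the shape) can be illustrated with a minimal sketch. It assumes the affine transformation is a 2x3 matrix that maps the unit square onto the bounding box, that the semantic representation is a grid of integer class labels, and that the shape is a binary mask. In the patent both the transformation and the shape come from learned generator models; here they are hand-written stand-ins, and the names `apply_affine`, `bounding_box`, and `insert_shape` are hypothetical.

```python
from typing import List, Tuple

# Assumed layout: ((a, b, tx), (c, d, ty)), i.e. a 2x3 affine matrix.
Affine = Tuple[Tuple[float, float, float], Tuple[float, float, float]]


def apply_affine(theta: Affine, x: float, y: float) -> Tuple[float, float]:
    """Map a point (x, y) through the affine transformation."""
    (a, b, tx), (c, d, ty) = theta
    return (a * x + b * y + tx, c * x + d * y + ty)


def bounding_box(theta: Affine) -> Tuple[float, float, float, float]:
    """Axis-aligned box (x0, y0, x1, y1) spanned by the transformed unit square."""
    corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    points = [apply_affine(theta, x, y) for x, y in corners]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def insert_shape(label_map: List[List[int]], shape_mask: List[List[int]],
                 theta: Affine, label: int) -> List[List[int]]:
    """Return a copy of label_map with the mask pasted into the box as `label`."""
    x0, y0, x1, y1 = bounding_box(theta)
    out = [row[:] for row in label_map]
    mask_h, mask_w = len(shape_mask), len(shape_mask[0])
    for row in range(int(round(y0)), int(round(y1))):
        for col in range(int(round(x0)), int(round(x1))):
            if not (0 <= row < len(out) and 0 <= col < len(out[0])):
                continue  # ignore any part of the box outside the image
            # Nearest-neighbour lookup of the mask at the pixel centre.
            u = (col + 0.5 - x0) / (x1 - x0)
            v = (row + 0.5 - y0) / (y1 - y0)
            mask_row = min(int(v * mask_h), mask_h - 1)
            mask_col = min(int(u * mask_w), mask_w - 1)
            if shape_mask[mask_row][mask_col]:
                out[row][col] = label
    return out


# Stand-in for the first generator's output: scale by (2, 3), shift by (1, 1).
theta: Affine = ((2.0, 0.0, 1.0), (0.0, 3.0, 1.0))
# Stand-in for the second generator's output: a 3x2 binary shape mask.
mask = [[0, 1], [1, 1], [1, 1]]
# A 5x5 label map where every pixel has class 0.
scene = [[0] * 5 for _ in range(5)]
result = insert_shape(scene, mask, theta, label=7)
for line in result:
    print(line)
```

The sketch restricts itself to scaling and translation so the box stays axis-aligned; a transformation with rotation would need the mask to be resampled through the inverse mapping instead of the simple box lookup used here.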

First claim

Opening claim text (preview).

What is claimed is:

1. One or more processors, comprising: circuitry to use one or more neural networks to: identify a set of locations in an image into which to insert an object based, at least in part, on a segmentation map and a location of one or more other objects within the image; identify a shape of the object based, at least in part, on the set of locations; and insert the object into at least one of the set of locations based, at least in part, on the identified shape.

2. The one or more processors of claim 1, wherein the circuitry is further to: use the one or more neural networks to generate a transformation corresponding to a bounding box associated with at least one of the set of locations; and insert the object into the image based, at least in part, on the transformation.

3. The one or more processors of claim 2, wherein the circuitry is further to apply the transformation to the identified shape.

4. The one or more processors of claim 1, wherein the segmentation map comprises a first representation of the image classifying one or more pixels in the image.

5. The one or more processors of claim 1, wherein the circuitry is further to: calculate a region of pixels in the image based on the identified shape.

6. The one or more processors of claim 1, wherein the one or more neural networks include one or more generator models.

7. A system, comprising: one or more computers having one or more processors to use one or more neural networks to: identify a set of locations in an image into which to insert an object based, at least in part, on a segmentation map and a location of one or more other objects within the image; identify a shape of the object based, at least in part, on the set of locations; and insert the object into at least one of the set of locations based, at least in part, on the identified shape.

8. The system of claim 7, wherein the one or more processors are further to: obtain the segmentation map comprising a representation of the image; process the representation using a first neural network of the one or more neural networks to generate a transformation corresponding to the at least one of the set of locations; and process the transformation and the representation using a second neural network of the one or more neural networks to insert the object.

9. The system of claim 8, wherein the representation indicates classifications of pixels of the image.

10. The system of claim 7, wherein inserting the object is further based, at least in part, on a size of the object.

11. The system of claim 7, wherein the one or more processors are further to identify the at least one of the set of locations through one or more affine transformations, and wherein the one or more affine transformations include at least one of: a translation, scaling, and rotation.

12. The system of claim 7, wherein the one or more neural networks are to insert the object based, at least in part, on one or more semantic contexts of the one or more other objects.

13. A machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to use one or more neural networks to: identify a set of locations in an image into which to insert an object based, at least in part, on a segmentation map and a location of one or more other objects within the image; identify a shape of the object based, at least in part, on the set of locations; and insert the object into at least one of the set of locations based, at least in part, on the identified shape.

14. The machine-readable medium of claim 13, wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to: use the one or more neural networks to process the segmentation map to generate a transformation corresponding to the at least one of the set of locations; and process the transformation using the one or more neural networks to identify the shape of the object.

15. The machine-readable medium of claim 14, wherein the segmentation map comprises labels of pixels of the image.

16. The machine-readable medium of claim 13, wherein identifying the set of locations is further based, at least in part, on a random vector.

17. The machine-readable medium of claim 13, wherein the set of instructions further include instructions, which if performed by the one or more processors, cause the one or more processors to insert the object by at least updating a representation of the image to indicate a label for the object.

18. The machine-readable medium of claim 13, wherein the one or more neural networks include a variational autoencoder (VAE).

19. One or more processors, comprising: circuitry to train one or more neural networks to; identify a set of locations in an image into which to insert an object based, at least in part, on a segmentation map and a location of one or more other objects within the image; identify a shape of the object based, at least in part, on the set of locations; and insert the object into at least one of the set of locations based, at least in part, on the identified shape.

20. The one or more processors of claim 19, wherein the circuitry is further to: cause the one or more neural networks to predict a transformation representing a bounding box corresponding to the at least one of the set of locations; and train the one or more neural networks based, at least in part, on the predicted transformation.

21. The one or more processors of claim 20, wherein the circuitry is further to update parameters of the one or more neural networks using one or more supervised loss functions based on the predicted transformation and ground truth data.

22. The one or more processors of claim 19, wherein the one or more neural networks include one or more discriminator models.

23. The one or more processors of claim 19, wherein the circuitry is further to train the one or more neural networks using one or more adversarial loss functions.

24. The one or more processors of claim 19, wherein the image is a training image for an autonomous driving system.

25. A method, comprising: training one or more neural networks to; identify a set of locations in an image into which to insert an object based, at least in part, on a segmentation map and a location of one or more other objects within the image; identify a shape of the object based, at least in part, on the set of locations; and insert the object into at least one of the set of locations based, at least in part, on the identified shape.

26. The method of claim 25, further comprising: processing the identified shape using one or more discriminators; and training the one or more neural networks based, at least in part, on outputs from the one or more discriminators.

27. The method of claim 26, further comprising processing the identified shape using the one or more discriminators by at least causing the one or more discriminators to classify the identified shape as real or fake.

28. The method of claim 25, further comprising training the one or more neural networks using one or more unsupervised loss functions.

29. The method of claim 25, wher
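Claims 22, 23, 26, and 27 refer to discriminators that classify a generated shape as real or fake and to adversarial loss functions, without fixing a formula in the text shown here. As a rough illustration only, the sketch below assumes the common binary cross-entropy formulation with a non-saturating generator loss; that choice, along with the names `bce`, `discriminator_loss`, and `generator_loss`, is an assumption and not taken from the patent. The discriminator scores are plain probabilities standing in for the output of a trained network.

```python
import math
from typing import Sequence


def bce(prob: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one probability, clamped to avoid log(0)."""
    p = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(real_scores: Sequence[float],
                       fake_scores: Sequence[float]) -> float:
    """Discriminator objective: score real shapes as 1 and generated shapes as 0."""
    real = sum(bce(p, 1.0) for p in real_scores) / len(real_scores)
    fake = sum(bce(p, 0.0) for p in fake_scores) / len(fake_scores)
    return real + fake


def generator_loss(fake_scores: Sequence[float]) -> float:
    """Generator objective (non-saturating form): make generated shapes score as 1."""
    return sum(bce(p, 1.0) for p in fake_scores) / len(fake_scores)


# Hypothetical discriminator outputs (probability that a shape is real).
real_scores = [0.9, 0.8]
fake_scores = [0.3, 0.1]
print("discriminator loss:", discriminator_loss(real_scores, fake_scores))
print("generator loss:", generator_loss(fake_scores))
```

The generator loss falls as the discriminator assigns higher "real" probabilities to generated shapes, which is the signal claim 26 describes as training based on outputs from the discriminators.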

Assignees

Nvidia Corp

Inventors

Classifications

  • G06T3/02 (Primary)

    Affine transformations (for image registration G06T3/147; for image mosaicing G06T3/4038) · CPC title

  • Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • Classification techniques · CPC title

  • Bounding box · CPC title

  • Artificial neural networks [ANN] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12462453B2 cover?
One embodiment of a method includes applying a first generator model to a semantic representation of an image to generate an affine transformation, where the affine transformation represents a bounding box associated with at least one region within the image. The method further includes applying a second generator model to the affine transformation and the semantic representation to generate a …
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06T3/02. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 4, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).