Synthetic depth image generation from CAD data using generative adversarial neural networks for enhancement

US10901740B2 · US · B2

Patent metadata
Publication number: US-10901740-B2
Application number: US-201816636674-A
Country: US
Kind code: B2
Filing date: Aug 7, 2018
Priority date: Aug 8, 2017
Publication date: Jan 26, 2021
Grant date: Jan 26, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method for generating realistic depth images by enhancing simulated images rendered from a 3D model, include a rendering engine configured to render noiseless 2.5D images by rendering various poses with respect to a target 3D CAD model, a noise transfer engine configured to apply realistic noise to the noiseless 2.5D images, and a background transfer engine configured to add pseudo-realistic scene-dependent backgrounds to the noiseless 2.5D images. The noise transfer engine is configured to learn noise transfer based on a mapping, by a first generative adversarial network (GAN), of the noiseless 2.5D images to real 2.5D scans generated by a targeted sensor. The background transfer engine is configured to learn background generation based on a processing, by a second GAN, of output data of the first GAN as input data and corresponding real 2.5D scans as target data.
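The two-stage arrangement described in the abstract can be pictured as a simple function composition. The sketch below is illustrative only and is not taken from the patent: `add_noise` and `add_background` are hypothetical stand-ins for the trained generators of the first and second GAN (replaced here by trivial NumPy operations rather than learned networks), and the convention that a depth value of 0 marks empty background is an assumption.

```python
import numpy as np

def add_noise(clean, rng):
    """Hypothetical stand-in for the first GAN's generator: perturb foreground depth."""
    noisy = clean + rng.normal(0.0, 0.01, size=clean.shape)
    return np.where(clean > 0, noisy, 0.0)  # empty background stays empty

def add_background(noisy, rng):
    """Hypothetical stand-in for the second GAN's generator: fill empty pixels."""
    background = rng.uniform(1.5, 2.0, size=noisy.shape)
    return np.where(noisy > 0, noisy, background)

def pipeline(clean, rng):
    """Serial coupling: the output of the first stage is the input of the second."""
    return add_background(add_noise(clean, rng), rng)

rng = np.random.default_rng(0)
clean = np.zeros((4, 4))
clean[1:3, 1:3] = 1.0  # toy "rendered" object at depth 1.0 (assumed units)
realistic = pipeline(clean, rng)
```

In a real system both stand-ins would be replaced by trained networks; the only point made here is the ordering, with noise applied first and the background filled in second.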

First claim

Opening claim text (preview).

What is claimed is:

1. A system for generating realistic depth images by enhancing simulated images rendered from a 3D model, comprising: at least one storage device storing computer-executable instructions configured as one or more modules; and at least one processor configured to access the at least one storage device and execute the instructions, wherein the modules comprise: a rendering engine configured to render noiseless 2.5D images by rendering various poses with respect to a target 3D CAD model; a noise transfer engine configured to apply realistic noise to the noiseless 2.5D images; and a background transfer engine configured to add pseudo-realistic scene-dependent backgrounds to the noiseless 2.5D images; wherein the noise transfer engine is configured to learn noise transfer based on a mapping, by a first generative adversarial network (GAN), of the noiseless 2.5D images to a real 2.5D scan generated by a targeted sensor; wherein the background transfer engine is configured to learn background generation based on a processing, by a second GAN, of output data of the first GAN as input data and the corresponding real 2.5D scan as target data; and wherein following training of the first and second GANs, a pipeline is formed by a serial coupling of the first and second GANs for runtime operation to process rendered noiseless 2.5D images generated by the rendering engine to produce realistic depth images.

2. The system of claim 1, wherein learning by the second GAN is refined by processing a dataset of real images from a new target domain to generate backgrounds corresponding with the new target domain.

3. The system of claim 1, wherein the first GAN and the second GAN comprise: a discriminator network configured as a deep convolutional network with Leaky rectified linear units and an output defined according to a sigmoid activation function, wherein each activation represents a deduction for a patch of input data.

4. The system of claim 3, wherein the discriminator network of first GAN and the second GAN further comprise: a loss function that executes a binary cross entropy evaluation.

5. The system of claim 4, wherein the second GAN comprises a loss function configured to compare a generated image with a target image, the loss function edited to heavily penalize any change to foreground by using input data as a binary mask and a Hadamard product.

6. The system of claim 1, wherein the learning by the noise transfer engine includes cropping backgrounds from images of the real depth scans using the noiseless 2.5D images as masks.

7. The system of claim 1, wherein the modules further comprise: a depth sensor simulation engine configured to generate a simulated 2.5D scan for each generated noiseless 2.5D image, wherein the learning by the noise transfer engine includes stacking the simulated 2.5D scan and the noiseless 2.5D image into a 2-channel depth image as an input for the noise transfer engine.

8. The system of claim 1, further comprising: an object recognition network configured to process the realistic depth images as training data to learn object classifications for target objects; wherein following training of the object recognition network, sensor scans may be processed by the trained object recognition network to correlate features of the sensor scan object features for identification of objects in the sensor scan.

9. A method for generating realistic depth images by enhancing simulated images rendered from a 3D model, comprising: rendering noiseless 2.5D images by rendering various poses with respect to a target 3D CAD model; applying, by a noise transfer engine, realistic noise to the noiseless 2.5D images; and adding, by a background transfer engine, pseudo-realistic scene-dependent backgrounds to the noiseless 2.5D images; wherein learning by the noise transfer engine includes a training process based on a mapping, by a first generative adversarial network (GAN), of the noiseless 2.5D images to a real 2.5D scan generated by a targeted sensor; wherein learning by the background transfer engine includes a training process based on a processing, by a second GAN, of output data of the first GAN as input data and the corresponding real 2.5D scan as target data; and wherein following training of the first and second GANs, forming a pipeline by a serial coupling of the first and second GANs for runtime operation to process rendered noiseless 2.5D images generated by the rendering engine to produce realistic depth images.

10. The method of claim 9, wherein learning by the second GAN is refined by processing a dataset of real images from a new target domain to generate backgrounds corresponding with the new target domain.

11. The method of claim 9, wherein the first GAN and the second GAN comprise: a discriminator network configured as a deep convolutional network with Leaky rectified linear units and an output defined according to a sigmoid activation function, wherein each activation represents a deduction for a patch of input data.

12. The method of claim 11, wherein the discriminator network of first GAN and the second GAN further comprise: a loss function that executes a binary cross entropy evaluation.

13. The method of claim 12, wherein the second GAN comprises a loss function configured to compare a generated image with a target image, the loss function edited to heavily penalize any change to foreground by using input data as a binary mask and a Hadamard product.

14. The method of claim 9, wherein the learning by the noise transfer engine includes cropping backgrounds from images of the real depth scans using the noiseless 2.5D images as masks.

15. The method of claim 9, further comprising: generating, by a depth sensor simulation engine, a simulated 2.5D scan for each generated noiseless 2.5D image, wherein the learning by the noise transfer engine includes stacking the simulated 2.5D scan and the noiseless 2.5D image into a 2-channel depth image as an input for the noise transfer engine.
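The data-preparation and loss steps named in the dependent claims (mask-based cropping, 2-channel stacking, and a foreground-preserving loss built from a binary mask and a Hadamard product) might look roughly like the NumPy sketch below. This is one possible reading rather than the patented implementation: the function names, the use of 0 to mark empty pixels, the L1 comparison, and the `weight` value are all assumptions introduced for illustration.

```python
import numpy as np

def foreground_mask(depth):
    """Binary mask: 1 where an object is present, 0 elsewhere (assumes 0 = empty)."""
    return (depth > 0).astype(depth.dtype)

def crop_background(real_scan, noiseless):
    """Keep only the object region of a real scan, using the render as the mask."""
    return real_scan * foreground_mask(noiseless)

def stack_input(simulated_scan, noiseless):
    """Stack two depth maps into one 2-channel image (channels-first layout assumed)."""
    return np.stack([simulated_scan, noiseless], axis=0)

def foreground_preserving_loss(generated, target, gan_input, weight=10.0):
    """Image comparison plus a heavy penalty on changes to foreground pixels."""
    mask = foreground_mask(gan_input)
    base = np.abs(generated - target).mean()  # assumed L1 comparison with the target
    # Hadamard (element-wise) product restricts the penalty to the foreground.
    penalty = np.abs(mask * (generated - gan_input)).mean()
    return base + weight * penalty

noiseless = np.zeros((4, 4))
noiseless[1:3, 1:3] = 1.0                      # toy rendered object
real = np.full((4, 4), 2.0)
real[1:3, 1:3] = 1.02                          # toy real scan with background
cropped = crop_background(real, noiseless)     # background removed
stacked = stack_input(cropped, noiseless)      # shape (2, 4, 4)

loss_same = foreground_preserving_loss(real, real, cropped)
altered = real.copy()
altered[1, 1] = 1.5                            # tamper with one foreground pixel
loss_altered = foreground_preserving_loss(altered, real, cropped)
```

With these toy values, a generated image that leaves the foreground untouched incurs no penalty, while changing a single foreground pixel raises the loss through both terms.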

Assignees

Inventors

Classifications

  • Adversarial learning · CPC title

  • Transfer learning · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Generative networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10901740B2 cover?
A system and method for generating realistic depth images by enhancing simulated images rendered from a 3D model, include a rendering engine configured to render noiseless 2.5D images by rendering various poses with respect to a target 3D CAD model, a noise transfer engine configured to apply realistic noise to the noiseless 2.5D images, and a background transfer engine configured to add pseudo…
Who is the assignee on this patent?
Siemens AG
What technology area does this patent fall under?
Primary CPC classification G06F9/328. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 26, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).