Learning an autoencoder
US-2020285907-A1 · Sep 10, 2020 · US
US10901740B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10901740-B2 |
| Application number | US-201816636674-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 7, 2018 |
| Priority date | Aug 8, 2017 |
| Publication date | Jan 26, 2021 |
| Grant date | Jan 26, 2021 |
Official abstract text for this publication.
A system and method for generating realistic depth images by enhancing simulated images rendered from a 3D model, include a rendering engine configured to render noiseless 2.5D images by rendering various poses with respect to a target 3D CAD model, a noise transfer engine configured to apply realistic noise to the noiseless 2.5D images, and a background transfer engine configured to add pseudo-realistic scene-dependent backgrounds to the noiseless 2.5D images. The noise transfer engine is configured to learn noise transfer based on a mapping, by a first generative adversarial network (GAN), of the noiseless 2.5D images to real 2.5D scans generated by a targeted sensor. The background transfer engine is configured to learn background generation based on a processing, by a second GAN, of output data of the first GAN as input data and corresponding real 2.5D scans as target data.
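The runtime arrangement described in the abstract (a rendered noiseless image passed through the first GAN, whose output is then passed through the second GAN) can be sketched as plain function composition. This is a minimal illustration, not the patented implementation: `make_pipeline`, `toy_noise_generator`, and `toy_background_generator` are hypothetical names, the two toy functions merely stand in for trained generator networks, and the sketch assumes background pixels of a noiseless render are encoded as zero depth.

```python
import numpy as np

def make_pipeline(noise_generator, background_generator):
    """Serially couple two generators: noise transfer first, then background transfer."""
    def pipeline(noiseless_depth):
        noisy = noise_generator(noiseless_depth)
        return background_generator(noisy)
    return pipeline

def toy_noise_generator(depth, seed=0):
    """Hypothetical stand-in for the first GAN's generator: perturb foreground depth."""
    rng = np.random.default_rng(seed)
    foreground = depth > 0  # assumption: background is encoded as zero depth
    noise = rng.normal(0.0, 0.01, size=depth.shape)
    return depth + foreground * noise

def toy_background_generator(depth):
    """Hypothetical stand-in for the second GAN's generator: fill empty pixels with a flat wall."""
    return np.where(depth > 0, depth, 2.0)

# A 4x4 "render" with a 2x2 object at depth 1.0 and an empty background.
rendered = np.zeros((4, 4))
rendered[1:3, 1:3] = 1.0

pipeline = make_pipeline(toy_noise_generator, toy_background_generator)
realistic = pipeline(rendered)
```

In this sketch the object pixels stay close to their rendered depth while every empty pixel receives a background value, which mirrors the order of operations the abstract describes.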
Opening claim text (preview).
What is claimed is:

1. A system for generating realistic depth images by enhancing simulated images rendered from a 3D model, comprising: at least one storage device storing computer-executable instructions configured as one or more modules; and at least one processor configured to access the at least one storage device and execute the instructions, wherein the modules comprise: a rendering engine configured to render noiseless 2.5D images by rendering various poses with respect to a target 3D CAD model; a noise transfer engine configured to apply realistic noise to the noiseless 2.5D images; and a background transfer engine configured to add pseudo-realistic scene-dependent backgrounds to the noiseless 2.5D images; wherein the noise transfer engine is configured to learn noise transfer based on a mapping, by a first generative adversarial network (GAN), of the noiseless 2.5D images to a real 2.5D scan generated by a targeted sensor; wherein the background transfer engine is configured to learn background generation based on a processing, by a second GAN, of output data of the first GAN as input data and the corresponding real 2.5D scan as target data; and wherein following training of the first and second GANs, a pipeline is formed by a serial coupling of the first and second GANs for runtime operation to process rendered noiseless 2.5D images generated by the rendering engine to produce realistic depth images.
2. The system of claim 1, wherein learning by the second GAN is refined by processing a dataset of real images from a new target domain to generate backgrounds corresponding with the new target domain.
3. The system of claim 1, wherein the first GAN and the second GAN comprise: a discriminator network configured as a deep convolutional network with Leaky rectified linear units and an output defined according to a sigmoid activation function, wherein each activation represents a deduction for a patch of input data.
4. The system of claim 3, wherein the discriminator network of first GAN and the second GAN further comprise: a loss function that executes a binary cross entropy evaluation.
5. The system of claim 4, wherein the second GAN comprises a loss function configured to compare a generated image with a target image, the loss function edited to heavily penalize any change to foreground by using input data as a binary mask and a Hadamard product.
6. The system of claim 1, wherein the learning by the noise transfer engine includes cropping backgrounds from images of the real depth scans using the noiseless 2.5D images as masks.
7. The system of claim 1, wherein the modules further comprise: a depth sensor simulation engine configured to generate a simulated 2.5D scan for each generated noiseless 2.5D image, wherein the learning by the noise transfer engine includes stacking the simulated 2.5D scan and the noiseless 2.5D image into a 2-channel depth image as an input for the noise transfer engine.
8. The system of claim 1, further comprising: an object recognition network configured to process the realistic depth images as training data to learn object classifications for target objects; wherein following training of the object recognition network, sensor scans may be processed by the trained object recognition network to correlate features of the sensor scan object features for identification of objects in the sensor scan.
9. A method for generating realistic depth images by enhancing simulated images rendered from a 3D model, comprising: rendering noiseless 2.5D images by rendering various poses with respect to a target 3D CAD model; applying, by a noise transfer engine, realistic noise to the noiseless 2.5D images; and adding, by a background transfer engine, pseudo-realistic scene-dependent backgrounds to the noiseless 2.5D images; wherein learning by the noise transfer engine includes a training process based on a mapping, by a first generative adversarial network (GAN), of the noiseless 2.5D images to a real 2.5D scan generated by a targeted sensor; wherein learning by the background transfer engine includes a training process based on a processing, by a second GAN, of output data of the first GAN as input data and the corresponding real 2.5D scan as target data; and wherein following training of the first and second GANs, forming a pipeline by a serial coupling of the first and second GANs for runtime operation to process rendered noiseless 2.5D images generated by the rendering engine to produce realistic depth images.
10. The method of claim 9, wherein learning by the second GAN is refined by processing a dataset of real images from a new target domain to generate backgrounds corresponding with the new target domain.
11. The method of claim 9, wherein the first GAN and the second GAN comprise: a discriminator network configured as a deep convolutional network with Leaky rectified linear units and an output defined according to a sigmoid activation function, wherein each activation represents a deduction for a patch of input data.
12. The method of claim 11, wherein the discriminator network of first GAN and the second GAN further comprise: a loss function that executes a binary cross entropy evaluation.
13. The method of claim 12, wherein the second GAN comprises a loss function configured to compare a generated image with a target image, the loss function edited to heavily penalize any change to foreground by using input data as a binary mask and a Hadamard product.
14. The method of claim 9, wherein the learning by the noise transfer engine includes cropping backgrounds from images of the real depth scans using the noiseless 2.5D images as masks.
15. The method of claim 9, further comprising: generating, by a depth sensor simulation engine, a simulated 2.5D scan for each generated noiseless 2.5D image, wherein the learning by the noise transfer engine includes stacking the simulated 2.5D scan and the noiseless 2.5D image into a 2-channel depth image as an input for the noise transfer engine.
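Three of the data-handling steps recited in the claims lend themselves to a short illustration: cropping the background of a real scan with the noiseless render as a mask (claims 6 and 14), stacking a simulated scan and the noiseless render into a 2-channel input (claims 7 and 15), and a comparison loss that uses the input as a binary mask with a Hadamard (element-wise) product to penalize foreground changes (claims 5 and 13). The claims do not give formulas, so everything below is an assumption made for illustration: the function names, the zero-depth encoding of background pixels, the L1 form of the loss, and the `foreground_weight` value are all hypothetical.

```python
import numpy as np

def foreground_mask(depth):
    """Binary mask: 1 where the render has an object, 0 elsewhere (assumes background == 0)."""
    return (depth > 0).astype(depth.dtype)

def crop_background(real_scan, noiseless):
    """Keep only the real scan's foreground, using the noiseless render as the mask."""
    return real_scan * foreground_mask(noiseless)

def stack_inputs(simulated_scan, noiseless):
    """Stack the simulated scan and the noiseless render into a 2-channel depth image."""
    return np.stack([simulated_scan, noiseless], axis=0)

def masked_l1_loss(generated, target, gan_input, foreground_weight=10.0):
    """Assumed loss form: L1 to the target plus a weighted, masked foreground-change term."""
    mask = foreground_mask(gan_input)
    base = np.abs(generated - target).mean()
    # Hadamard product: only foreground pixels contribute to this penalty.
    foreground_change = np.abs(mask * (generated - gan_input)).mean()
    return base + foreground_weight * foreground_change

# Toy data: a 2x2 object in a 4x4 frame, with a real scan that also has a background.
noiseless = np.zeros((4, 4))
noiseless[1:3, 1:3] = 1.0
real_scan = np.full((4, 4), 2.0)
real_scan[1:3, 1:3] = 1.05

cropped = crop_background(real_scan, noiseless)
stacked = stack_inputs(cropped, noiseless)

# Compare an edit to a foreground pixel with the same-sized edit to a background pixel.
changed_foreground = real_scan.copy()
changed_foreground[1, 1] += 0.5
changed_background = real_scan.copy()
changed_background[0, 0] += 0.5

loss_exact = masked_l1_loss(real_scan, real_scan, cropped)
loss_foreground = masked_l1_loss(changed_foreground, real_scan, cropped)
loss_background = masked_l1_loss(changed_background, real_scan, cropped)
```

Under these assumptions, an identical perturbation costs more when it lands on a foreground pixel than on a background pixel, which is the behavior the masked term is meant to produce.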
Adversarial learning · CPC title
Transfer learning · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Generative networks · CPC title