Who is the assignee on this patent?

Mitsubishi Electric Res Laboratories Inc

What technology area does this patent fall under?

Primary CPC classification G06V10/454. Mapped technology areas include Physics.

When was this patent published?

Publication date May 15, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and system for generating multimodal digital images

US9971958B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9971958-B2
Application number	US-201615189075-A
Country	US
Kind code	B2
Filing date	Jun 22, 2016
Priority date	Jun 1, 2016
Publication date	May 15, 2018
Grant date	May 15, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method generates a multimodal digital image by processing a vector with a first neural network to produce a first modality of the digital image and processing the vector with a second neural network to produce a second modality of the digital image. A structure and a number of layers of the first neural network are identical to a structure and a number of layers of the second neural network. Also, at least one layer in the first neural network has parameters identical to parameters of a corresponding layer in the second neural network, and at least one layer in the first neural network has parameters different from parameters of a corresponding layer in the second neural network.
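The parameter-sharing arrangement described in the abstract can be illustrated with a small sketch. This is not the patent's implementation: the layer sizes, the tanh activation, and the names `make_layer`, `apply_layer`, `forward`, `shared`, `head_a`, and `head_b` are all assumptions chosen for illustration. It only shows two networks with the same structure and layer count, where the first layers are literally the same parameters and the last layer differs per modality.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return a weight matrix (list of rows) with random values (hypothetical init)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def apply_layer(weights, x):
    """Dense layer followed by tanh (activation choice is an assumption)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def forward(layers, x):
    """Pass the input vector through each layer in order."""
    for weights in layers:
        x = apply_layer(weights, x)
    return x


rng = random.Random(0)

# Layers with identical parameters: both networks reference the same objects.
shared = [make_layer(4, 8, rng), make_layer(8, 8, rng)]

# Layers with different parameters: one per modality.
head_a = make_layer(8, 6, rng)
head_b = make_layer(8, 6, rng)

# Same structure and same number of layers in both networks.
net_a = shared + [head_a]
net_b = shared + [head_b]

# One input vector is processed by both networks.
z = [rng.gauss(0.0, 1.0) for _ in range(4)]
out_a = forward(net_a, z)  # stands in for the first modality
out_b = forward(net_b, z)  # stands in for the second modality
```

Because `net_a` and `net_b` hold references to the same weight lists for their first two layers, any update to those layers during joint training would apply to both networks at once, which is one simple way to enforce identical parameters.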

First claim

Opening claim text (preview).

We claim:

1. A computer-implemented method for generating a multimodal digital image, wherein the method uses a processor coupled with stored instructions implementing the method, wherein the instructions, when executed by the processor carry out steps of the method comprising: acquiring an image of a scene indicative of features of the scene; processing the image with a first neural network to produce a first image having a first modality; image; processing the image with a second neural network to produce a second image having a second modality, such that the first image and the second image form the multimodal digital image, the wherein a structure and a number of layers of the first neural network are identical to a structure and a number of layers of the second neural network, wherein at least one layer in the first neural network has parameters identical to parameters of a corresponding layer in the second neural network, and wherein at least one layer in the first neural network has parameters different from parameters of a corresponding layer in the second neural network, wherein the layers of the first and the second neural networks having identical parameters produce high-level features of the first and the second images of the multimodal digital image, and wherein the layers of the first and the second neural networks having different parameters produce low-level features of the first and the second images of the multimodal digital image, wherein the first neural network and the second neural network are trained jointly while enforcing identical parameters for several bottom layers of the first neural network and the second neural network, wherein at least one or both of the first neural network and the second neural network are trained using generative adversarial nets (GAN) including a generative subnetwork for producing a sample of the digital image of a specific modality and a discriminative subnetwork for testing if the sample of the digital image produced by the generative subnetwork has the specific modality; and outputting the multimodal digital image.

2. The method of claim 1, further comprising: randomly generating elements of the image using a probabilistic distribution.

3. The method of claim 1 wherein the low-level features are derived from the high level features.

4. The method of claim 1, wherein the digital image includes one or combination of an image, a video, a text, and a sound.

5. The method of claim 1, wherein a first generative subnetwork and a first discriminative subnetwork of the first neural network and a second generative subnetwork and a second discriminative subnetwork of the second neural network are jointly trained to minimize a minimax objective function.

6. The method of claim 1, further comprising: rendering the first image of the first modality and the second image of the second modality on a display device or transmitting the first image of the first modality and the second image of the second modality over a communication channel.

7. The method of claim 1, wherein the first modality of the first image is a color image, and wherein the second modality of the second image is a depth image.

8. The method of claim 1, wherein the first modality of the first image is a color image, and wherein the second modality of the second image is a thermal image.

9. The method of claim 1, wherein the first modality of the first image is an image having a first style, and wherein the second modality of the second image is an image having a second style.

10. The method of claim 1, wherein the first neural network and the second neural network are selected from a set of the neural networks jointly trained to produce a set of modalities of the digital image, comprising: processing the image with a set of neural networks to produce the multimodal digital image.

11. The method of claim 10, wherein the set of the neural networks forms a Coupled Generative Adversarial Nets (CoGAN).

12. A system for generating a multimodal digital image, comprising: an input interface to acquire an image of a scene indicative of features of the scene; at least one non-transitory computer readable memory storing a first neural network trained to produce a first modality of the multimodal digital image and a second neural network trained to produce a second modality of the multimodal digital image, wherein a structure and a number of layers of the first neural network are identical to a structure and a number of layers of the second neural network, wherein at least one layer in the first neural network has parameters identical to parameters of a corresponding layer in the second neural network, and wherein at least one layer in the first neural network has parameters different from parameters of a corresponding layer in the second neural network, wherein the layers of the first and the second neural networks having identical parameters produce high-level features of the first and the second images of the multimodal digital image, and wherein the layers of the first and the second neural networks having different parameters produce low-level features of the first and the second images of the multimodal digital image, wherein the first neural network and the second neural network are trained jointly while enforcing identical parameters for several bottom layers of the first neural network and the second neural network, wherein at least one or both of the first neural network and the second neural network are trained using generative adversarial nets (GAN) including a generative subnetwork for producing a sample of the digital image of a specific modality and a discriminative subnetwork for testing if the sample of the digital image produced by the generative subnetwork has the specific modality; a processor to generate the multimodal digital image by processing the image with the first neural network to produce a first modality of a first image and processing the image with the second neural network to produce a second modality of a second image, such that the first image and the second image form the multimodal digital image; and an output interface to output the multimodal digital image.

13. The system of claim 12, further comprising: a display device for displaying the multimodal digital image, such that the output interface outputs the multimodal digital image to the display device.

14. The system of claim 12, wherein the high-level features are attributed to entire digital image and the low-level features are attributed to a portion of the digital image.

15. The system of claim 12, wherein the first modality of the first image is a color image, and wherein the second modality of the second image is a depth image or a thermal image.

16. The system of claim 12, wherein the first modality of the first image is an image having a first style, and wherein the second modality of the second image is an image having a second style.

17. A non-transitory computer-readable medium with instructions stored thereon, that when executed by a processor, perform the steps comprising: acquiring an image of a scene indicative of features of the scene; processing the image with a first neural network to produce a first image having a first modality; image; processing the image with a second neural network to produce a second image having a second modality, such that the first image and the second image form the multimodal digital image, the wherein a structure and a number of layers of the first neural network are identical to a structure and a number of layers of the
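
Claim 5 refers to jointly training two generative and two discriminative subnetworks against a minimax objective, but the formula itself is not shown on this page. As a hedged illustration only, assuming the standard GAN value function applied once per modality, the joint objective might take the following form. The symbols are assumptions introduced here: g_i and f_i denote the generative and discriminative subnetworks for modality i, x_i is a training sample of modality i, and z is the shared input vector.

```latex
\min_{g_1, g_2} \; \max_{f_1, f_2} \; V(f_1, f_2, g_1, g_2)
  = \sum_{i=1}^{2} \Big(
      \mathbb{E}_{x_i \sim p_{X_i}} \big[ \log f_i(x_i) \big]
    + \mathbb{E}_{z \sim p_Z} \big[ \log \big( 1 - f_i(g_i(z)) \big) \big]
    \Big)
```

Under this reading, the optimization would be carried out subject to the constraint in claim 1 that the designated layers of the two networks keep identical parameters.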

Assignees

Mitsubishi Electric Res Laboratories Inc

Inventors

Classifications

G06F18/214
Generating training patterns; Bootstrap methods, e.g. bagging or boosting
G06V10/454 (Primary)
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
G06N3/084
Backpropagation, e.g. using gradient descent
G06T11/60
Creating or editing images; Combining images with text
G06F18/24
Classification techniques

Patent family

Related publications grouped by family.

View patent family 59153238

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9971958B2 cover?: A computer-implemented method generates a multimodal digital image by processing a vector with a first neural network to produce a first modality of the digital image and processing the vector with a second neural network to produce a second modality of the digital image. A structure and a number of layers of the first neural network are identical to a structure and a number of layers of the se…


Related patents

Determination of Font Similarity

Cross-trained convolutional neural networks using multimodal images

Fisher vectors meet neural networks: a hybrid visual classification architecture

Learning method and recording medium

Deep similarity learning for multimodal medical images

Method and apparatus for detecting a pedestrian by a vehicle during night driving
