Iterative multiscale image generation using neural networks

US11734797B2 · US · B2

Patent metadata
Publication number: US-11734797-B2
Application number: US-202217751359-A
Country: US
Kind code: B2
Filing date: May 23, 2022
Priority date: Feb 24, 2017
Publication date: Aug 22, 2023
Grant date: Aug 22, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of generating an output image having an output resolution of N pixels×N pixels, each pixel in the output image having a respective color value for each of a plurality of color channels, the method comprising: obtaining a low-resolution version of the output image; and upscaling the low-resolution version of the output image to generate the output image having the output resolution by repeatedly performing the following operations: obtaining a current version of the output image having a current K×K resolution; and processing the current version of the output image using a set of convolutional neural networks that are specific to the current resolution to generate an updated version of the output image having a 2K×2K resolution.
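The repeated K×K to 2K×2K step in the abstract can be pictured as a simple loop. The following is a minimal sketch under stated assumptions, not the patented implementation: `upscale`, `stages`, and `nearest_neighbour_double` are hypothetical names, and nearest-neighbour pixel duplication stands in for the resolution-specific convolutional neural networks, which are not reproduced here.

```python
def nearest_neighbour_double(image):
    """Placeholder stage (assumption): turn a K x K image into 2K x 2K
    by duplicating every pixel. The patent uses CNNs for this step."""
    doubled = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(2)]
        doubled.append(wide)
        doubled.append(list(wide))
    return doubled


def upscale(low_res, target, stages):
    """Repeatedly double a square image until it is target x target.

    `stages` maps a current resolution K to the callable used at that K,
    mirroring the idea of a network set specific to each resolution.
    """
    current = low_res
    while len(current) < target:
        k = len(current)
        current = stages[k](current)
        if len(current) != 2 * k:
            raise ValueError("each stage must double the resolution")
    if len(current) != target:
        raise ValueError("target must be the start size times a power of two")
    return current


# Hypothetical usage: 2x2 -> 4x4 -> 8x8, one stage per resolution.
stages = {2: nearest_neighbour_double, 4: nearest_neighbour_double}
result = upscale([[1, 2], [3, 4]], 8, stages)
```

Keying the stages by the current resolution keeps the loop itself unchanged when a different model is substituted at any one scale.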

First claim

Opening claim text (preview).

What is claimed is:

1. A method of generating an output image having an output resolution, each pixel in the output image having a respective value for each of one or more channels, the method comprising: obtaining a low-resolution version of the output image; and upscaling the low-resolution version of the output image to generate the output image having the output resolution by repeatedly performing the following operations until an image with an output resolution is obtained: obtaining a current version of the output image having a current resolution; and processing the current version of the output image using a set of convolutional neural networks that are specific to the current resolution to generate an updated version of the output image having an updated resolution that is higher than the current resolution, wherein the set of convolutional neural networks that are specific to the current resolution comprises: a first convolutional neural network that is configured to receive a first input comprising the current version of the image and to generate a first output image that includes columns of pixels from an intermediate version of the output image having an intermediate resolution that is higher than the current resolution but lower than the updated resolution, and a second convolutional neural network that is configured to receive a second input comprising the intermediate version of the output image and to generate a second output image that includes rows of pixels from the updated version of the output image.

2. The method of claim 1, wherein obtaining the low-resolution version comprises: generating the low-resolution version using an image generation machine learning model.

3. The method of claim 2, wherein the image generation machine learning model is an autoregressive image generation machine learning model.

4. The method of claim 2, wherein the output image is conditioned on an input context and wherein the image generation machine learning model is configured to generate the low-resolution version conditioned on the input context.

5. The method of claim 4, wherein each convolutional neural network in each set of convolutional neural networks is conditioned on the input context.

6. The method of claim 1, wherein processing the current version of the output image using the set of convolutional neural networks that are specific to the current resolution to generate the updated version of the output image comprises: processing the current version of the output image using the first convolutional neural network to generate the first output image; generating the intermediate version by merging the current version and the first output image; processing the intermediate version using the second convolutional neural network to generate the second output image; and generating the updated version by merging the intermediate version and the second output image.

7. The method of claim 6, wherein merging the current version and the first output image comprises: generating an intermediate image that includes K columns of pixels from the current image and K columns of pixels from the first output image by alternating columns of pixels from the current version with columns of pixels from the first output image.

8. The method of claim 6, wherein merging the intermediate version and the second output image comprises: generating an updated image that includes a plurality of rows of pixels from the intermediate version and a plurality of rows of pixels from the second output image by alternating rows of pixels from the intermediate version with rows of pixels from the second output image.

9. The method of claim 6, wherein the one or more channels are ordered according to a channel order, wherein the first convolutional neural network is configured to, for each of the one or more channels: generate values for the channel for pixels in the first output image conditioned (i) on the current version and (ii) on values for pixels in the first output image for any channels before the channel in the channel order and (iii) not on values for pixels in the first output image for any channels that are after the channel in the channel order, and wherein the second convolutional neural network is configured to, for each of the one or more channels: generate values for the channel for pixels in the second output image conditioned (i) on the intermediate version and (ii) on values for pixels in the second output image for any channels before the channel in the channel order and (iii) not on values for pixels in the second output image for any channels that are after the channel in the channel order.

10. The method of claim 9, wherein processing the current version using the first convolutional neural network to generate the first output image comprises: iteratively processing the current version and values from the first output image that have already been generated to generate the first output image, and wherein processing the intermediate version using the second convolutional neural network to generate the second output image comprises: iteratively processing the intermediate version and the values from the second output image that have already been generated to generate the second output image.

11. A method of generating an output image having an output resolution, each pixel in the output image having a respective value for each of one or more channels, the method comprising: obtaining a low-resolution version of the output image; and upscaling the low-resolution version of the output image to generate the output image having the output resolution by repeatedly performing the following operations until an image with an output resolution is obtained: obtaining a current version of the output image having a current resolution; and processing the current version of the output image using a set of convolutional neural networks that are specific to the current resolution to generate an updated version of the output image having an updated resolution that is higher than the current resolution, wherein the set of convolutional neural networks that are specific to the current resolution comprises: a first convolutional neural network that is configured to receive a first input comprising the current version of the image and to generate a first output image that includes rows of pixels from an intermediate version of the output image having an intermediate resolution that is higher than the current resolution but lower than the updated resolution, and a second convolutional neural network that is configured to receive a second input comprising the intermediate version of the output image and to generate a second output image that includes columns of pixels from the updated version of the output image.

12. The method of claim 11, wherein processing the current version of the output image using the set of convolutional neural networks that are specific to the current resolution to generate the updated version comprises: processing the current version using the first convolutional neural network to generate the first output image; generating the intermediate version by merging the current version and the first output image; processing the intermediate version using the second convolutional neural network to generate the second output image; and generating the updated version by merging the intermediate version and the second output image.

13. The method of claim 12, wherein merging the current version and the first output image comprises: generating an intermediate image that includes rows of pixels from the current image and rows
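The two-step merge described in claims 6 to 8 (columns first, then rows) can be illustrated with a small sketch. It rests on assumptions: the function names are hypothetical, the two "networks" are toy callables rather than convolutional neural networks, and the choice to place the existing column or row before the generated one is only one possible reading of "alternating".

```python
def interleave_columns(current, generated):
    """Alternate columns of two K x K images: result has K rows, 2K columns.
    Assumption: the existing column comes first in each pair."""
    merged = []
    for cur_row, gen_row in zip(current, generated):
        row = []
        for cur_px, gen_px in zip(cur_row, gen_row):
            row.extend([cur_px, gen_px])
        merged.append(row)
    return merged


def interleave_rows(intermediate, generated):
    """Alternate rows of two K x 2K images: result has 2K rows, 2K columns."""
    merged = []
    for int_row, gen_row in zip(intermediate, generated):
        merged.append(list(int_row))
        merged.append(list(gen_row))
    return merged


def double_resolution(current, first_net, second_net):
    """One upscaling step in the column-then-row order of claim 1."""
    first_output = first_net(current)             # new columns
    intermediate = interleave_columns(current, first_output)
    second_output = second_net(intermediate)      # new rows
    return interleave_rows(intermediate, second_output)


# Toy stand-ins for the two networks (assumption): offset each value so
# generated pixels are easy to tell apart from existing ones.
def toy_first(img):
    return [[p + 10 for p in row] for row in img]


def toy_second(img):
    return [[p + 100 for p in row] for row in img]


updated = double_resolution([[1, 2], [3, 4]], toy_first, toy_second)
```

Claim 11 describes the same procedure with the order swapped, rows first and then columns, which in this sketch would amount to calling `interleave_rows` before `interleave_columns`.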

Assignees

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

  • G06T3/4046 (Primary)

    using neural networks · CPC title

  • Combinations of networks · CPC title

  • Machine learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11734797B2 cover?
A method of generating an output image having an output resolution of N pixels×N pixels, each pixel in the output image having a respective color value for each of a plurality of color channels, the method comprising: obtaining a low-resolution version of the output image; and upscaling the low-resolution version of the output image to generate the output image having the output resolution by r…
Who is the assignee on this patent?
Deepmind Tech Ltd
What technology area does this patent fall under?
Primary CPC classification G06T3/4046. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 22, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).