Tiled image compression using neural networks

US11250595B2 · US · B2

Patent metadata
Publication number: US-11250595-B2
Application number: US-201816617484-A
Country: US
Kind code: B2
Filing date: May 29, 2018
Priority date: May 26, 2017
Publication date: Feb 15, 2022
Grant date: Feb 15, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image compression and reconstruction. An image encoder system receives a request to generate an encoded representation of an input image that has been partitioned into a plurality of tiles and generates the encoded representation of the input image. To generate the encoded representation, the system processes a context for each tile using a spatial context prediction neural network that has been trained to process context for an input tile and generate an output tile that is a prediction of the input tile. The system determines a residual image between the particular tile and the output tile generated by the spatial context prediction neural network by processing the context for the particular tile and generates a set of binary codes for the particular tile by encoding the residual image using an encoder neural network.
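The prediction-and-residual step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the tile size, and the mean-based predictor are all hypothetical stand-ins, and in particular `predict_tile` replaces the trained spatial context prediction neural network with a trivial rule so that the example runs on its own.

```python
import numpy as np

def split_into_tiles(image, tile_size):
    """Partition an (H, W) image into {(row, col): tile}; H and W must be multiples of tile_size."""
    height, width = image.shape
    if height % tile_size or width % tile_size:
        raise ValueError("image dimensions must be multiples of tile_size")
    return {
        (r // tile_size, c // tile_size): image[r:r + tile_size, c:c + tile_size]
        for r in range(0, height, tile_size)
        for c in range(0, width, tile_size)
    }

def predict_tile(context, tile_size):
    """Hypothetical stand-in for the trained prediction network: a flat tile at the context mean."""
    return np.full((tile_size, tile_size), float(np.mean(context)))

def residual_for_tile(target, context):
    """Residual image = actual tile minus the tile predicted from its context."""
    return target - predict_tile(context, target.shape[0])

# Usage: predict the top-right tile from its left neighbor and keep only the difference.
image = np.arange(16, dtype=float).reshape(4, 4)
tiles = split_into_tiles(image, tile_size=2)
residual = residual_for_tile(tiles[(0, 1)], context=tiles[(0, 0)])
```

Only the residual needs to be encoded, because a decoder that has already reconstructed the neighboring tiles can compute the same prediction itself.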

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, by an image encoder system comprising one or more computers, a request to generate an encoded representation of an input image that has been partitioned into a plurality of tiles; and generating, by the image encoder system, the encoded representation of the input image, wherein the encoded representation includes a respective set of binary codes for each of the plurality of tiles, and wherein the generating comprises, for a particular tile of the plurality of tiles: processing a context for the particular tile using a spatial context prediction neural network that has been trained to process context for an input tile to generate an output tile that is a prediction of the input tile, wherein the context comprises one or more tiles neighboring the particular tile in the input image; determining a residual image between the particular tile and the output tile generated by the spatial context prediction neural network by processing the context for the particular tile; and generating the set of binary codes for the particular tile by encoding the residual image using an encoder neural network, wherein the encoder neural network is configured to encode the residual image by, at each of a plurality of time steps: receiving an encoder input for the time step; and processing the encoder input to generate a set of binary codes for the time step.

2. The method of claim 1, further comprising: compressing the input image by compressing the binary codes in the encoded representation using a data compression algorithm.

3. The method of claim 2, wherein the data compression algorithm is a trained entropy coder.

4. The method of claim 2, further comprising: transmitting the compressed input image to an image decoder system for decompression of the input image.

5. The method of claim 1, wherein the encoder neural network is a recurrent neural network.

6. The method of claim 5, wherein the encoder input for a first time step of the plurality of time steps is the residual image.

7. The method of claim 5, wherein the encoder input for a time step of the plurality of time steps after the first time step is a temporary residual image between (i) the residual and (ii) a reconstruction generated by a decoder neural network from the set of binary codes at the previous time step, wherein the decoder neural network is a recurrent neural network that is configured to, at each of the plurality of time steps, receive a decoder input comprising the set of binary codes for the time step and to process the decoder input to generate a reconstruction of the encoder input at the time step.

8. The method of claim 7, wherein generating the set of binary codes for the particular tile comprises, at each of the plurality of time steps: determining from the reconstruction of the encoder input for the time step whether a quality threshold for the particular tile when reconstructed from the binary codes already generated at the time step and any previous time steps has been satisfied.

9. The method of claim 8, wherein generating the set of binary codes for the particular tile comprises: in response to determining that the quality threshold has been satisfied, using the already generated binary codes as the set of binary codes for the particular tile in the encoded representation of the input image.

10. The method of claim 1 wherein generating the set of binary codes for the particular tile comprises, at each of a plurality of time steps: determining whether a quality threshold for the particular tile has been satisfied when the particular tile is reconstructed from the set of binary codes generated at the current time step; and in response to determining the quality threshold is satisfied, using the set of binary codes generated at the current time step for the particular tile as the set of binary codes for the particular tile in the encoded representation of the input image.

11. The method of claim 1, wherein, when the particular tile is not on a left or top border of the input image, the context is the neighboring tiles to the left and above the particular tile in the input image.

12. The method of claim 11, wherein when the particular tile is on the left border of the input image and is not in a top left corner of the input image, the context is the neighboring tile above the particular tile and placeholder context data.

13. The method of claim 12, wherein when the particular tile is in the top left corner of the input image, the context is placeholder context data.

14. The method of claim 11, wherein when the particular tile is on a top border of the input image and is not in the top left corner of the input image, the context is the neighboring tile to the left of the particular tile and placeholder context data.

15. A method comprising: receiving, by an image decoder system comprising one or more computers, a request to reconstruct an input image from an encoded representation of the input image, wherein the input image has been partitioned into a plurality of tiles, and wherein the encoded representation includes a respective set of binary codes for each of the plurality of tiles; and generating, by the image decoder system, a reconstruction of the input image, wherein the generating comprises, for a particular tile of the plurality of tiles: processing a context for the particular tile using a spatial context prediction neural network that has been trained to process context for an input tile to generate an output tile that is an initial reconstruction image of the input tile, wherein the context comprises reconstructions of one or more tiles neighboring the particular tile in the input image; generating a residual reconstruction image of the particular tile by processing the set of binary codes for the tile using a decoder neural network, wherein the set of binary codes for the particular tile includes a respective subset of binary codes for each of a plurality of time steps, and wherein the decoder neural network is configured to generate a residual image by, at each of the plurality of time steps, processing the subset of binary codes for the time step to generate a time step reconstruction residual image; and combining the initial reconstruction image and the residual reconstruction image of the particular tile to generate a final reconstruction of the particular tile.

16. The method of claim 15, wherein generating the reconstruction further comprises: receiving a compressed input image; and decompressing the compressed input image using a data decompression algorithm to generate the respective sets of binary codes for the tiles.

17. The method of claim 15, wherein the set of binary codes for the particular tile includes a respective subset of binary codes for each of a plurality of time steps, and wherein the decoder neural network is a recurrent neural network configured to generate the residual by, at each of the plurality of time steps: processing the subset of binary codes for the time step to generate a time step reconstruction residual image.

18. The method of claim 17, wherein generating the reconstruction residual image comprises: combining the time step reconstruction residual images for the plurality of time steps.

19. The method of claim 15, wherein, when the particular tile is not on a left or top border of the input image, the context is the reconstructions of neighboring tiles to the left and above the particular tile in the input image.
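Two mechanisms in the claims lend themselves to a short sketch: the border-aware context selection of claims 11 to 14, and the step-by-step residual coding with a quality threshold of claims 6 to 10 and 18. The code below is an illustration under loud assumptions, not the claimed networks. `encode_step` and `decode_step` are hypothetical stand-ins (a one-bit sign quantizer with a halving step size) for the recurrent encoder and decoder neural networks; the maximum absolute error is an assumed quality measure; summation is an assumed way of "combining" the per-step reconstructions; and residual values are assumed to lie in [-1, 1].

```python
import numpy as np

def context_for(tiles, row, col, placeholder):
    """Left and above neighbors as context, with placeholder data where a neighbor is missing."""
    left = tiles[(row, col - 1)] if col > 0 else placeholder
    above = tiles[(row - 1, col)] if row > 0 else placeholder
    return left, above

def encode_step(encoder_input):
    """Stand-in encoder (assumption): one binary code per pixel, the sign of the input."""
    return np.where(encoder_input >= 0, 1, -1).astype(np.int8)

def decode_step(codes, step):
    """Stand-in decoder (assumption): reconstruct the encoder input with magnitude 0.5**(step + 1)."""
    return codes * 0.5 ** (step + 1)

def encode_residual(residual, max_error, max_steps=16):
    """Emit one set of binary codes per time step until the quality threshold is met."""
    codes = []
    encoder_input = np.asarray(residual, dtype=float)  # first step sees the residual itself
    for step in range(max_steps):
        bits = encode_step(encoder_input)
        codes.append(bits)
        # Next input: what is still unexplained after decoding this step's codes.
        encoder_input = encoder_input - decode_step(bits, step)
        if np.max(np.abs(encoder_input)) <= max_error:
            break  # threshold satisfied: keep only the codes generated so far
    return codes

def decode_residual(codes):
    """Combine (here: sum) the per-step reconstructions into the residual reconstruction."""
    return sum(decode_step(bits, step) for step, bits in enumerate(codes))

# Usage: tiles with a tighter threshold simply consume more time steps, hence more bits.
residual = np.array([[0.3, -0.7], [0.05, 0.9]])
codes = encode_residual(residual, max_error=0.05)
reconstruction = decode_residual(codes)
```

Because the loop stops as soon as the threshold is met, the number of code sets can differ from tile to tile. On the decoder side, the final tile would be the context-based prediction plus `reconstruction`, mirroring the combining step of claim 15.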

Assignees

Inventors

Classifications

  • Combinations of networks · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

  • Supervised learning · CPC title

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11250595B2 cover?
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for image compression and reconstruction. An image encoder system receives a request to generate an encoded representation of an input image that has been partitioned into a plurality of tiles and generates the encoded representation of the input image. To generate the encoded representation, the sys…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06T9/002. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 15, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).