Method and apparatus for multi-learning rates of substitution in neural image compression

US12283075B2 · US · B2

Patent metadata
Publication number: US-12283075-B2
Application number: US-202117499959-A
Country: US
Kind code: B2
Filing date: Oct 13, 2021
Priority date: Apr 16, 2021
Publication date: Apr 22, 2025
Grant date: Apr 22, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Neural network based substitutional end-to-end (E2E) image compression (NIC) being performed by at least one processor and includes receiving an input image to an E2E NIC framework, determining a substitute image based on a training model of the E2E NIC framework, encoding the substitute image to generate a bitstream, mapping the substitute image to the bitstream to generate a compressed representation of the input image. Further, the input may be partitioned into blocks for which a substitute representation is determined for each block and each block is encoded instead of the entire substitute image.
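
The substitution idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the "codec" is a fixed linear projection standing in for a pretrained NIC network, the rate term is a squared-norm proxy, and the names `make_toy_codec`, `rd_loss`, `rd_grad`, and `find_substitute` are hypothetical. What it does show is the general pattern: the codec parameters stay fixed, and gradient steps (controlled by a learning rate and a number of updates) change the input itself until a substitute with a lower rate-distortion loss is found.

```python
import numpy as np

def make_toy_codec(n, k, seed=0):
    """Hypothetical fixed codec: a k x n matrix with orthonormal rows.
    Encoding is y = W @ x, decoding is x_hat = W.T @ y."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((n, k)))  # q has orthonormal columns
    return q.T

def rd_loss(x_sub, x, W, lam):
    """Toy rate-distortion loss (assumed form): rate proxy + lam * distortion.
    Distortion is measured against the ORIGINAL input x, not the substitute."""
    y = W @ x_sub                 # latent produced from the substitute
    x_hat = W.T @ y               # reconstruction
    rate = 0.5 * np.sum(y ** 2)   # stand-in for bitstream length
    distortion = np.sum((x - x_hat) ** 2)
    return rate + lam * distortion

def rd_grad(x_sub, x, W, lam):
    """Gradient of rd_loss with respect to the substitute input only."""
    A = W.T @ W
    return A @ x_sub + 2.0 * lam * (A @ (A @ x_sub - x))

def find_substitute(x, W, lam=1.0, lr=0.1, num_updates=50):
    """Start from the original input and take gradient steps on the input.
    W is never modified, mirroring a codec with fixed parameters."""
    x_sub = x.copy()
    for _ in range(num_updates):
        x_sub = x_sub - lr * rd_grad(x_sub, x, W, lam)
    return x_sub
```

In this toy setting, encoding `find_substitute(x, W)` instead of `x` yields a lower value of `rd_loss`; a real framework would replace the linear map with a neural encoder and decoder and an entropy model.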

First claim

Opening claim text (preview).

What is claimed is:

1. A method of substitutional end-to-end (E2E) neural image compression (NIC) using a neural network performed by at least one processor, the method comprising: receiving an input image to an E2E NIC framework; splitting the input image into one or more blocks; performing an encoding mapping, for each of the one or more blocks, by mapping the input image to a first bitstream having a first length; performing a decoding mapping, for each of the one or more blocks, by mapping the first bitstream back to an original space with a first distortion loss; determining a substitute image from the original space, based on a training model of the E2E NIC framework; encoding the substitute image to generate a second bitstream; and mapping the substitute image to the second bitstream to generate a compressed representation, wherein the training model of the E2E NIC framework is trained based on a learning rate of the input image, a quantity of updates to the input image, and a second distortion loss, wherein a plurality of substitute images are determined based on learning rates that are selected based on characteristics of the input image, and wherein the substitute image is determined by performing an optimization process of the training model of the E2E NIC framework, comprising: adjusting RGB variance of the split blocks to generate substitute block representations; and selecting the RGB variance with a least distortion loss between the split blocks and the substitute block representations to use as the substitute block.

2. The method according to claim 1, further comprising: determining a substitute block for each of the one or more blocks, based on the training model of the E2E NIC framework; encoding the substitute block to generate a block bitstream; and mapping the substitute block to the block bitstream to generate a compressed block, wherein the one or more blocks have a same size, and each block of the one or more blocks has a different learning rate.

3. The method according to claim 1, wherein the training model of the E2E NIC framework is an artificial neural network based on pretrained image coding, and wherein parameters of the artificial neural network are fixed and a gradient is used to update the input image.

4. An apparatus for substitutional end-to-end (E2E) neural image compression (NIC) using a neural network, the apparatus comprising: at least one memory configured to store program code; and at least one processor configured to read the program code and operate as instructed by the program code, the program code comprising: receiving code configured to cause at least one processor to receive an input image to an E2E NIC framework; splitting code configured to cause at least one processor to split the input image into one or more blocks; first performing code configured to cause the at least one processor to perform an encoding mapping, for each of the one or more blocks, by mapping the input image to a first bitstream having a first length; second performing code configured to cause the at least one processor to perform a decoding mapping, for each of the one or more blocks, by mapping the first bitstream back to an original space with a first distortion loss; first determining code configured to cause at least one processor to determine a substitute image from the original space, based on a training model of the E2E NIC framework; first encoding code configured to cause at least one processor to encode the substitute image to generate a second bitstream; and first mapping code configured to cause at least one processor to map the substitute image to the second bitstream to generate a compressed representation, wherein the training model of the E2E NIC framework is trained based on a learning rate of the input image, a quantity of updates to the input image, and a second distortion loss, wherein a plurality of substitute images are determined based on learning rates that are selected based on characteristics of the input image, and wherein the substitute image is determined by performing an optimization process of the training model of the E2E NIC framework, comprising: adjusting code configured to cause at least one processor to adjust RGB variance of the split blocks to generate substitute block representations; and selecting code configured to cause at least one processor to select the RGB variance with a least distortion loss between the split blocks and the substitute block representations to use as the substitute block.

5. The apparatus of claim 4, further comprising: second determining code configured to cause at least one processor to determine a substitute block for each of the one or more blocks, based on the training model of the E2E NIC framework; second encoding code configured to cause at least one processor to encode the substitute block to generate a block bitstream; and second mapping code configured to cause at least one processor to map the substitute block to the block bitstream to generate a compressed block, wherein the one or more blocks have a same size, and each block of the one or more blocks has a different learning rate.

6. The apparatus according to claim 4, wherein the training model of the E2E NIC framework is an artificial neural network based on pretrained image coding, and wherein parameters of the artificial neural network are fixed and a gradient is used to update the input image.

7. A non-transitory computer readable medium storing instructions that, when executed by at least one processor for substitutional end-to-end (E2E) neural image compression (NIC), cause the at least one processor to: receive an input image to an E2E NIC framework; split the input image into one or more blocks; perform an encoding mapping, for each of the one or more blocks, by mapping the input image to a first bitstream having a first length; perform a decoding mapping, for each of the one or more blocks, by mapping the first bitstream back to an original space with a first distortion loss; determine a substitute image from the original space, based on a training model of the E2E NIC framework; encode the substitute image to generate a second bitstream; and map the substitute image to the second bitstream to generate a compressed representation, wherein the training model of the E2E NIC framework is trained based on a learning rate of the input image, a quantity of updates to the input image, and a second distortion loss, wherein a plurality of substitute images are determined based on learning rates that are selected based on characteristics of the input image, and wherein the instructions, when executed by at least one processor, further cause the at least one processor to performing an optimization process of the training model of the E2E NIC framework, comprising: adjust RGB variance of the split blocks to generate substitute block representations; and select the RGB variance with a least distortion loss between the split blocks and the substitute block representations to use as the substitute block.

8. The non-transitory computer readable medium of claim 7, wherein the instructions, when executed by at least one processor, further cause the at least one processor to: determine a substitute block for each of the one or more blocks, based on the training model of the E2E NIC framework; encode the substitute block to generate a block bitstream; and map the substitute block to the block bitstream to generate a compressed block, wherein the one or more blocks have a same size, and each block of the one or more blocks has a different learning rate.

9. The non-transitory computer readable medium of claim 7, where
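
The block-wise, multi-learning-rate language in the claims can be sketched as follows. This is a hedged illustration, not the patented implementation: the fixed linear codec, the squared-norm rate proxy, the candidate learning rates, and the helper names (`make_toy_codec`, `split_blocks`, `loss_and_grad`, `best_substitute_block`, `compress_by_blocks`) are all assumptions. The sketch also selects each block's substitute by the combined toy loss, whereas the claims refer to a least distortion loss.

```python
import numpy as np

def make_toy_codec(dim, k, seed=0):
    """Hypothetical fixed codec: k x dim matrix with orthonormal rows."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, k)))
    return q.T

def split_blocks(image, block):
    """Split an (H, W, C) image into equal (block, block, C) tiles, row-major."""
    h, w, c = image.shape
    assert h % block == 0 and w % block == 0, "image must divide evenly"
    tiles = image.reshape(h // block, block, w // block, block, c).swapaxes(1, 2)
    return tiles.reshape(-1, block, block, c)

def loss_and_grad(v_sub, v, W, lam):
    """Toy loss (rate proxy + lam * distortion) and its gradient w.r.t. v_sub."""
    y = W @ v_sub
    residual = W.T @ y - v        # reconstruction minus original block
    loss = 0.5 * np.sum(y ** 2) + lam * np.sum(residual ** 2)
    grad = W.T @ y + 2.0 * lam * (W.T @ (W @ residual))
    return loss, grad

def best_substitute_block(tile, W, lam, learning_rates, num_updates):
    """Try each candidate learning rate and keep the lowest-loss substitute."""
    v = tile.reshape(-1)
    best = None
    for lr in learning_rates:
        v_sub = v.copy()
        for _ in range(num_updates):
            _, grad = loss_and_grad(v_sub, v, W, lam)
            v_sub = v_sub - lr * grad      # codec W stays fixed; only the block changes
        loss, _ = loss_and_grad(v_sub, v, W, lam)
        if best is None or loss < best[0]:
            best = (loss, lr, v_sub.reshape(tile.shape))
    return best

def compress_by_blocks(image, block=4, k=12, lam=1.0,
                       learning_rates=(0.01, 0.1, 0.3), num_updates=20):
    """Return one (loss, chosen_lr, substitute_block) tuple per block."""
    W = make_toy_codec(block * block * image.shape[2], k)
    return [best_substitute_block(t, W, lam, learning_rates, num_updates)
            for t in split_blocks(image, block)]
```

Each block is optimized independently, so different blocks may end up with different learning rates; in a real system the candidate rates would presumably be chosen from image characteristics rather than from a fixed tuple.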

Assignees

Tencent America LLC

Inventors

Classifications

  • Time or data compression or expansion (audio compression based on psychoacoustics G10L19/00; data processing for reproducing audio data at different playback speeds G10L21/04; video compression H04N19/00; data compression per se H03M7/30) · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

  • Supervised learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12283075B2 cover?
Neural network based substitutional end-to-end (E2E) image compression (NIC) being performed by at least one processor and includes receiving an input image to an E2E NIC framework, determining a substitute image based on a training model of the E2E NIC framework, encoding the substitute image to generate a bitstream, mapping the substitute image to the bitstream to generate a compressed repres…
Who is the assignee on this patent?
Tencent America LLC
What technology area does this patent fall under?
Primary CPC classification G06T9/002. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 22, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).