Method and apparatus for adaptive neural image compression with rate control by meta-learning
US-2022230362-A1 · Jul 21, 2022 · US
US12283075B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12283075-B2 |
| Application number | US-202117499959-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 13, 2021 |
| Priority date | Apr 16, 2021 |
| Publication date | Apr 22, 2025 |
| Grant date | Apr 22, 2025 |
Official abstract text for this publication.
Neural network based substitutional end-to-end (E2E) image compression (NIC) is performed by at least one processor and includes receiving an input image to an E2E NIC framework, determining a substitute image based on a training model of the E2E NIC framework, encoding the substitute image to generate a bitstream, and mapping the substitute image to the bitstream to generate a compressed representation of the input image. Further, the input may be partitioned into blocks, in which case a substitute representation is determined for each block and each block is encoded instead of the entire substitute image.
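The core idea in the abstract, together with the claim language that the network parameters are fixed and a gradient is used to update the input image, can be illustrated with a toy example. The following is a minimal sketch under loud assumptions, not the patented implementation: the "codec" is a hypothetical per-pixel linear encoder and decoder with fixed weights `w` and `v`, the rate is approximated by the squared magnitude of the latent, and the names `encode`, `decode`, `rd_loss`, and `find_substitute` are invented for illustration. Only the substitute image changes during optimization; the learning rate `lr` and the number of updates `steps` are the two knobs the claims mention.

```python
def encode(x, w):
    """Hypothetical fixed encoder: one latent value per pixel."""
    return [w * xi for xi in x]


def decode(y, v):
    """Hypothetical fixed decoder: maps latents back to pixel space."""
    return [v * yi for yi in y]


def rd_loss(x_sub, x_orig, w, v, lam):
    """Distortion against the ORIGINAL image plus lam times a rate proxy."""
    y = encode(x_sub, w)
    x_hat = decode(y, v)
    distortion = sum((a - b) ** 2 for a, b in zip(x_hat, x_orig))
    rate = sum(yi ** 2 for yi in y)  # assumed stand-in for bitstream length
    return distortion + lam * rate


def find_substitute(x_orig, w, v, lam, lr, steps):
    """Gradient descent on the input pixels; w and v are never updated."""
    x_sub = list(x_orig)  # start from the original image
    for _ in range(steps):
        # Analytic gradient of rd_loss with respect to each substitute pixel.
        grad = [
            2 * (v * w * xs - xo) * v * w + 2 * lam * w * w * xs
            for xs, xo in zip(x_sub, x_orig)
        ]
        x_sub = [xs - lr * g for xs, g in zip(x_sub, grad)]
    return x_sub


if __name__ == "__main__":
    original = [0.2, 0.5, 0.9]
    substitute = find_substitute(original, w=0.5, v=1.8, lam=0.1, lr=0.1, steps=50)
    print("loss before:", rd_loss(original, original, 0.5, 1.8, 0.1))
    print("loss after: ", rd_loss(substitute, original, 0.5, 1.8, 0.1))
```

In a real NIC system the encoder and decoder would be a pretrained neural network and the gradient would come from automatic differentiation; the closed-form gradient here only works because the toy codec is linear.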
Opening claim text (preview).
What is claimed is:
1. A method of substitutional end-to-end (E2E) neural image compression (NIC) using a neural network performed by at least one processor, the method comprising: receiving an input image to an E2E NIC framework; splitting the input image into one or more blocks; performing an encoding mapping, for each of the one or more blocks, by mapping the input image to a first bitstream having a first length; performing a decoding mapping, for each of the one or more blocks, by mapping the first bitstream back to an original space with a first distortion loss; determining a substitute image from the original space, based on a training model of the E2E NIC framework; encoding the substitute image to generate a second bitstream; and mapping the substitute image to the second bitstream to generate a compressed representation, wherein the training model of the E2E NIC framework is trained based on a learning rate of the input image, a quantity of updates to the input image, and a second distortion loss, wherein a plurality of substitute images are determined based on learning rates that are selected based on characteristics of the input image, and wherein the substitute image is determined by performing an optimization process of the training model of the E2E NIC framework, comprising: adjusting RGB variance of the split blocks to generate substitute block representations; and selecting the RGB variance with a least distortion loss between the split blocks and the substitute block representations to use as the substitute block.
2. The method according to claim 1, further comprising: determining a substitute block for each of the one or more blocks, based on the training model of the E2E NIC framework; encoding the substitute block to generate a block bitstream; and mapping the substitute block to the block bitstream to generate a compressed block, wherein the one or more blocks have a same size, and each block of the one or more blocks has a different learning rate.
3. The method according to claim 1, wherein the training model of the E2E NIC framework is an artificial neural network based on pretrained image coding, and wherein parameters of the artificial neural network are fixed and a gradient is used to update the input image.
4. An apparatus for substitutional end-to-end (E2E) neural image compression (NIC) using a neural network, the apparatus comprising: at least one memory configured to store program code; and at least one processor configured to read the program code and operate as instructed by the program code, the program code comprising: receiving code configured to cause at least one processor to receive an input image to an E2E NIC framework; splitting code configured to cause at least one processor to split the input image into one or more blocks; first performing code configured to cause the at least one processor to perform an encoding mapping, for each of the one or more blocks, by mapping the input image to a first bitstream having a first length; second performing code configured to cause the at least one processor to perform a decoding mapping, for each of the one or more blocks, by mapping the first bitstream back to an original space with a first distortion loss; first determining code configured to cause at least one processor to determine a substitute image from the original space, based on a training model of the E2E NIC framework; first encoding code configured to cause at least one processor to encode the substitute image to generate a second bitstream; and first mapping code configured to cause at least one processor to map the substitute image to the second bitstream to generate a compressed representation, wherein the training model of the E2E NIC framework is trained based on a learning rate of the input image, a quantity of updates to the input image, and a second distortion loss, wherein a plurality of substitute images are determined based on learning rates that are selected based on characteristics of the input image, and wherein the substitute image is determined by performing an optimization process of the training model of the E2E NIC framework, comprising: adjusting code configured to cause at least one processor to adjust RGB variance of the split blocks to generate substitute block representations; and selecting code configured to cause at least one processor to select the RGB variance with a least distortion loss between the split blocks and the substitute block representations to use as the substitute block.
5. The apparatus of claim 4, further comprising: second determining code configured to cause at least one processor to determine a substitute block for each of the one or more blocks, based on the training model of the E2E NIC framework; second encoding code configured to cause at least one processor to encode the substitute block to generate a block bitstream; and second mapping code configured to cause at least one processor to map the substitute block to the block bitstream to generate a compressed block, wherein the one or more blocks have a same size, and each block of the one or more blocks has a different learning rate.
6. The apparatus according to claim 4, wherein the training model of the E2E NIC framework is an artificial neural network based on pretrained image coding, and wherein parameters of the artificial neural network are fixed and a gradient is used to update the input image.
7. A non-transitory computer readable medium storing instructions that, when executed by at least one processor for substitutional end-to-end (E2E) neural image compression (NIC), cause the at least one processor to: receive an input image to an E2E NIC framework; split the input image into one or more blocks; perform an encoding mapping, for each of the one or more blocks, by mapping the input image to a first bitstream having a first length; perform a decoding mapping, for each of the one or more blocks, by mapping the first bitstream back to an original space with a first distortion loss; determine a substitute image from the original space, based on a training model of the E2E NIC framework; encode the substitute image to generate a second bitstream; and map the substitute image to the second bitstream to generate a compressed representation, wherein the training model of the E2E NIC framework is trained based on a learning rate of the input image, a quantity of updates to the input image, and a second distortion loss, wherein a plurality of substitute images are determined based on learning rates that are selected based on characteristics of the input image, and wherein the instructions, when executed by at least one processor, further cause the at least one processor to perform an optimization process of the training model of the E2E NIC framework, comprising: adjust RGB variance of the split blocks to generate substitute block representations; and select the RGB variance with a least distortion loss between the split blocks and the substitute block representations to use as the substitute block.
8. The non-transitory computer readable medium of claim 7, wherein the instructions, when executed by at least one processor, further cause the at least one processor to: determine a substitute block for each of the one or more blocks, based on the training model of the E2E NIC framework; encode the substitute block to generate a block bitstream; and map the substitute block to the block bitstream to generate a compressed block, wherein the one or more blocks have a same size, and each block of the one or more blocks has a different learning rate.
9. The non-transitory computer readable medium of claim 7, where
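The block-wise steps recited in claims 1 and 2 (splitting into same-size blocks, generating several candidate substitutes per block, keeping the one with the least distortion loss) can be sketched as follows. This is an illustrative reading under stated assumptions, not the claimed implementation: the fixed codec is a hypothetical stand-in that scales every pixel by `GAIN`, candidates are produced by trying a small set of learning rates, and `split_blocks`, `distortion`, `optimize_block`, and `best_substitute` are invented names. Image dimensions are assumed to be divisible by the block size.

```python
GAIN = 0.9  # hypothetical fixed codec: reconstruction = GAIN * pixel


def split_blocks(image, size):
    """Split a 2-D list into non-overlapping size x size blocks, row-major."""
    height, width = len(image), len(image[0])
    return [
        [row[c:c + size] for row in image[r:r + size]]
        for r in range(0, height, size)
        for c in range(0, width, size)
    ]


def distortion(orig, sub):
    """Squared error between the original block and the decoded substitute."""
    return sum(
        (GAIN * s - o) ** 2
        for row_o, row_s in zip(orig, sub)
        for o, s in zip(row_o, row_s)
    )


def optimize_block(block, lr, steps):
    """Gradient descent on the block's pixels; the codec (GAIN) stays fixed."""
    sub = [row[:] for row in block]
    for _ in range(steps):
        sub = [
            [s - lr * 2 * GAIN * (GAIN * s - o) for o, s in zip(row_o, row_s)]
            for row_o, row_s in zip(block, sub)
        ]
    return sub


def best_substitute(block, learning_rates, steps):
    """Try each learning rate and keep the least-distortion candidate."""
    candidates = [optimize_block(block, lr, steps) for lr in learning_rates]
    return min(candidates, key=lambda sub: distortion(block, sub))


if __name__ == "__main__":
    image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    for block in split_blocks(image, 2):
        sub = best_substitute(block, learning_rates=(0.05, 0.2, 0.5), steps=5)
        print(distortion(block, block), "->", distortion(block, sub))
```

Searching over a fixed list of learning rates is one simple way to let each block end up with a different rate; the claims instead describe selecting rates based on characteristics of the input image, which this sketch does not attempt to model.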
Time or data compression or expansion (audio compression based on psychoacoustics G10L19/00; data processing for reproducing audio data at different playback speeds G10L21/04; video compression H04N19/00; data compression per se H03M7/30) · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
Supervised learning · CPC title