Substitutional end-to-end video coding
US-2021360259-A1 · Nov 18, 2021 · US
US12294720B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12294720-B2 |
| Application number | US-202117500355-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 13, 2021 |
| Priority date | Apr 16, 2021 |
| Publication date | May 6, 2025 |
| Grant date | May 6, 2025 |
Official abstract text for this publication.
Neural network based substitutional end-to-end (E2E) image compression (NIC) being performed by at least one processor and includes receiving an input image to an E2E NIC framework, determining a step size of the input image indicating a learning rate of a training model, determining a substitute image based on the training model, encoding the substitute image in lieu of the input image to generate a bitstream, and mapping the substitute image to the bitstream to generate a compressed representation. Further, step size may be determined by a scheduler and change throughout the training of the training model. The image may also be split into patches for which a scheduler is assigned for each patch and each patch is encoded instead of the entire input image.
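The substitution idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the "codec" is a toy scalar gain followed by rounding with a deliberately mismatched decoder gain (standing in for a trained E2E NIC model), the loss is distortion only (no rate term), the gradient uses a straight-through estimate for the rounding step, and `step_decay` is a hypothetical scheduler. The sketch updates the input with a scheduled step size and keeps the candidate whose reconstruction is closest to the original input.

```python
from typing import Callable, List, Tuple

Image = List[float]

# Toy stand-in for a trained E2E NIC codec (assumption, not from the patent):
# a scalar gain followed by rounding, decoded with a deliberately mismatched gain.
ENC_GAIN = 0.5
DEC_GAIN = 2.2


def encode(x: Image) -> List[int]:
    """Map pixels to the integer symbols that would be written to the bitstream."""
    return [round(ENC_GAIN * v) for v in x]


def decode(symbols: List[int]) -> Image:
    """Map integer symbols back to a reconstructed image."""
    return [DEC_GAIN * s for s in symbols]


def distortion(a: Image, b: Image) -> float:
    """Sum of squared errors between two images."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def step_decay(base: float, factor: float, period: int) -> Callable[[int], float]:
    """Hypothetical scheduler: scale the step size by `factor` every `period` iterations."""
    def schedule(t: int) -> float:
        return base * factor ** (t // period)
    return schedule


def find_substitute(x0: Image, schedule: Callable[[int], float],
                    iters: int = 20) -> Tuple[Image, float]:
    """Search for an input whose reconstruction is closer to x0 than x0's own."""
    best = list(x0)
    best_loss = distortion(decode(encode(x0)), x0)
    x = list(x0)
    for t in range(iters):
        recon = decode(encode(x))
        loss = distortion(recon, x0)
        if loss < best_loss:  # keep the candidate with the least distortion so far
            best, best_loss = list(x), loss
        # Straight-through estimate: treat round() as the identity when differentiating.
        grad = [2.0 * (r - o) * ENC_GAIN * DEC_GAIN for r, o in zip(recon, x0)]
        step = schedule(t)  # step size (learning rate) chosen by the scheduler
        x = [v - step * g for v, g in zip(x, grad)]
    return best, best_loss


x0 = [20.0, 10.0, 13.0]
baseline = distortion(decode(encode(x0)), x0)
substitute, loss = find_substitute(x0, step_decay(0.1, 0.5, 10))
symbols = encode(substitute)  # encoded in lieu of encode(x0)
print(baseline, loss, substitute, symbols)
```

In this toy setting the decoder is never changed; only the image fed to the encoder is adjusted, which is the sense in which the substitute replaces the input.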
Opening claim text (preview).
What is claimed is:
1. A method of substitutional end-to-end (E2E) neural image compression (NIC) using a neural network performed by at least one processor, the method comprising: receiving an input image to an E2E NIC framework; mapping from the input image x_0 in a high dimensional space to a bit-stream with length R(x_0); mapping the bit-stream with the length R(x_0) to a compressed representation ; determining whether there exists a substitution x′_0 that is mapped to a substitution compressed representation such that a first distance measurement or loss function between and x_0 is less than between a second distance measurement or loss function between and x_0; when the substitution x′_0 that is mapped to that is closer to x_0 given the second distance measurement or loss function exists, determining a substitute image based on a training model and a step size, wherein, the substitute image is different from the input image; encoding the substitute image to generate a compressed representation of the substitute image; and outputting, as an encoding of the input image for decoding by a decoder, the compressed representation of the substitute image, wherein the compressed representation of the substitute image replaces a compressed representation of the input image in the E2E NIC framework.
2. The method according to claim 1, wherein the substitute image is determined by performing an optimization process of the training model, comprising: adjusting elements of the input image to generate substitute representations; and selecting the elements with a least distortion loss between the input image and the substitute representations to use as the substitute image.
3. The method according to claim 1, wherein the substitute image maps to the input image.
4. The method according to claim 1, wherein the training model is trained based on the determined step size, a number of updates to the input image, and a distortion loss; and wherein the step size can be increasing, decreasing, or kept the same for one or more iterations of the training model.
5. The method according to claim 1, wherein a plurality of substitute images are determined based on a plurality of step sizes, wherein step size values corresponding to the plurality of step sizes are determined based on a plurality of schedulers, and wherein a substitution image with a highest compression performance is selected for encoding.
6. The method according to claim 5, further comprising: splitting the input image into one or more patches, wherein each of the one or more patches is assigned a scheduler from the plurality of schedulers.
7. An apparatus for substitutional end-to-end (E2E) neural image compression (NIC) using a neural network, the apparatus comprising: at least one memory configured to store program code; and at least one processor configured to read the program code and operate as instructed by the program code, the program code comprising: receiving code configured to cause the at least one processor to receive an input image to an E2E NIC framework; mapping code configured to cause the at least one processor to map, from the input image x_0 in a high dimensional space to a bit-stream with length R(x_0), and map the bit-stream with the length R(x_0) to a compressed representation ; first determining code configured to cause the at least one processor to determine whether there exists a substitution x′_0 that is mapped to a substitution compressed representation such that a first distance measurement or loss function between and x_0 is less than between a second distance measurement or loss function between and x_0; second determining code configured to cause the at least one processor to determine, when the substitution x′_0 that is mapped to that is closer to x_0 given the second distance measurement or loss function exists, a substitute image based on a training model and a step size, wherein, the substitute image is different from the input image; encoding code configured to cause the at least one processor to encode the substitute image to generate a compressed representation; and mapping code configured to cause the at least one processor to output, as an encoding of the input image for decoding by a decoder, the compressed representation of the substitute image, wherein the compressed representation of the substitute image replaces a compressed representation of the input image in the E2E NIC framework.
8. The apparatus according to claim 7, wherein the substitute image is determined by performing an optimization process of the training model, comprising: adjusting code configured to cause the at least one processor to adjust elements of the input image to generate substitute representations; and selecting code configured to cause the at least one processor to select the elements with a least distortion loss between the input image and the substitute representations to use as the substitute image.
9. The apparatus according to claim 7, wherein the substitute image maps to the input image.
10. The apparatus according to claim 7, wherein the training model is trained based on the determined step size, a number of updates to the input image, and a distortion loss; and wherein the step size can be increasing, decreasing, or kept the same for one or more iterations of the training model.
11. The apparatus according to claim 7, wherein a plurality of substitute images are determined based on a plurality of step sizes, wherein step size values corresponding to the plurality of step sizes are determined based on a plurality of schedulers, and wherein a substitution image with a highest compression performance is selected for encoding.
12. The apparatus according to claim 11, further comprising: splitting code configured to cause the at least one processor to split the input image into one or more patches; wherein each of the one or more patches is assigned a scheduler from the plurality of schedulers.
13. A non-transitory computer readable medium storing instructions that, when executed by at least one processor for substitutional end-to-end (E2E) neural image compression (NIC), cause the at least one processor to: receive an input image to an E2E NIC framework; map from the input image x_0 in a high dimensional space to a bit-stream with length R(x_0); map the bit-stream with the length R(x_0) to a compressed representation ; determine whether there exists a substitution x′_0 that is mapped to a substitution compressed representation such that a first distance measurement or loss function between and x_0 is less than between a second distance measurement or loss function between and x_0; when the substitution x′_0 that is mapped to that is closer to x_0 given the second distance measurement or loss function exists, determine a substitute image based on a training model and a step size, wherein, the substitute image is different from the input image; encode the substitute image to generate a compressed representation of the substitute image; and output, as an encoding of the input image for decoding by a decoder, the compressed representation of the substitute image, wherein the compressed representation of the substitute image replaces a compressed representation of the input image in the E2E NIC framework.
14. The non-transitory computer readable medium of claim 13, wherein the instructions, when executed by the at least one processor, further cause the at least one processor to performing an optimization process of the training model, compris
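Claims 5, 6, 11 and 12 describe trying several step-size schedulers and assigning a scheduler to each patch. The following is a minimal sketch of one way that could look, under stated assumptions: the toy `roundtrip` codec, the three scheduler functions, the 1-D "image", the patch size, and the reading of "highest compression performance" as lowest distortion (rate is ignored) are all hypothetical choices made here, not details from the patent.

```python
from typing import Callable, Dict, List, Tuple

Image = List[float]
Scheduler = Callable[[int], float]

ENC_GAIN, DEC_GAIN = 0.5, 2.2  # toy codec gains (assumption, not from the patent)


def roundtrip(x: Image) -> Image:
    """Toy encode + decode: scale, round to integer symbols, rescale."""
    return [DEC_GAIN * round(ENC_GAIN * v) for v in x]


def distortion(a: Image, b: Image) -> float:
    """Sum of squared errors between two images."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def optimize(x0: Image, schedule: Scheduler, iters: int = 20) -> Tuple[Image, float]:
    """Gradient search for a substitute of x0, using a straight-through gradient."""
    best, best_loss = list(x0), distortion(roundtrip(x0), x0)
    x = list(x0)
    for t in range(iters):
        recon = roundtrip(x)
        loss = distortion(recon, x0)
        if loss < best_loss:
            best, best_loss = list(x), loss
        grad = [2.0 * (r - o) * ENC_GAIN * DEC_GAIN for r, o in zip(recon, x0)]
        x = [v - schedule(t) * g for v, g in zip(x, grad)]
    return best, best_loss


def split_patches(x: Image, size: int) -> List[Image]:
    """Split a 1-D toy image into consecutive patches of at most `size` samples."""
    return [x[i:i + size] for i in range(0, len(x), size)]


# Hypothetical schedulers: step size kept the same, decreasing, or increasing.
schedulers: Dict[str, Scheduler] = {
    "constant": lambda t: 0.1,
    "decreasing": lambda t: 0.2 * 0.9 ** t,
    "increasing": lambda t: 0.02 * (t + 1),
}

x0 = [20.0, 10.0, 13.0, 20.0]
patches = split_patches(x0, 2)
substitute: Image = []
assigned: List[str] = []
total_loss = 0.0
for patch in patches:
    # Run every scheduler on this patch and keep the lowest-distortion result.
    results = {name: optimize(patch, s) for name, s in schedulers.items()}
    best_name = min(results, key=lambda n: results[n][1])
    best_patch, best_loss = results[best_name]
    substitute.extend(best_patch)
    assigned.append(best_name)
    total_loss += best_loss

baseline = distortion(roundtrip(x0), x0)
print(baseline, total_loss, assigned)
```

Running every scheduler on every patch multiplies encoder-side cost by the number of schedulers; the decoder is unaffected because only the chosen substitute patches are encoded.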
Embedding additional information in the video signal during the compression process (H04N19/517, H04N19/68, H04N19/70 take precedence) · CPC title
the region being a picture, frame or field · CPC title
characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation (H04N19/635 takes precedence) · CPC title
Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks · CPC title
the unit being bits, e.g. of the compressed video stream · CPC title