Block-based long-range context model in neural image compression

US12495151B2 · US · B2

Patent metadata
Publication number: US-12495151-B2
Application number: US-202318458507-A
Country: US
Kind code: B2
Filing date: Aug 30, 2023
Priority date: Dec 27, 2022
Publication date: Dec 9, 2025
Grant date: Dec 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatuses for decoding a compressed image using a neural image compression network may be provided. The method may include generating long-range context model parameters associated with a high resolution compressed image, the long-range context model parameters corresponding to a first area. The method may also include splitting the generated long-range context model parameters into a first number of context parameter blocks. The method may also include for each block in the first number of context parameter blocks, predicting respective context features using a long-range context model and respective context parameter blocks, wherein the long-range context model uses a corner-to-center latent decoding strategy or an edge-to-center latent decoding strategy to decode latents associated with the high resolution compressed image. Then, the high resolution compressed image may be reconstructed based on predicted context features.
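
As an illustration of the split-then-predict flow the abstract describes, the following Python sketch cuts a toy 2D parameter grid into rectangular blocks and runs a placeholder predictor on each block. It is not the patented implementation: `split_into_blocks`, `predict_context`, and `predict_all` are hypothetical names, the block-mean predictor merely stands in for the long-range context model, and the fixed block size is an assumption made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(grid, block_h, block_w):
    """Split a 2D grid (list of rows) into rectangular blocks in raster order.

    Blocks on the right and bottom borders may be smaller when the grid size
    is not a multiple of the block size.
    """
    height, width = len(grid), len(grid[0])
    blocks = []
    for top in range(0, height, block_h):
        for left in range(0, width, block_w):
            block = [row[left:left + block_w] for row in grid[top:top + block_h]]
            blocks.append(((top, left), block))
    return blocks


def predict_context(block):
    """Hypothetical stand-in for the long-range context model: the block mean."""
    values = [v for row in block for v in row]
    return sum(values) / len(values)


def predict_all(blocks, parallel=False):
    """Run the stand-in predictor on every block, in series or in parallel."""
    only_blocks = [block for _, block in blocks]
    if parallel:
        with ThreadPoolExecutor() as pool:
            # pool.map preserves input order, so results line up with blocks.
            return list(pool.map(predict_context, only_blocks))
    return [predict_context(block) for block in only_blocks]


# Toy 4x6 "parameter" grid holding the values 0..23 (assumed data).
params = [[r * 6 + c for c in range(6)] for r in range(4)]
blocks = split_into_blocks(params, block_h=2, block_w=3)
features_serial = predict_all(blocks, parallel=False)
features_parallel = predict_all(blocks, parallel=True)
```

With this toy grid the split yields four 2x3 blocks, and the serial and parallel paths return the same features because each block is processed independently of the others.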

First claim

Opening claim text (preview).

What is claimed is:

1. A method for decoding a compressed image using a neural image compression network, the method being executed by at least one processor, the method comprising: receiving a high resolution compressed image; generating long-range context model parameters associated with the high resolution compressed image, the long-range context model parameters corresponding to a first area; splitting the generated long-range context model parameters into a first number of context parameter blocks; for each block in the first number of context parameter blocks, predicting respective context features using a long-range context model and respective context parameter blocks, wherein the long-range context model uses a corner-to-center latent decoding or an edge-to-center latent decoding to decode latents associated with the high resolution compressed image; and reconstructing the high resolution compressed image based on the predicted context features.

2. The method of claim 1, wherein the predicting the respective context features of each of the first number of context parameter blocks is performed in parallel.

3. The method of claim 1, wherein the predicting the respective context features of each of the first number of context parameter blocks is performed in series.

4. The method of claim 1, wherein predicting the respective context features comprises: predicting first context features using the long-range context model and first long-range context model parameters from a first context parameter block, wherein the first long-range context model parameters correspond to first locations in a first split area, and wherein the first context features that are predicted using the first long-range context model parameters correspond to second locations in the first split area; and predicting second context features using the long-range context model and second long-range context model parameters from the first context parameter block, wherein the second long-range context model parameters correspond to the second locations in the first split area, and wherein the second context features that are predicted using the second long-range context model parameters correspond to third locations in the first split area.

5. The method of claim 4, wherein according to the corner-to-center latent decoding comprises: the first locations in the first split area are corner locations in the first split area, the second locations in the first split area are locations at mid-points between one or more of the corner locations, and the third locations in the first split area are locations at mid-points between one or more of the second locations.

6. The method of claim 4, wherein the edge-to-center latent decoding comprises: the first locations in the first split area being edge locations in the first split area, the second locations in the first split area being locations at mid-points between one or more of the edge locations or corner locations associated respective edges of the first split area, and the third locations in the first split area being locations at mid-points between one or more of the second locations.

7. The method of claim 1, wherein each of the first number of split areas has a different shape, and wherein each of the first number of split areas is one of a square area or a rectangle area.

8. The method of claim 1, wherein the long-range context model is a transformer-based context prediction model.

9. An apparatus for decoding a compressed image using a neural image compression network, the apparatus comprising: at least one memory configured to store computer program code; and at least one processor configured to read the computer program code and operate as instructed by the computer program code, the computer program code including: receiving code configured to cause the at least one processor to receive a high resolution compressed image; generating code configured to cause the at least one processor to generate long-range context model parameters associated with the high resolution compressed image, the long-range context model parameters corresponding to a first area; splitting code configured to cause the at least one processor to split the generated long-range context model parameters into a first number of context parameter blocks; predicting code configured to cause the at least one processor to predict, for each block in the first number of context parameter blocks, respective context features using a long-range context model and respective context parameter blocks, wherein the long-range context model uses a corner-to-center latent decoding or an edge-to-center latent decoding to decode latents associated with the high resolution compressed image; and reconstructing code configured to cause the at least one processor to reconstruct the high resolution compressed image based on the predicted context features.

10. The apparatus of claim 9, wherein the predicting the respective context features of each of the first number of context parameter blocks is performed in parallel.

11. The apparatus of claim 9, wherein the predicting the respective context features of each of the first number of context parameter blocks is performed in series.

12. The apparatus of claim 9, wherein the predicting code is further configured to cause the at least one processor to: predict first context features using the long-range context model and first long-range context model parameters from a first context parameter block, wherein the first long-range context model parameters correspond to first locations in a first split area, and wherein the first context features that are predicted using the first long-range context model parameters correspond to second locations in the first split area; and predict second context features using the long-range context model and second long-range context model parameters from the first context parameter block, wherein the second long-range context model parameters correspond to the second locations in the first split area, and wherein the second context features that are predicted using the second long-range context model parameters correspond to third locations in the first split area.

13. The apparatus of claim 12, wherein according to the corner-to-center latent decoding comprises: the first locations in the first split area are corner locations in the first split area, the second locations in the first split area are locations at mid-points between one or more of the corner locations, and the third locations in the first split area are locations at mid-points between one or more of the second locations.

14. The apparatus of claim 9, wherein each of the first number of split areas has a different shape, and wherein each of the first number of split areas is one of a square area or a rectangle area.

15. A non-transitory computer-readable medium storing instructions that, when executed by at least one processor of an apparatus for decoding a compressed image using a neural image compression network, cause the at least one processor to: receive a high resolution compressed image; generate long-range context model parameters associated with the high resolution compressed image, the long-range context model parameters corresponding to a first area; split the generated long-range context model parameters into a first number of context parameter blocks; for each block in the first number of context parameter blocks, predict respective context features using a long-range context model and r
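
The corner-to-center ordering recited in claims 4 and 5 (corners first, then mid-points between corners, then mid-points between those) can be pictured with a small scheduling sketch. The Python below is only one possible reading of that claim language, not the patent's implementation: the function name `corner_to_center_stages`, the restriction to square blocks of side 2**k + 1, and the rule of halving the grid spacing at each stage are all assumptions made for illustration.

```python
def corner_to_center_stages(size):
    """Group the locations of a size x size block into decoding stages.

    Assumes size = 2**k + 1 so that every mid-point lands on an integer
    coordinate. Stage 0 holds the four corners; each later stage holds the
    locations on a grid with half the previous spacing that are not yet decoded.
    """
    step = size - 1
    if step < 1 or step & (step - 1):
        raise ValueError("size must be 2**k + 1 with k >= 0")
    decoded = set()
    stages = []
    while step >= 1:
        stage = [
            (r, c)
            for r in range(0, size, step)
            for c in range(0, size, step)
            if (r, c) not in decoded
        ]
        decoded.update(stage)
        stages.append(stage)
        step //= 2  # halve the spacing: next stage fills in the mid-points
    return stages


# A 5x5 block (k = 2) is decoded in three stages under this assumed rule.
stages = corner_to_center_stages(5)
for index, stage in enumerate(stages):
    print(f"stage {index}: {len(stage)} locations")
```

Under these assumptions, a 5x5 block is covered in three stages of 4, 5, and 16 locations. Locations within one stage depend only on earlier stages, so a model could in principle predict them together; whether the patented method does so is not stated in the text shown on this page.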

Assignees

Tencent America LLC

Inventors

Classifications

  • Position within a video image, e.g. region of interest [ROI] · CPC title

  • Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks · CPC title

  • the region being a block, e.g. a macroblock · CPC title

  • H04N19/42 · Primary

    characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation (H04N19/635 takes precedence) · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12495151B2 cover?
Methods and apparatuses for decoding a compressed image using a neural image compression network may be provided. The method may include generating long-range context model parameters associated with a high resolution compressed image, the long-range context model parameters corresponding to a first area. The method may also include splitting the generated long-range context model parameters in…
Who is the assignee on this patent?
Tencent America LLC
What technology area does this patent fall under?
Primary CPC classification H04N19/42. Mapped technology areas include Electricity.
When was this patent published?
Publication date Dec 9, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).