Network based image filtering for video coding
US-2024064296-A1 · Feb 22, 2024 · US
US12439092B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12439092-B2 |
| Application number | US-202318205475-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 2, 2023 |
| Priority date | Dec 4, 2020 |
| Publication date | Oct 7, 2025 |
| Grant date | Oct 7, 2025 |
Official abstract text for this publication.
A method and an apparatus for image filtering in video coding using a neural network are provided. The method includes: loading a plurality of input patches associated with a current image to be coded, where the plurality of input patches include a first input patch with a first resolution, a second input patch with a second resolution, and a third input patch with a third resolution; and in response to determining that one resolution in the first resolution, the second resolution, and the third resolution is different from the other two resolutions, adjusting the first resolution, the second resolution, and the third resolution at one region of a plurality of regions before the neural network or in the neural network.
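One possible reading of "adjusting the first resolution, the second resolution, and the third resolution ... before the neural network" is a lossless rearrangement: the larger patch is split into several sub-patches that match the smaller patches, and the split is undone on the output side. The sketch below illustrates that reading only. The function names, the factor of 2, and the 4x4 and 2x2 patch sizes are assumptions chosen for illustration; the abstract does not prescribe them.

```python
from typing import List

Patch = List[List[int]]  # one plane of samples, indexed [row][col]


def split_into_sub_patches(patch: Patch, factor: int = 2) -> List[Patch]:
    """Split an H x W patch into factor*factor sub-patches of size (H/factor) x (W/factor).

    Sub-patch k = dy * factor + dx holds the samples at rows dy, dy+factor, ...
    and columns dx, dx+factor, ... (a space-to-depth rearrangement).
    """
    height, width = len(patch), len(patch[0])
    if height % factor or width % factor:
        raise ValueError("patch dimensions must be divisible by factor")
    return [
        [row[dx::factor] for row in patch[dy::factor]]
        for dy in range(factor)
        for dx in range(factor)
    ]


def merge_sub_patches(sub_patches: List[Patch], factor: int = 2) -> Patch:
    """Inverse of split_into_sub_patches: interleave sub-patches back into one patch."""
    sub_h, sub_w = len(sub_patches[0]), len(sub_patches[0][0])
    merged = [[0] * (sub_w * factor) for _ in range(sub_h * factor)]
    for k, sub in enumerate(sub_patches):
        dy, dx = divmod(k, factor)
        for y in range(sub_h):
            for x in range(sub_w):
                merged[y * factor + dy][x * factor + dx] = sub[y][x]
    return merged


# Hypothetical example: a 4x4 first patch next to 2x2 second and third patches.
first_patch = [[r * 4 + c for c in range(4)] for r in range(4)]
second_patch = [[100, 101], [102, 103]]
third_patch = [[200, 201], [202, 203]]

sub_patches = split_into_sub_patches(first_patch)
network_input = sub_patches + [second_patch, third_patch]  # six 2x2 planes
restored = merge_sub_patches(sub_patches)  # output-side step, back to 4x4
```

After the split, all six planes share one resolution and can be stacked as channels. Because the rearrangement discards no samples, merging the sub-patches recovers the original patch exactly.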
Opening claim text (preview).
What is claimed is:
1. A method for image filtering in video coding, comprising: loading a plurality of input patches associated with a current image to be coded, wherein the plurality of input patches comprise a first input patch with a first resolution, a second input patch with a second resolution, and a third input patch with a third resolution; and in response to determining that one resolution in the first resolution, the second resolution, and the third resolution is different from the other two resolutions, adjusting the first resolution, the second resolution, and the third resolution at one region of a plurality of regions; wherein the one region of a plurality of regions comprises a first region; the first region comprises a plurality of input layers receiving the plurality of input patches before a neural network for image filtering and a plurality of output layers outputting a plurality of output patches after the neural network; wherein the method further comprising: loading a plurality of quantization parameter (QP) map (QpMap) values at a plurality of QpMap channels, wherein the plurality of QpMap values comprise a first QpMap value at a first QpMap channel, a second QpMap value at a second QpMap channel, and a third QpMap value at a third QpMap channel; and adjusting the first QpMap value, the second QpMap value, and the third QpMap value so that the first QpMap value, the second QpMap value, and the third QpMap value are within a dynamic range; wherein the first QpMap value, the second QpMap value, and the third QpMap value are evenly distributed at different positions of the dynamic range.
2. The method of claim 1, wherein the plurality of regions further comprise a second region in the neural network, and a third region; wherein the second region comprises a plurality of input convolution layers and a plurality of output convolution layers, and the plurality of input convolution layers subsequently follow the plurality of input layers and perform convolution on the plurality of input patches; and wherein the third region comprises a plurality of residual blocks, and the plurality of output convolution layers subsequently follow the plurality of residual blocks.
3. The method of claim 1, wherein adjusting the first resolution, the second resolution, and the third resolution comprises: aligning the first resolution, the second resolution, and the third resolution using the plurality of input layers in the first region.
4. The method of claim 3, wherein aligning the first resolution, the second resolution, and the third resolution comprises: down-sampling the first input patch into a plurality of first input sub-patches having same resolution as the second input patch or the third input patch using the plurality of input layers in the first region; and the method further comprises: combining the plurality of first input sub-patches with the second input patch and the third input patch; generating a plurality of output patches corresponding to the plurality of input patches, wherein the plurality of output patches comprise a plurality of first output sub-patches, a second output patch, and a third output patch; and up-sampling the plurality of first output sub-patches using the plurality of output layers in the first region.
5. The method of claim 3, wherein aligning the first resolution, the second resolution, and the third resolution comprises: up-sampling the second output patch or the third output patch using the plurality of input layers in the first region; and down-sampling the second output patch or the third output patch using the plurality of output layers in the first region.
6. The method of claim 2, further comprising: loading the first input patch into a first input convolution layer in the second region and adjusting, by the first input convolution layer, the first resolution to align with the second resolution or the third resolution; loading the second input patch into a second input convolution layer in the second region and adjusting, by the second input convolution layer, the second resolution to align with the first resolution or the third resolution; and loading the third input patch into a third input convolution layer in the second region and adjusting, by the third input convolution layer, the third resolution to align with the first resolution or the second resolution.
7. The method of claim 6, wherein adjusting the first resolution to align with the second resolution or the third resolution comprises: scaling down, by the first input convolution layer with a stride size greater than a second input convolution layer or a third input convolution layer, the first resolution to align with the second resolution or the third resolution; wherein adjusting the second resolution to align with the first resolution or the third resolution comprises: scaling up, by a pixel shuffle layer in the second region, the second resolution to align with the first resolution or the third resolution; wherein adjusting the third resolution to align with the first resolution or the second resolution comprises: scaling up, by a pixel shuffle layer in the second region, the third resolution to align with the first resolution or the second resolution.
8. The method of claim 7, further comprising: scaling up, by a pixel shuffle layer in the second region, a resolution of a first output patch corresponding to the first input patch such that the first output patch has the same resolution as the first input patch; scaling down, by a resolution increase layer in the second region, a resolution of a second output patch corresponding to the second input patch such that the second output patch has the same resolution as the second input patch; scaling down, by a resolution increase layer in the second region, a resolution of a third output patch corresponding to the third input patch such that the third output patch has the same resolution as the third input patch.
9. The method of claim 2, further comprising: loading the first input patch into a first input convolution layer in the second region, adjusting the first resolution, and generating a first convolution output; loading the second input patch into a second input convolution layer in the second region, adjusting the second resolution, and generating a second convolution output; loading the third input patch into a third input convolution layer in the second region, adjusting the third resolution, and generating a third convolution output; generating a concatenated output by concatenating the first convolution output, the second convolution output, and third convolution output; loading the concatenated output into a first residual block of the plurality of residual blocks and generating a residual output by a last residual block of the plurality of residual blocks; respectively loading the residual output into the plurality of output convolution layers; adjusting, by a first output convolution layer, a resolution of the residual output loaded to the first output convolution layer to align with the first resolution of the first input patch loaded into the first input convolution layer; adjusting, by a second output convolution layer, a resolution of the residual output loaded to the second output convolution layer to align with the second resolution of the second input patch loaded into the second input convolution layer; and adjusting, by a third output convolution layer, a resolution of the residual output loaded to the third output convolution layer to align with the third resolution of the third input patch loaded into the third input convolution layer.
10. The method of claim 2, wherein the neur
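Claim 1 also recites adjusting the QpMap values so that they fall within a dynamic range. A minimal sketch of one way this could be done is a linear mapping of each QP onto a fixed interval. The names `MAX_QP`, `normalize_qp`, and `build_qp_map`, the bound of 63, the target interval [0, 1], and the sample QP values are all assumptions for illustration; the claim text does not specify them. A linear mapping keeps equally spaced QPs equally spaced inside the range, which is one possible interpretation of the "evenly distributed" wording, not a statement of what the claim requires.

```python
MAX_QP = 63  # assumed upper bound of the QP scale; not stated in the claim text


def normalize_qp(qp: int, low: float = 0.0, high: float = 1.0) -> float:
    """Linearly map a QP in [0, MAX_QP] to a value in [low, high]."""
    if not 0 <= qp <= MAX_QP:
        raise ValueError(f"qp must be in [0, {MAX_QP}]")
    return low + (high - low) * qp / MAX_QP


def build_qp_map(qp: int, height: int, width: int) -> list:
    """Fill one QpMap channel of the given size with the normalized QP value."""
    value = normalize_qp(qp)
    return [[value] * width for _ in range(height)]


# Three hypothetical QpMap channels, one per input patch.
qp_maps = [build_qp_map(qp, 2, 2) for qp in (22, 32, 42)]
```

Each channel here is a constant plane with the same height and width as the aligned input patches, so it can be stacked alongside them as an extra input channel.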
Activation functions · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Combinations of networks · CPC title
using neural networks · CPC title
characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation (H04N19/635 takes precedence) · CPC title