Method and apparatus for video coding using deep learning based in-loop filter for inter prediction

US2026039825A1 · US · A1

Patent metadata
Publication number: US-2026039825-A1
Application number: US-202519352059-A
Country: US
Kind code: A1
Filing date: Oct 7, 2025
Priority date: Mar 31, 2021
Publication date: Feb 5, 2026
Grant date: (not listed)

Abstract

Official abstract text for this publication.

A method and an apparatus for video coding using a deep learning-based in-loop filter for inter-prediction are disclosed. The video coding method and the apparatus utilize a deep learning-based in-loop filter for inter-prediction of a predictive frame (P-frame) and a bi-predictive frame (B-frame) in order to mitigate various levels of image distortion according to a QP (quantization parameter) value present in the P-frame and the B-frame.

First claim

Opening claim text (preview).

What is claimed is:

1. A method for filtering an image area performed by a video decoding device, the method comprising: obtaining an image area having been reconstructed and a quantization parameter of the image area; generating an embedding vector based on the quantization parameter and prediction type information of the image area; and generating a filtered image area of the image area based on the embedding vector by using a denoising model that is based on deep learning, wherein the prediction type information includes at least one of an intra prediction type in which the image area is predicted independently, a predictive type in which the image area is predicted based on a reference image area in a single direction, or a bi-predictive type in which the image area is predicted based on at least one reference image area in bi-directions, and wherein the denoising model includes one or more convolution layers.

2. The method of claim 1, wherein the image area is a predictive frame (P-frame) or a bi-predictive frame (B-frame) reconstructed according to an inter-prediction.

3. The method of claim 1, wherein generating the embedding vector includes: generating the embedding vector using an embedding function including an embedding layer and a plurality of fully-connected layers.

4. The method of claim 3, wherein the embedding function takes as input all or part of the quantization parameter, a Lagrange multiplier for calculating rate distortion, a temporal layer of the image area, a type of the image area, or any combination thereof.

5. The method of claim 1, wherein the denoising model includes a cascaded structure of residual blocks (RBs) and convolutional layers and uses the cascaded structure to generate the filtered image area, and each RB is a convolutional block having a skip path between an input and an output.

6. The method of claim 5, wherein generating the filtered image area includes multiplying a feature generated by a preset convolutional layer among the convolutional layers by an absolute value of the embedding vector.

7. The method of claim 1, wherein the denoising model includes: a U-net that is a deep learning model configured to generate an offset of a kernel from the image area; a sampler configured to sample the image area by using the offset; convolutional layers configured to generate a calibrated kernel from the image area, an output feature map of the U-net, and a sampled image area; and an output convolutional layer configured to apply convolution to the sampled image area by using the calibrated kernel to generate the filtered image area.

8. The method of claim 7, wherein generating the filtered image area includes multiplying the calibrated kernel by an absolute value of the embedding vector.

9. The method of claim 1, wherein the denoising model further includes combinatorial convolutional layers, the denoising model generates residual signals between the image area and the filtered image area by using an absolute value of the embedding vector and the combinatorial convolutional layers, and the denoising model sums the residual signals and the filtered image area.

10. A method performed by a video encoding device for filtering an image area, the method comprising: obtaining an image area having been reconstructed and a quantization parameter of the image area; generating an embedding vector based on the quantization parameter and a prediction type information of the image area; and generating a filtered image area based on the embedding vector by using a denoising model that is based on deep learning, wherein the prediction type information includes at least one of an intra prediction type in which the image area is predicted independently, a predictive type in which the image area is predicted based on a reference image area in a single direction, or a bi-predictive type in which the image area is predicted based on at least one reference image area in bi-directions, and wherein the denoising model includes one or more convolution layers.

11. The method of claim 10, wherein obtaining the image area and the quantization parameter comprises: obtaining as the image area a predictive frame (P-frame) or a bi-predictive frame (B-frame) reconstructed according to an inter-prediction.

12. The method of claim 10, wherein generating the embedding vector includes: generating the embedding vector using an embedding function including an embedding layer and a plurality of fully-connected layers.

13. The method of claim 10, wherein the denoising model includes a cascaded structure of residual blocks (RBs) and convolutional layers and uses the cascaded structure to generate the filtered image area, and each RB is a convolutional block having a skip path between an input and an output.

14. The method of claim 13, wherein generating the filtered image area comprises: multiplying a feature generated by a preset convolutional layer among the convolutional layers by an absolute value of the embedding vector.

15. A method of storing a bitstream of a video into a non-transitory computer-readable recording medium, wherein the bitstream is generated by a video encoding method, and the video encoding method comprises: obtaining an image area having been reconstructed and a quantization parameter of the image area; generating an embedding vector based on the quantization parameter and a prediction type information of the image area; and generating a filtered image area based on the embedding vector by using a denoising model that is based on deep learning, wherein the prediction type information includes at least one of an intra prediction type in which the image area is predicted independently, a predictive type in which the image area is predicted based on a reference image area in a single direction, or a bi-predictive type in which the image area is predicted based on at least one reference image area in bi-directions, and wherein the denoising model includes one or more convolution layers.
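
Illustrative sketch (not part of the patent text). To make the data flow in claims 1, 3, 5 and 6 easier to follow, the following pure-Python example shows one way the pieces might connect: an embedding function (a lookup "embedding layer" followed by two fully-connected layers) turns the quantization parameter and prediction type into a vector, and the absolute value of that vector scales the feature channels of a small convolutional denoiser containing one residual block. Every name, the channel count, the assumed QP range of 0 to 63, the position of the scaling step, and the random untrained weights are assumptions made for illustration; the patent does not disclose this code, and a real filter would use trained weights.

```python
import random

PRED_TYPES = {"I": 0, "P": 1, "B": 2}   # intra, predictive, bi-predictive
CHANNELS = 4                            # hypothetical feature width
MAX_QP = 63                             # assumed QP range 0..63
rng = random.Random(0)                  # fixed seed: untrained placeholder weights


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def rand_kernels(c_out, c_in):
    return [[rand_matrix(3, 3) for _ in range(c_in)] for _ in range(c_out)]


def dense(vec, weights):
    """Fully-connected layer without bias."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


# Embedding layer: one row per (QP, prediction type) pair, then two FC layers.
EMBED_TABLE = rand_matrix((MAX_QP + 1) * len(PRED_TYPES), CHANNELS)
FC1, FC2 = rand_matrix(CHANNELS, CHANNELS), rand_matrix(CHANNELS, CHANNELS)
K_IN = rand_kernels(CHANNELS, 1)
K_RB1, K_RB2 = rand_kernels(CHANNELS, CHANNELS), rand_kernels(CHANNELS, CHANNELS)
K_OUT = rand_kernels(1, CHANNELS)


def embedding_vector(qp, pred_type):
    row = EMBED_TABLE[qp * len(PRED_TYPES) + PRED_TYPES[pred_type]]
    hidden = [max(0.0, v) for v in dense(row, FC1)]   # ReLU
    return dense(hidden, FC2)


def conv3x3(feats, kernels):
    """feats: [C_in][H][W]; kernels: [C_out][C_in][3][3]; zero padding, no bias."""
    h, w = len(feats[0]), len(feats[0][0])
    out = []
    for k_out in kernels:
        plane = [[0.0] * w for _ in range(h)]
        for c, k in enumerate(k_out):
            for y in range(h):
                for x in range(w):
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            yy, xx = y + dy, x + dx
                            if 0 <= yy < h and 0 <= xx < w:
                                plane[y][x] += k[dy + 1][dx + 1] * feats[c][yy][xx]
        out.append(plane)
    return out


def residual_block(feats, k1, k2):
    """Convolutional block with a skip path from input to output."""
    mid = [[[max(0.0, v) for v in row] for row in p] for p in conv3x3(feats, k1)]
    branch = conv3x3(mid, k2)
    return [[[a + b for a, b in zip(ra, rb)] for ra, rb in zip(pa, pb)]
            for pa, pb in zip(feats, branch)]


def filter_area(area, qp, pred_type):
    """area: reconstructed image area as a 2-D list of floats."""
    emb = embedding_vector(qp, pred_type)
    feats = conv3x3([area], K_IN)
    # Scale each feature channel by the absolute value of the embedding vector.
    feats = [[[abs(e) * v for v in row] for row in plane]
             for e, plane in zip(emb, feats)]
    feats = residual_block(feats, K_RB1, K_RB2)
    return conv3x3(feats, K_OUT)[0]
```

Calling `filter_area(area, 32, "P")` on a small 2-D list of sample values returns a list of the same height and width. Because the same convolution weights serve every QP and prediction type, only the embedding vector changes how strongly each feature channel contributes, which is how this sketch models the claims' conditioning on QP.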

Assignees

Hyundai Motor Co Ltd, Kia Corp, Univ Ewha Ind Collaboration

Inventors

Classifications

  • Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction

  • Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking

  • Feature selection, e.g. selecting representative features from a multi-dimensional feature space

  • Denoising; Smoothing

  • using machine learning, e.g. neural networks

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2026039825A1 cover?
A method and an apparatus for video coding using a deep learning-based in-loop filter for inter-prediction are disclosed. The video coding method and the apparatus utilize a deep learning-based in-loop filter for inter-prediction of a predictive frame (P-frame) and a bi-predictive frame (B-frame) in order to mitigate various levels of image distortion according to a QP (quantization parameter) …
Who is the assignee on this patent?
Hyundai Motor Co Ltd, Kia Corp, Univ Ewha Ind Collaboration
What technology area does this patent fall under?
Primary CPC classification H04N19/124. Mapped technology areas include Electricity.
When was this patent published?
Publication date Feb 5, 2026 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).