US2026039825A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2026039825-A1 |
| Application number | US-202519352059-A |
| Country | US |
| Kind code | A1 |
| Filing date | Oct 7, 2025 |
| Priority date | Mar 31, 2021 |
| Publication date | Feb 5, 2026 |
| Grant date | — |
Official abstract text for this publication.
A method and an apparatus for video coding using a deep learning-based in-loop filter for inter-prediction are disclosed. The video coding method and the apparatus utilize a deep learning-based in-loop filter for inter-prediction of a predictive frame (P-frame) and a bi-predictive frame (B-frame) in order to mitigate various levels of image distortion according to a QP (quantization parameter) value present in the P-frame and the B-frame.
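To make the link between QP and distortion level concrete, the sketch below uses the well-known H.264/AVC and HEVC relation in which the quantization step size roughly doubles for every increase of 6 in QP. The publication does not name a codec in the text shown here, so this formula and the function name `qstep` are illustrative assumptions, not part of the disclosure.

```python
def qstep(qp: int) -> float:
    """Approximate quantization step size for a given QP.

    Assumption: the H.264/HEVC-style relation Qstep = 2 ** ((QP - 4) / 6).
    A larger step means coarser quantization and stronger distortion.
    """
    return 2.0 ** ((qp - 4) / 6.0)


if __name__ == "__main__":
    # Frames coded at different QPs carry very different distortion levels,
    # which is why a single fixed filter strength is a poor fit.
    for qp in (22, 28, 34):
        print(qp, qstep(qp))
```

A filter conditioned on QP can, in principle, adapt its strength to this range instead of being trained separately for each QP value.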
Opening claim text (preview).
What is claimed is:
1. A method for filtering an image area performed by a video decoding device, the method comprising: obtaining an image area having been reconstructed and a quantization parameter of the image area; generating an embedding vector based on the quantization parameter and prediction type information of the image area; and generating a filtered image area of the image area based on the embedding vector by using a denoising model that is based on deep learning, wherein the prediction type information includes at least one of an intra prediction type in which the image area is predicted independently, a predictive type in which the image area is predicted based on a reference image area in a single direction, or a bi-predictive type in which the image area is predicted based on at least one reference image area in bi-directions, and wherein the denoising model includes one or more convolution layers.
2. The method of claim 1, wherein the image area is a predictive frame (P-frame) or a bi-predictive frame (B-frame) reconstructed according to an inter-prediction.
3. The method of claim 1, wherein generating the embedding vector includes: generating the embedding vector using an embedding function including an embedding layer and a plurality of fully-connected layers.
4. The method of claim 3, wherein the embedding function takes as input all or part of the quantization parameter, a Lagrange multiplier for calculating rate distortion, a temporal layer of the image area, a type of the image area, or any combination thereof.
5. The method of claim 1, wherein the denoising model includes a cascaded structure of residual blocks (RBs) and convolutional layers and uses the cascaded structure to generate the filtered image area, and each RB is a convolutional block having a skip path between an input and an output.
6. The method of claim 5, wherein generating the filtered image area includes multiplying a feature generated by a preset convolutional layer among the convolutional layers by an absolute value of the embedding vector.
7. The method of claim 1, wherein the denoising model includes: a U-net that is a deep learning model configured to generate an offset of a kernel from the image area; a sampler configured to sample the image area by using the offset; convolutional layers configured to generate a calibrated kernel from the image area, an output feature map of the U-net, and a sampled image area; and an output convolutional layer configured to apply convolution to the sampled image area by using the calibrated kernel to generate the filtered image area.
8. The method of claim 7, wherein generating the filtered image area includes multiplying the calibrated kernel by an absolute value of the embedding vector.
9. The method of claim 1, wherein the denoising model further includes combinatorial convolutional layers, the denoising model generates residual signals between the image area and the filtered image area by using an absolute value of the embedding vector and the combinatorial convolutional layers, and the denoising model sums the residual signals and the filtered image area.
10. A method performed by a video encoding device for filtering an image area, the method comprising: obtaining an image area having been reconstructed and a quantization parameter of the image area; generating an embedding vector based on the quantization parameter and a prediction type information of the image area; and generating a filtered image area based on the embedding vector by using a denoising model that is based on deep learning, wherein the prediction type information includes at least one of an intra prediction type in which the image area is predicted independently, a predictive type in which the image area is predicted based on a reference image area in a single direction, or a bi-predictive type in which the image area is predicted based on at least one reference image area in bi-directions, and wherein the denoising model includes one or more convolution layers.
11. The method of claim 10, wherein obtaining the image area and the quantization parameter comprises: obtaining as the image area a predictive frame (P-frame) or a bi-predictive frame (B-frame) reconstructed according to an inter-prediction.
12. The method of claim 10, wherein generating the embedding vector includes: generating the embedding vector using an embedding function including an embedding layer and a plurality of fully-connected layers.
13. The method of claim 10, wherein the denoising model includes a cascaded structure of residual blocks (RBs) and convolutional layers and uses the cascaded structure to generate the filtered image area, and each RB is a convolutional block having a skip path between an input and an output.
14. The method of claim 13, wherein generating the filtered image area comprises: multiplying a feature generated by a preset convolutional layer among the convolutional layers by an absolute value of the embedding vector.
15. A method of storing a bitstream of a video into a non-transitory computer-readable recording medium, wherein the bitstream is generated by a video encoding method, and the video encoding method comprises: obtaining an image area having been reconstructed and a quantization parameter of the image area; generating an embedding vector based on the quantization parameter and a prediction type information of the image area; and generating a filtered image area based on the embedding vector by using a denoising model that is based on deep learning, wherein the prediction type information includes at least one of an intra prediction type in which the image area is predicted independently, a predictive type in which the image area is predicted based on a reference image area in a single direction, or a bi-predictive type in which the image area is predicted based on at least one reference image area in bi-directions, and wherein the denoising model includes one or more convolution layers.
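The embedding and modulation steps recited in claims 1, 3, 5 and 6 can be illustrated with a minimal pure-Python sketch. It is not the applicant's implementation: every name (`make_embedding_fn`, `residual_block`, `PRED_TYPES`), the table sizes, the ReLU activation, the random untrained weights, and the use of a per-channel 3x3 convolution are assumptions chosen only to keep the example short.

```python
import random

# Hypothetical indices for the three prediction types named in claim 1.
PRED_TYPES = {"I": 0, "P": 1, "B": 2}  # intra, predictive, bi-predictive


def rand_matrix(rows, cols, rng):
    """Random placeholder weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def dense(x, weights):
    """Bias-free fully-connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def make_embedding_fn(channels, num_qp=64, seed=0):
    """Embedding layer (lookup tables) followed by two fully-connected layers."""
    rng = random.Random(seed)
    qp_table = rand_matrix(num_qp, channels, rng)
    type_table = rand_matrix(len(PRED_TYPES), channels, rng)
    fc1 = rand_matrix(channels, 2 * channels, rng)
    fc2 = rand_matrix(channels, channels, rng)

    def embed(qp, pred_type):
        # Concatenate the two looked-up rows, then apply FC -> ReLU -> FC.
        x = qp_table[qp] + type_table[PRED_TYPES[pred_type]]
        hidden = [max(0.0, v) for v in dense(x, fc1)]
        return dense(hidden, fc2)

    return embed


def conv3x3(plane, kernel):
    """Single-channel 3x3 filter with zero padding (same output size).

    Like most deep learning libraries, this is a cross-correlation.
    """
    h, w = len(plane), len(plane[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy + 1][dx + 1] * plane[yy][xx]
            out[y][x] = acc
    return out


def residual_block(features, kernels, embedding):
    """Per channel c: out = in + |e_c| * conv(in).

    The `in +` term is the skip path; |e_c| scales the convolved feature.
    """
    out = []
    for plane, kernel, e in zip(features, kernels, embedding):
        conv = conv3x3(plane, kernel)
        scale = abs(e)
        out.append([[p + scale * c for p, c in zip(prow, crow)]
                    for prow, crow in zip(plane, conv)])
    return out


if __name__ == "__main__":
    embed = make_embedding_fn(channels=2)
    e = embed(qp=32, pred_type="B")
    features = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
    rng = random.Random(1)
    kernels = [rand_matrix(3, 3, rng) for _ in features]
    print(residual_block(features, kernels, e))
```

Taking the absolute value keeps the scale factor non-negative, so the embedding controls only how strongly the convolved feature is added back, not its sign. A trained model would stack several such blocks and use full multi-channel convolutions.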
Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction · CPC title
Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking · CPC title
Feature selection, e.g. selecting representative features from a multi-dimensional feature space · CPC title
Denoising; Smoothing · CPC title
using machine learning, e.g. neural networks · CPC title