System and method for video coding
US-2020084452-A1 · Mar 12, 2020 · US
US12136186B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12136186-B2 |
| Application number | US-202217687298-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 4, 2022 |
| Priority date | Sep 6, 2019 |
| Publication date | Nov 5, 2024 |
| Grant date | Nov 5, 2024 |
Official abstract text for this publication.
Example image processing methods and apparatuses are disclosed. One example image processing method includes obtaining a video stream, where the video stream includes a first frame of image, a second frame of image, and a third frame of image that are adjacent in time sequence. The video stream can then be decoded to obtain a first alignment frame, a second alignment frame, and at least one residual between the first frame of image, the second frame of image, and the third frame of image. At least one residual frame can then be generated based on the at least one residual. Super resolution processing can then be performed on the second frame of image based on the at least one residual frame, the first alignment frame, and the second alignment frame to obtain an additional second frame of image obtained after super resolution.
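The relationship between a motion-compensated block and its residual can be illustrated with a small sketch. It is not the patented decoder: it is a toy model, assuming frames are plain 2D lists of integer pixel values, one motion vector per block, and no bounds checking. The names `get_block`, `residual_block`, and `reconstruct_block` are hypothetical and do not come from the publication.

```python
def get_block(frame, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def residual_block(prev_frame, next_frame, top, left, size, mv):
    """Pixel difference for the block at (top, left) in next_frame.

    mv = (dy, dx) is the assumed displacement of the block from prev_frame
    to next_frame, so the source block sits at (top - dy, left - dx).
    """
    dy, dx = mv
    predicted = get_block(prev_frame, top - dy, left - dx, size)
    actual = get_block(next_frame, top, left, size)
    return [[a - p for a, p in zip(arow, prow)]
            for arow, prow in zip(actual, predicted)]


def reconstruct_block(prev_frame, residual, top, left, size, mv):
    """Rebuild the block in the later frame as motion-compensated block plus residual."""
    dy, dx = mv
    predicted = get_block(prev_frame, top - dy, left - dx, size)
    return [[p + d for p, d in zip(prow, drow)]
            for prow, drow in zip(predicted, residual)]


# Toy data: a 2x2 block moves one pixel to the right and one pixel brightens by 1.
prev_frame = [
    [10, 20, 0, 0],
    [30, 40, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
next_frame = [
    [0, 10, 21, 0],
    [0, 30, 40, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
mv = (0, 1)  # (dy, dx): hypothetical motion vector for this block

delta = residual_block(prev_frame, next_frame, 0, 1, 2, mv)
rebuilt = reconstruct_block(prev_frame, delta, 0, 1, 2, mv)
# delta == [[0, 1], [0, 0]]; rebuilt equals the block at (0, 1) in next_frame
```

In this toy model the residual is zero wherever motion compensation alone predicts the later frame exactly, so the nonzero entries mark the detail that motion compensation could not predict.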
Opening claim text (preview).
What is claimed is:
1. An image processing method, wherein the method comprises: obtaining a video stream, wherein the video stream comprises a first frame of image, a second frame of image, and a third frame of image that are adjacent in time sequence; decoding the video stream to obtain a first alignment frame, a second alignment frame, and at least one residual between the first frame of image, the second frame of image, and the third frame of image, wherein the first alignment frame is generated after the first frame of image moves a pixel block towards the second frame of image based on a first motion vector, the second alignment frame is generated after the third frame of image moves a pixel block towards the second frame of image based on a second motion vector, and each residual of the at least one residual is a pixel difference between a first macroblock in a previous frame of image and a second macroblock corresponding to the first macroblock in a subsequent frame of image after the previous frame of image performs motion compensation towards the subsequent frame of image based on a motion vector; generating at least one residual frame based on the at least one residual; and performing super resolution processing on the second frame of image based on the at least one residual frame, the first alignment frame, and the second alignment frame to obtain an additional second frame of image obtained after super resolution, wherein performing the super resolution processing comprises: generating high-frequency information based on the at least one residual frame; generating a luminance channel based on the first alignment frame, the second alignment frame, and the second frame of image; and merging the high-frequency information with the luminance channel to generate the additional second frame of image.
2. The method according to claim 1, wherein generating the at least one residual frame based on the at least one residual comprises: generating a first residual frame based on a first residual, wherein the first residual and the first alignment frame satisfy the following relationship: $I_i^{(t_2)} = I_{i-T_i^{(t_2)}}^{(t_1)} + \Delta_i^{(t_2)}$, wherein $I_i^{(t_2)}$ represents the first alignment frame, $I_{i-T_i^{(t_2)}}^{(t_1)}$ represents the second frame of image, $\Delta_i^{(t_2)}$ represents the first residual, $i$ represents a macroblock in the first frame of image, $i-T_i^{(t_2)}$ represents a macroblock obtained after the macroblock $i$ moves based on a motion vector $T_i^{(t_2)}$ corresponding to the macroblock $i$, $t_1$ represents a generation moment of the first frame of image, and $t_2$ represents a generation moment of the second frame of image.
3. The method according to claim 1, wherein performing the super resolution processing comprises: inputting the at least one residual frame to a neural network for feature extraction to obtain at least one first feature map; inputting the first alignment frame, the second alignment frame, and the second frame of image to the neural network for feature extraction to obtain at least one second feature map; inputting the at least one first feature map to a first super resolution network for processing to generate the high-frequency information; and inputting the at least one second feature map to a second super resolution network for processing to generate the luminance channel.
4. The method according to claim 3, before inputting the at least one residual frame to the neural network for feature extraction, further comprising: determining a macroblock in a region of interest in a first residual frame, wherein the macroblock in the region of interest is a macroblock that is in a current macroblock and in which a sum of all pixel values exceeds a preset value; and determining a region of interest in a remaining residual frame other than the first residual frame in the at least one residual frame based on the macroblock in the region of interest in the first residual frame, wherein inputting the at least one residual frame to the neural network for feature extraction comprises: inputting macroblocks in all regions of interest in the at least one residual frame to the neural network for feature extraction, wherein the at least one residual frame comprises the first residual frame and the remaining residual frame.
5. The method according to claim 4, wherein inputting the first alignment frame, the second alignment frame, and the second frame of image to the neural network for feature extraction comprises: inputting macroblocks in regions of interest in the first alignment frame and the second alignment frame, and the second frame of image to the neural network for feature extraction, wherein each of the regions of interest in the first alignment frame and the second alignment frame is the same as the region of interest in the first residual frame.
6. The method according to claim 1, wherein: the first frame of image, the second frame of image, and the third frame of image are three frames of images in a first group of pictures
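The region-of-interest selection described in claims 4 and 5 can be sketched as follows. This is only an illustration under stated assumptions: the residual frame is a 2D list of non-negative integers (signed residuals would likely need absolute values, which the claim text does not specify), macroblocks lie on a fixed non-overlapping grid, and the helper names `roi_macroblocks` and `crop_blocks` are hypothetical rather than taken from the publication.

```python
def roi_macroblocks(residual_frame, block_size, threshold):
    """Return (top, left) corners of macroblocks whose pixel sum exceeds threshold."""
    height = len(residual_frame)
    width = len(residual_frame[0])
    selected = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            total = sum(
                residual_frame[r][c]
                for r in range(top, min(top + block_size, height))
                for c in range(left, min(left + block_size, width))
            )
            if total > threshold:
                selected.append((top, left))
    return selected


def crop_blocks(frame, positions, block_size):
    """Extract the blocks at the given positions from another frame."""
    return [
        [row[left:left + block_size] for row in frame[top:top + block_size]]
        for top, left in positions
    ]


# Toy first residual frame: only the top-right 2x2 block carries much energy.
first_residual = [
    [0, 0, 5, 6],
    [0, 1, 7, 8],
    [0, 0, 0, 0],
    [2, 0, 0, 0],
]
# Toy alignment frame that reuses the same region of interest.
alignment_frame = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

roi = roi_macroblocks(first_residual, block_size=2, threshold=3)
patches = crop_blocks(alignment_frame, roi, block_size=2)
# roi == [(0, 2)]; patches == [[[3, 4], [7, 8]]]
```

Reusing the positions found in the first residual frame for the other frames mirrors the claim language that the regions of interest in the alignment frames are the same as in the first residual frame. In this sketch only the selected blocks would be passed on for feature extraction.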
using pre-processing or post-processing specially adapted for video compression · CPC title
Processing of motion vectors · CPC title
the region being a block, e.g. a macroblock · CPC title
based on super-resolution, i.e. the output image resolution being higher than the sensor resolution · CPC title
Image enhancement or restoration · CPC title