Encoder, decoder, encoding method, and decoding method
US-2020059669-A1 · Feb 20, 2020 · US
US10944996B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10944996-B2 |
| Application number | US-201916544557-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 19, 2019 |
| Priority date | Aug 19, 2019 |
| Publication date | Mar 9, 2021 |
| Grant date | Mar 9, 2021 |
Official abstract text for this publication.
Techniques related to providing high perceptual quality video from highly compressed and decompressed reconstructed video are discussed. Such techniques include applying a pretrained decompression upsampling portion of a generative adversarial network to the decompressed reconstructed video to upsample and improve the perceptual quality of the decompressed reconstructed video to generate output video.
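The data flow in the abstract (decode, then apply an upsampler to the decoded picture, then output) can be sketched as follows. This is only an illustration: the `Picture` type, `postprocess`, and `nearest_neighbor_upsample` are hypothetical names, and the nearest-neighbor function is a plain stand-in for the pretrained generator, which the source does not specify at this level. It reproduces the resolution change but not the perceptual quality improvement.

```python
from typing import Callable, List

# A deliberately tiny picture model: rows of luma samples (assumption).
Picture = List[List[int]]


def nearest_neighbor_upsample(picture: Picture, factor: int = 2) -> Picture:
    """Stand-in for the pretrained generator: repeat each sample factor x factor times."""
    out: Picture = []
    for row in picture:
        wide = [sample for sample in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def postprocess(decoded: Picture, upsampler: Callable[[Picture], Picture]) -> Picture:
    """Apply the upsampler to a decoded picture and check that the resolution grew."""
    result = upsampler(decoded)
    if len(result) <= len(decoded) or len(result[0]) <= len(decoded[0]):
        raise ValueError("second resolution must exceed the first")
    return result


# Pretend this 2x2 picture came out of a standard video decoder.
decoded = [[10, 20], [30, 40]]
output = postprocess(decoded, nearest_neighbor_upsample)
```

In a real system the `upsampler` argument would be the trained generator half of the GAN; the discriminator is only needed during training and does not appear on the decode side.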
Opening claim text (preview).
What is claimed is:

1. An apparatus comprising: a memory to store at least a portion of a received bitstream; and one or more processors coupled to the memory, the one or more processors to: decode the received bitstream to generate a first video picture of a decoded video stream, wherein the first video picture comprises a first resolution; apply a pretrained decompression upsampling portion of a generative adversarial network to the first video picture to upsample and improve the perceptual quality of the first video picture to generate a second video picture, wherein the second video picture comprises a second resolution greater than the first resolution, and wherein the pretrained decompression upsampling portion comprises a plurality of first residue blocks followed by a plurality of upsample blocks followed by at least one second residue block; and output the second picture.
2. The apparatus of claim 1, wherein the decoded video stream comprises a first video stream of a plurality of contemporaneous video streams attained from a corresponding plurality of cameras trained on a scene.
3. The apparatus of claim 1, wherein each of the first and second residue blocks comprises a convolutional layer, a rectified linear unit layer, and a summing layer and each upsample block comprises a transposed convolutional layer and a rectified linear unit layer, and wherein the at least one second residue block is followed by a convolutional layer.
4. The apparatus of claim 1, the one or more processors further to: receive a first region indicator corresponding to a first region of the first video picture and a second region indicator corresponding to a second region of the first video picture, wherein the pretrained decompression upsampling portion is applied only to the first region in response to the first region indicator and the pretrained decompression upsampling portion is not applied to the second region in response to the second region indicator.
5. The apparatus of claim 1, the one or more processors further to: receive a first region indicator corresponding to a first region of the first video picture and a second region indicator corresponding to a second region of the first video picture, wherein the pretrained decompression upsampling portion is applied only to the first region in response to the first region indicator; and apply a second pretrained decompression upsampling portion of a second generative adversarial network to the second region to upsample and apply texture to the second region in response to the second region indicator.
6. At least one non-transitory machine readable medium comprising a plurality of instructions that, in response to being executed on a computing device, cause the computing device to: decode a received bitstream to generate a first video picture of a decoded video stream, wherein the first video picture comprises a first resolution; apply a pretrained decompression upsampling portion of a generative adversarial network to the first video picture to upsample and improve the perceptual quality of the first video picture to generate a second video picture, wherein the second video picture comprises a second resolution greater than the first resolution, and wherein the pretrained decompression upsampling portion comprises a plurality of first residue blocks followed by a plurality of upsample blocks followed by at least one second residue block; and output the second picture.
7. The non-transitory machine readable medium of claim 6, wherein a ratio of a bitrate of the received bitstream to the first resolution is not more than two bits per second per pixel.
8. The non-transitory machine readable medium of claim 6, wherein the decoded video stream comprises a first video stream of a plurality of contemporaneous video streams attained from a corresponding plurality of cameras trained on a scene.
9. The non-transitory machine readable medium of claim 6, wherein each of the first and second residue blocks comprises a convolutional layer, a rectified linear unit layer, and a summing layer and each upsample block comprises a transposed convolutional layer and a rectified linear unit layer, and wherein the at least one second residue block is followed by a convolutional layer.
10. The non-transitory machine readable medium of claim 6, further comprising a plurality of instructions that, in response to being executed on the computing device, cause the computing device to: receive a first region indicator corresponding to a first region of the first video picture and a second region indicator corresponding to a second region of the first video picture, wherein the pretrained decompression upsampling portion is applied only to the first region in response to the first region indicator and the pretrained decompression upsampling portion is not applied to the second region in response to the second region indicator.
11. The non-transitory machine readable medium of claim 6, further comprising a plurality of instructions that, in response to being executed on the computing device, cause the computing device to: receive a first region indicator corresponding to a first region of the first video picture and a second region indicator corresponding to a second region of the first video picture, wherein the pretrained decompression upsampling portion is applied only to the first region in response to the first region indicator; and apply a second pretrained decompression upsampling portion of a second generative adversarial network to the second region to upsample and apply texture to the second region in response to the second region indicator.
12. An apparatus comprising: a memory to store at least a portion of a received bitstream; and one or more processors coupled to the memory, the one or more processors to: decode the received bitstream to generate a first video picture of a decoded video stream, wherein the first video picture comprises a first resolution; receive a first region indicator corresponding to a first region of the first video picture and a second region indicator corresponding to a second region of the first video picture; apply a pretrained decompression upsampling portion of a generative adversarial network to the first video picture to upsample and to improve the perceptual quality of the first video picture to generate a second video picture, wherein the second video picture comprises a second resolution greater than the first resolution, and wherein the pretrained decompression upsampling portion is applied only to the first region in response to the first region indicator and the pretrained decompression upsampling portion is not applied to the second region in response to the second region indicator; and output the second picture.
13. The apparatus of claim 12, the one or more processors to: apply a second pretrained decompression upsampling portion of a second generative adversarial network to the second region to upsample and apply texture to the second region in response to the second region indicator.
14. The apparatus of claim 13, wherein a first bitrate of the first region is greater than a second bitrate of the second region in response to the first region being a region of interest.
15. The apparatus of claim 12, wherein the first region indicator is a foreground region indicator and the second region indicator is a background region indicator.
16. The apparatus of claim 12, wherein the decoded video stream comprises a first video stream of a plurality of contemporaneous video streams attained from a corresponding plurality o
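The region-gated behavior in claims 4, 10, and 12, and the bitrate ratio in claim 7, can be illustrated with a small sketch. Everything named here is hypothetical: `gated_upsample`, `fake_generator`, the `"fg"`/`"bg"` labels, and the per-sample indicator grid are assumptions made for illustration. The claims do not say how a region is brought to the second resolution when the generator is not applied, so plain nearest-neighbor repetition is assumed for that case. For simplicity the sketch also runs each stand-in generator on the whole picture and then selects samples by region, which matches "applied only to the first region" only because the stand-ins work sample by sample.

```python
from typing import Callable, Dict, List

Picture = List[List[int]]
Labels = List[List[str]]   # one region label per sample (assumed representation)


def upsample_nn(grid, factor=2):
    """Nearest-neighbor repetition; works for sample grids and label grids alike."""
    out = []
    for row in grid:
        wide = [item for item in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def fake_generator(picture: Picture, factor: int = 2) -> Picture:
    """Hypothetical stand-in for a pretrained generator: upsample, then add 1
    to every sample so the affected region is visible in the output."""
    return [[s + 1 for s in row] for row in upsample_nn(picture, factor)]


def gated_upsample(picture: Picture, indicators: Labels,
                   generators: Dict[str, Callable[[Picture, int], Picture]],
                   factor: int = 2) -> Picture:
    """Use generators[label] where a region label has one; otherwise keep the
    plain nearest-neighbor result (assumed fallback)."""
    labels_up = upsample_nn(indicators, factor)
    plain = upsample_nn(picture, factor)
    candidates = {label: gen(picture, factor) for label, gen in generators.items()}
    out = []
    for y, row in enumerate(plain):
        out.append([
            candidates[labels_up[y][x]][y][x] if labels_up[y][x] in candidates else sample
            for x, sample in enumerate(row)
        ])
    return out


def bits_per_second_per_pixel(bitrate_bps: float, width: int, height: int) -> float:
    """Ratio of bitstream bitrate to the first (decoded) resolution, as in claim 7."""
    return bitrate_bps / (width * height)


picture = [[10, 20], [30, 40]]
indicators = [["fg", "bg"], ["fg", "bg"]]   # left column foreground, right background
second = gated_upsample(picture, indicators, {"fg": fake_generator})
ratio = bits_per_second_per_pixel(4_000_000, 1920, 1080)   # about 1.93
```

Passing a second entry, for example `{"fg": fake_generator, "bg": another_generator}`, would correspond to the variant in claims 5, 11, and 13, where a second generator is applied to the second region instead of leaving it untouched.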
the unit being bits, e.g. of the compressed video stream · CPC title
Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion (use of rate-distortion criteria H04N19/147) · CPC title
Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking · CPC title
using neural networks · CPC title
the unit being an image region, e.g. an object · CPC title