Visual quality optimized video compression

US10944996B2 · US · B2

Patent metadata
Publication number: US-10944996-B2
Application number: US-201916544557-A
Country: US
Kind code: B2
Filing date: Aug 19, 2019
Priority date: Aug 19, 2019
Publication date: Mar 9, 2021
Grant date: Mar 9, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques related to providing high perceptual quality video from highly compressed and decompressed reconstructed video are discussed. Such techniques include applying a pretrained decompression upsampling portion of a generative adversarial network to the decompressed reconstructed video to upsample and improve the perceptual quality of the decompressed reconstructed video to generate output video.
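The data flow in the abstract can be sketched as follows. This is a minimal illustration, not the patented network: `nearest_neighbor_upsample` is a hypothetical stand-in for the pretrained generator, and the picture type, function names, and sample values are assumptions made for the example. It only shows that a decoded picture goes in and a higher-resolution picture comes out.

```python
from typing import Callable, List

Picture = List[List[int]]  # one sample per entry, rows of equal length


def nearest_neighbor_upsample(picture: Picture, factor: int = 2) -> Picture:
    """Stand-in for the pretrained generator: repeat each sample factor x factor times."""
    out: Picture = []
    for row in picture:
        wide = [sample for sample in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def postprocess(decoded: Picture, upsampler: Callable[[Picture], Picture]) -> Picture:
    """Apply the upsampler to a decoded picture and check that resolution increased."""
    output = upsampler(decoded)
    in_pixels = len(decoded) * len(decoded[0])
    out_pixels = len(output) * len(output[0])
    if out_pixels <= in_pixels:
        raise ValueError("second resolution must exceed the first")
    return output


decoded_picture = [[10, 20], [30, 40]]  # hypothetical 2x2 decoder output
output_picture = postprocess(decoded_picture, nearest_neighbor_upsample)
```

In a real system the `upsampler` argument would be the trained generator portion of the generative adversarial network; the replication used here adds no perceptual detail.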

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: a memory to store at least a portion of a received bitstream; and one or more processors coupled to the memory, the one or more processors to: decode the received bitstream to generate a first video picture of a decoded video stream, wherein the first video picture comprises a first resolution; apply a pretrained decompression upsampling portion of a generative adversarial network to the first video picture to upsample and improve the perceptual quality of the first video picture to generate a second video picture, wherein the second video picture comprises a second resolution greater than the first resolution, and wherein the pretrained decompression upsampling portion comprises a plurality of first residue blocks followed by a plurality of upsample blocks followed by at least one second residue block; and output the second picture.

2. The apparatus of claim 1, wherein the decoded video stream comprises a first video stream of a plurality of contemporaneous video streams attained from a corresponding plurality of cameras trained on a scene.

3. The apparatus of claim 1, wherein each of the first and second residue blocks comprises a convolutional layer, a rectified linear unit layer, and a summing layer and each upsample block comprises a transposed convolutional layer and a rectified linear unit layer, and wherein the at least one second residue block is followed by a convolutional layer.

4. The apparatus of claim 1, the one or more processors further to: receive a first region indicator corresponding to a first region of the first video picture and a second region indicator corresponding to a second region of the first video picture, wherein the pretrained decompression upsampling portion is applied only to the first region in response to the first region indicator and the pretrained decompression upsampling portion is not applied to the second region in response to the second region indicator.

5. The apparatus of claim 1, the one or more processors further to: receive a first region indicator corresponding to a first region of the first video picture and a second region indicator corresponding to a second region of the first video picture, wherein the pretrained decompression upsampling portion is applied only to the first region in response to the first region indicator; and apply a second pretrained decompression upsampling portion of a second generative adversarial network to the second region to upsample and apply texture to the second region in response to the second region indicator.

6. At least one non-transitory machine readable medium comprising a plurality of instructions that, in response to being executed on a computing device, cause the computing device to: decode a received bitstream to generate a first video picture of a decoded video stream, wherein the first video picture comprises a first resolution; apply a pretrained decompression upsampling portion of a generative adversarial network to the first video picture to upsample and improve the perceptual quality of the first video picture to generate a second video picture, wherein the second video picture comprises a second resolution greater than the first resolution, and wherein the pretrained decompression upsampling portion comprises a plurality of first residue blocks followed by a plurality of upsample blocks followed by at least one second residue block; and output the second picture.

7. The non-transitory machine readable medium of claim 6, wherein a ratio of a bitrate of the received bitstream to the first resolution is not more than two bits per second per pixel.

8. The non-transitory machine readable medium of claim 6, wherein the decoded video stream comprises a first video stream of a plurality of contemporaneous video streams attained from a corresponding plurality of cameras trained on a scene.

9. The non-transitory machine readable medium of claim 6, wherein each of the first and second residue blocks comprises a convolutional layer, a rectified linear unit layer, and a summing layer and each upsample block comprises a transposed convolutional layer and a rectified linear unit layer, and wherein the at least one second residue block is followed by a convolutional layer.

10. The non-transitory machine readable medium of claim 6, further comprising a plurality of instructions that, in response to being executed on the computing device, cause the computing device to: receive a first region indicator corresponding to a first region of the first video picture and a second region indicator corresponding to a second region of the first video picture, wherein the pretrained decompression upsampling portion is applied only to the first region in response to the first region indicator and the pretrained decompression upsampling portion is not applied to the second region in response to the second region indicator.

11. The non-transitory machine readable medium of claim 6, further comprising a plurality of instructions that, in response to being executed on the computing device, cause the computing device to: receive a first region indicator corresponding to a first region of the first video picture and a second region indicator corresponding to a second region of the first video picture, wherein the pretrained decompression upsampling portion is applied only to the first region in response to the first region indicator; and apply a second pretrained decompression upsampling portion of a second generative adversarial network to the second region to upsample and apply texture to the second region in response to the second region indicator.

12. An apparatus comprising: a memory to store at least a portion of a received bitstream; and one or more processors coupled to the memory, the one or more processors to: decode the received bitstream to generate a first video picture of a decoded video stream, wherein the first video picture comprises a first resolution; receive a first region indicator corresponding to a first region of the first video picture and a second region indicator corresponding to a second region of the first video picture; apply a pretrained decompression upsampling portion of a generative adversarial network to the first video picture to upsample and to improve the perceptual quality of the first video picture to generate a second video picture, wherein the second video picture comprises a second resolution greater than the first resolution, and wherein the pretrained decompression upsampling portion is applied only to the first region in response to the first region indicator and the pretrained decompression upsampling portion is not applied to the second region in response to the second region indicator; and output the second picture.

13. The apparatus of claim 12, the one or more processors to: apply a second pretrained decompression upsampling portion of a second generative adversarial network to the second region to upsample and apply texture to the second region in response to the second region indicator.

14. The apparatus of claim 13, wherein a first bitrate of the first region is greater than a second bitrate of the second region in response to the first region being a region of interest.

15. The apparatus of claim 12, wherein the first region indicator is a foreground region indicator and the second region indicator is a background region indicator.

16. The apparatus of claim 12, wherein the decoded video stream comprises a first video stream of a plurality of contemporaneous video streams attained from a corresponding plurality o
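The block ordering recited in claims 1 and 3 can be written out as a simple layout. This is a sketch of the ordering only, with no weights or tensor operations. The function names, the layer labels, and the default block counts are assumptions; the claims require only "a plurality" of first residue blocks and upsample blocks and "at least one" second residue block, and they list the layers each block comprises without fixing kernel sizes or channel counts.

```python
from typing import List


def residue_block() -> List[str]:
    # Claim 3: convolutional layer, rectified linear unit layer, summing layer.
    return ["conv", "relu", "sum"]


def upsample_block() -> List[str]:
    # Claim 3: transposed convolutional layer and rectified linear unit layer.
    return ["transposed_conv", "relu"]


def generator_layout(n_first: int = 2, n_upsample: int = 2, n_second: int = 1) -> List[List[str]]:
    """Return the block sequence: first residue blocks, upsample blocks, second residue blocks, final conv."""
    if n_first < 2 or n_upsample < 2:
        raise ValueError("claim 1 recites a plurality of first residue and upsample blocks")
    if n_second < 1:
        raise ValueError("claim 1 recites at least one second residue block")
    blocks = [residue_block() for _ in range(n_first)]
    blocks += [upsample_block() for _ in range(n_upsample)]
    blocks += [residue_block() for _ in range(n_second)]
    blocks.append(["conv"])  # claim 3: the last residue block is followed by a convolutional layer
    return blocks


layout = generator_layout()  # default counts are illustrative, not taken from the patent
```

A framework implementation would map each label to a real layer (for example a transposed convolution with stride 2 for each upsample block), but those hyperparameters are not given in the claim text shown here.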

Assignees

Intel Corp

Inventors

Classifications

  • the unit being bits, e.g. of the compressed video stream · CPC title

  • H04N19/154 (Primary)

    Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion (use of rate-distortion criteria H04N19/147) · CPC title

  • Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking · CPC title

  • using neural networks · CPC title

  • the unit being an image region, e.g. an object · CPC title
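The image-region classification corresponds to the region indicators in claims 4, 10, and 12, where the pretrained upsampler is applied to a first region and not to a second. One way to picture this is a per-pixel composite, sketched below. Everything here is an assumption for illustration: the boolean mask encoding, the helper names, the use of sample replication for the second region (the claims shown do not say what replaces the generator there, apart from the second network in claims 5, 11, and 13), and the `generator_stub` values, which merely mark where generator output would appear.

```python
from typing import List

Picture = List[List[int]]
Mask = List[List[bool]]  # True marks the first region (e.g. foreground)


def replicate(grid: List[list], factor: int) -> List[list]:
    """Repeat every entry factor x factor times (works for pictures and masks)."""
    out: List[list] = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def composite_by_region(mask: Mask, generator_out: Picture, fallback_out: Picture) -> Picture:
    """Take generator samples where the mask is True and fallback samples elsewhere."""
    return [
        [g if m else f for m, g, f in zip(m_row, g_row, f_row)]
        for m_row, g_row, f_row in zip(mask, generator_out, fallback_out)
    ]


factor = 2
decoded = [[10, 20]]           # hypothetical 1x2 decoded picture
region_mask = [[True, False]]  # first sample is in the first region

fallback = replicate(decoded, factor)
# Placeholder for generator output: fallback + 1 so the two sources are distinguishable.
generator_stub = [[value + 1 for value in row] for row in fallback]
result = composite_by_region(replicate(region_mask, factor), generator_stub, fallback)
```

This sketch computes both candidates for the whole picture and then selects per pixel; an implementation that runs the network only on the first region's crop would match the "applied only to the first region" wording more directly.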

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10944996B2 cover?
Techniques related to providing high perceptual quality video from highly compressed and decompressed reconstructed video are discussed. Such techniques include applying a pretrained decompression upsampling portion of a generative adversarial network to the decompressed reconstructed video to upsample and improve the perceptual quality of the decompressed reconstructed video to generate output…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification H04N19/154. Mapped technology areas include Electricity.
When was this patent published?
Publication date Mar 9, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).