Conditional entropy coding for efficient video compression

US11375194B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11375194-B2
Application number  US-202017017020-A
Country             US
Kind code           B2
Filing date         Sep 10, 2020
Priority date       Nov 16, 2019
Publication date    Jun 28, 2022
Grant date          Jun 28, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and method for video compression using conditional entropy coding. An ordered sequence of image frames can be transformed to produce an entropy coding for each image frame. Each of the entropy codings provide a compressed form of image information based on a prior image frame and a current image frame (the current image frame occurring after the prior image frame). In this manner, the compression model can capture temporal relationships between image frames or encoded representations of the image frames using a conditional entropy encoder trained to approximate the joint entropy between frames in the image frame sequence.
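
As a rough illustration of why conditioning on the prior frame can save bits, the sketch below compares the ideal code length of a current-frame latent under a fixed distribution with its code length under a distribution centered on the prior-frame latent. The discretized Gaussian model, the latent values, and every function name here are assumptions made for illustration. In the disclosure the conditional probability parameters would come from a trained hyperprior decoder model, not from hand-picked numbers.

```python
import math

def gaussian_cdf(x, mean, scale):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))

def bits_for_symbol(symbol, mean, scale):
    """Ideal code length in bits of an integer symbol, using the Gaussian
    probability mass of the unit-width bin centered on that symbol."""
    p = gaussian_cdf(symbol + 0.5, mean, scale) - gaussian_cdf(symbol - 0.5, mean, scale)
    return -math.log2(max(p, 1e-12))  # clamp to avoid log(0)

def total_bits(latent, means, scales):
    """Sum of per-symbol ideal code lengths."""
    return sum(bits_for_symbol(y, m, s) for y, m, s in zip(latent, means, scales))

# Hypothetical quantized latents of two consecutive frames.
prior_latent = [12, -3, 7, 0, 5]
current_latent = [12, -2, 7, 0, 6]

# Unconditional model: one fixed, wide distribution for every symbol.
uncond_bits = total_bits(current_latent, [0.0] * 5, [8.0] * 5)

# Conditional model: mean taken from the prior latent, narrow scale.
# (Stand-in for parameters predicted by a hyperprior decoder.)
cond_bits = total_bits(current_latent, [float(v) for v in prior_latent], [1.0] * 5)

print(f"unconditional: {uncond_bits:.2f} bits, conditional: {cond_bits:.2f} bits")
```

With these toy values the conditional model assigns higher probability to the symbols that actually occur, so an entropy coder driven by it would need fewer bits. The size of the gain depends entirely on how well the parameters predict the current latent.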

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for encoding a video that comprises at least two image frames having a sequential order, the method comprising: encoding, using an encoder model, a prior image frame of the at least two image frames to generate a first latent representation; encoding, using the encoder model, a current image frame that occurs after the prior image frame based on the sequential order to generate a second latent representation; determining, and using a hyperprior encoder model, a hyperprior code based on the first latent representation and the second latent representation, wherein the hyperprior code is indicative of differences between the current image frame and the prior image frame, the prior image frame occurring before the current image frame in the sequential order; one or more of the encoder model and the hyperprior encoder model having been trained with a loss function comprising a term associated with a probability of determining the second latent representation, given the first latent representation and the hyperprior code; determining, using a hyperprior decoder model, one or more conditional probability parameters based on the first latent representation and the hyperprior code; generating, using an entropy coder, an entropy coding of the current image frame based on the one or more conditional probability parameters and the second latent representation; and storing the entropy coding and the hyperprior code.

2. The computer-implemented method of claim 1, further comprising: encoding, using the encoder model, a third image frame of the at least two image frames that occurs after the current image frame to generate a third latent representation.

3. The computer-implemented method of claim 1, wherein the current image frame occurs immediately after the prior image frame.

4. The computer-implemented method of claim 1, further comprising: performing internal learning to optimize the second latent representation, the hyperprior code, or both the second latent representation and the hyperprior code.

5. The computer-implemented method of claim 4, wherein performing internal learning comprises: setting as learnable parameters one or more of the second latent representation, the hyperprior code, or both the second latent representation and the hyperprior code; modifying the learnable parameters to reduce the loss function, the loss function evaluating one or both of: a difference between the current image frame and a decoded image frame generated from the entropy coding of the current image frame; and the probability of determining the second latent representation, given the first latent representation and the hyperprior code.

6. The computer-implemented method of claim 5, wherein modifying the learnable parameters to reduce the loss function comprises: backpropagating gradients for the learnable parameters over a number of iterations; and updating values for one or more of the learnable parameters at one or more iterations of the number of iterations; wherein during said modifying, all hyperprior decoder model and decoder model parameters are fixed.

7. The computer-implemented method of claim 1, wherein the hyperprior encoder model comprises a trained neural network.

8. The computer-implemented method of claim 1, wherein: determining, using the hyperprior encoder model, the hyperprior code is based only on image information included in the first latent representation and the second latent representation.

9. A computer-implemented method for decoding a video that comprises two or more image frames having a sequential order, the method comprising: for the two or more image frames, respectively; obtaining a hyperprior code for a current image frame and a decoded version of a latent representation of a previous sequential image frame, wherein the hyperprior code is indicative of differences between the current image frame and the previous sequential image frame, the previous sequential image frame occurring before the current image frame in the sequential order; determining, using a hyperprior decoder model, one or more conditional probability parameters for the current image frame based at least in part on the hyperprior code for the current image frame and the decoded version of the latent representation of the previous sequential image frame, wherein the hyperprior decoder model has been trained with a loss function comprising a term associated with a probability of determining a latent representation of the current image frame, given the latent representation of the previous sequential image frame and the hyperprior code; decoding, using the one or more conditional probability parameters for the current frame, an entropy code for the current image frame to obtain a decoded version of the latent representation of the current image frame; and providing the decoded version of the latent representation of the current image frame for use in decoding a next entropy code for a next sequential image frame.

10. The computer-implemented method of claim 9, further comprising: decoding, using a decoder model, the decoded version of a latent representation of the current image frame to obtain a reconstructed version of the current image frame.

11. One or more non-transitory computer-readable media that store: a video compression model, the video compression model comprising: a hyperprior encoder model, the hyperprior encoder model having been trained with a loss function comprising a term associated with a probability of determining a second latent representation, given a first latent representation and a hyperprior code; and a hyperprior decoder model; and instructions for performing encoding comprising: obtaining a video comprising an ordered sequence of image frames; determining a latent representation for at least two sequential image frames in the ordered sequence, wherein the latent representation for the at least two sequential image frames includes the first latent representation associated with a prior image frame and the second latent representation associated with a current image frame; generating the hyperprior code for the at least two sequential image frames by providing the first latent representation and the second latent representation to the hyperprior encoder model, wherein the hyperprior code is indicative of differences between the current image frame and the prior image frame; generating one or more conditional probability parameters for the at least two sequential image frames by providing the hyperprior code associated with the current image frame and the first latent representation to the hyperprior decoder model; and determining an entropy coding for the at least two sequential image frames by providing the conditional probability parameters for the current image frame and the first latent representation associated with the prior image frame to an entropy coder.

12. The one or more non-transitory computer-readable media of claim 11, wherein the one or more non-transitory computer-readable media further store: an encoder model and a decoder model, and wherein determining the latent representation for the at least two sequential image frames in the ordered sequence comprises: encoding, using the encoder model, the at least two sequential image frames in the ordered sequence.

13. The one or more non-transitory computer-readable media of claim 11, wherein the one or more non-transitory computer-readable media further store: instructions for performing decoding comprising: obtaining the hyperprior code for the current image frame and a decoded versio
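
The internal learning described in claims 4 to 6 can be sketched as a small optimization loop: the current-frame latent is treated as the learnable parameter, while the decoder and the conditional probability parameters stay fixed. Everything concrete below is an assumption made for illustration: the decoder is a single fixed multiplier, the rate term is a Gaussian negative log-likelihood up to a constant, the gradients are derived by hand instead of by backpropagation through a neural network, and quantization of the refined latent is omitted.

```python
def loss_and_grad(latent, frame, means, scale, weight, lam):
    """Rate-distortion loss and its gradient with respect to the latent.

    distortion: squared error between the frame and the toy decoder output
    rate:       Gaussian negative log-likelihood of the latent (up to a
                constant), given the fixed conditional mean and scale
    """
    loss, grad = 0.0, []
    for y, x, m in zip(latent, frame, means):
        err = weight * y - x                      # toy decoder: x_hat = weight * y
        rate = (y - m) ** 2 / (2.0 * scale ** 2)
        loss += err ** 2 + lam * rate
        grad.append(2.0 * weight * err + lam * (y - m) / scale ** 2)
    return loss, grad

def internal_learning(latent, frame, means, scale=1.0, weight=4.0,
                      lam=0.1, lr=0.01, steps=200):
    """Refine the latent by gradient descent. The decoder weight and the
    conditional parameters (means, scale) are never updated."""
    y = list(latent)
    history = []
    for _ in range(steps):
        loss, grad = loss_and_grad(y, frame, means, scale, weight, lam)
        history.append(loss)
        y = [yi - lr * gi for yi, gi in zip(y, grad)]
    return y, history

# Hypothetical values: a three-sample "frame", an initial latent from a
# toy encoder, and conditional means taken from the prior-frame latent.
frame = [40.0, -9.0, 30.0]
initial_latent = [10.0, -2.0, 8.0]
conditional_means = [10.0, -3.0, 7.0]

refined, history = internal_learning(initial_latent, frame, conditional_means)
print(f"loss before: {history[0]:.3f}, loss after: {history[-1]:.3f}")
```

The learning rate is chosen small enough for this quadratic toy loss to decrease at every step. A real system would need to pick the step size and iteration count for its own models, and would re-quantize the refined latent before entropy coding.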

Assignees

Uatc Llc

Inventors

Classifications

  • Combinations of networks

  • Convolutional networks [CNN, ConvNet]

  • Auto-encoder networks; Encoder-decoder networks

  • Supervised learning

  • Incoming video signal characteristics or properties

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11375194B2 cover?
Systems and method for video compression using conditional entropy coding. An ordered sequence of image frames can be transformed to produce an entropy coding for each image frame. Each of the entropy codings provide a compressed form of image information based on a prior image frame and a current image frame (the current image frame occurring after the prior image frame). In this manner, the c…
Who is the assignee on this patent?
Uatc Llc
What technology area does this patent fall under?
Primary CPC classification H04N19/91. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jun 28, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).