Machine learning model-based video compression

US12120359B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12120359-B2
Application number: US-202217704692-A
Country: US
Kind code: B2
Filing date: Mar 25, 2022
Priority date: Apr 8, 2021
Publication date: Oct 15, 2024
Grant date: Oct 15, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system processing hardware executes a machine learning (ML) model-based video compression encoder to receive uncompressed video content and corresponding motion compensated video content, compare the uncompressed and motion compensated video content to identify an image space residual, transform the image space residual to a latent space representation of the uncompressed video content, and transform, using a trained image compression ML model, the motion compensated video content to a latent space representation of the motion compensated video content. The ML model-based video compression encoder further encodes the latent space representation of the image space residual to produce an encoded latent residual, encodes, using the trained image compression ML model, the latent space representation of the motion compensated video content to produce an encoded latent video content, and generates, using the encoded latent residual and the encoded latent video content, a compressed video content corresponding to the uncompressed video content.
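The data flow in the abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: `toy_analysis_transform` (block averaging) is a hypothetical stand-in for both the residual transform and the trained image compression ML model, `toy_encode` (uniform quantization) is a hypothetical stand-in for the encoding step, and the returned dictionary is an assumed container for the compressed video content. The patent text shown here does not specify any of these.

```python
import numpy as np

def toy_analysis_transform(frame, block=4):
    """Hypothetical stand-in for a learned transform to latent space:
    average each block x block patch of a 2-D frame."""
    h, w = frame.shape
    return frame.reshape(h // block, block, w // block, block).mean(axis=(1, 3))

def toy_encode(latent, step=0.5):
    """Hypothetical stand-in for the encoding step: uniform quantization."""
    return np.round(latent / step).astype(np.int32)

def compress_frame(uncompressed, motion_compensated, block=4, step=0.5):
    # Compare the two inputs to identify the image space residual.
    residual = uncompressed - motion_compensated
    # Transform the residual and the motion compensated frame to latent space.
    # In the patent these are distinct components; one toy function is reused here.
    latent_residual = toy_analysis_transform(residual, block)
    latent_mc = toy_analysis_transform(motion_compensated, block)
    # Encode both latents; the two branches are independent of each other.
    return {
        "encoded_latent_residual": toy_encode(latent_residual, step),
        "encoded_latent_video": toy_encode(latent_mc, step),
    }
```

Because neither branch depends on the other's output after the residual is computed, the two encodings could be run in parallel, which is what dependent claims 2 and 8 recite.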

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a computing platform including a processing hardware and a system memory storing a machine learning (ML) model-based video compression encoder and a trained image compression ML model; the processing hardware configured to execute the ML model-based video compression encoder to: receive an uncompressed video content and a motion compensated video content corresponding to the uncompressed video content; compare the uncompressed video content with the motion compensated video content to identify an image space residual corresponding to the uncompressed video content; transform the image space residual to a latent space representation of the image space residual; receive, using the trained image compression ML model, the motion compensated video content; transform, using the trained image compression ML model, the motion compensated video content to a latent space representation of the motion compensated video content; encode the latent space representation of the image space residual to produce an encoded latent residual; encode, using the trained image compression ML model, the latent space representation of the motion compensated video content to produce an encoded latent video content; and generate, using the encoded latent residual and the encoded latent video content, a compressed video content corresponding to the uncompressed video content.

2. The system of claim 1, wherein the encoded latent residual and the encoded latent video content are produced in parallel.

3. The system of claim 1, wherein the processing hardware is configured to execute the ML model-based video compression encoder to generate the compressed video content corresponding to the uncompressed video content based on a difference between the encoded latent residual and the encoded latent video content.

4. The system of claim 1, wherein the trained image compression ML model comprises a trained artificial neural network (NN).

5. The system of claim 4, wherein the trained NN is trained using an objective function including an adversarial loss.

6. The system of claim 4, wherein the trained NN comprises a generative adversarial network (GAN).

7. A method for use by a system including a computing platform having a processing hardware and a system memory storing a machine learning (ML) model-based video compression encoder and a trained image compression ML model, the method comprising: receiving, by the ML model-based video compression encoder executed by the processing hardware, an uncompressed video content and a motion compensated video content corresponding to the uncompressed video content; comparing, by the ML model-based video compression encoder executed by the processing hardware, the uncompressed video content with the motion compensated video content, thereby identifying an image space residual corresponding to the uncompressed video content; transforming, by the ML model-based video compression encoder executed by the processing hardware, the image space residual to a latent space representation of the image space residual; receiving, by the trained image compression ML model executed by the processing hardware, the motion compensated video content; transforming, by the trained image compression ML model executed by the processing hardware, the motion compensated video content to a latent space representation of the motion compensated video content; encoding, by the ML model-based video compression encoder executed by the processing hardware, the latent space representation of the image space residual to produce an encoded latent residual; encoding, by the trained image compression ML model executed by the processing hardware, the latent space representation of the motion compensated video content to produce an encoded video content; and generating, by the ML model-based video compression encoder executed by the processing hardware and using the encoded latent residual and the encoded latent video content, a compressed video content corresponding to the uncompressed video content.

8. The method of claim 7, wherein the encoded latent residual and the encoded latent video content are produced in parallel.

9. The method of claim 7, wherein the compressed video content corresponding to the uncompressed video content is generated based on a difference between the encoded latent residual and the encoded latent video content.

10. The method of claim 7, wherein the trained image compression ML model comprises a trained artificial neural network (NN).

11. The method of claim 10, wherein the trained NN is trained using an objective function including an adversarial loss.

12. The method of claim 7, wherein the trained NN comprises a generative adversarial network (GAN).

13. A system comprising: a computing platform including a processing hardware and a system memory storing a machine learning (ML) model-based video compression encoder and a trained image compression ML model; the processing hardware configured to execute the ML model-based video compression encoder to: receive, using the trained image compression ML model, an uncompressed video content and a motion compensated video content corresponding to the uncompressed video content; transform, using the trained image compression ML model, the uncompressed video content to a first latent space representation of the uncompressed video content; transform, using the trained image compression ML model, the motion compensated video content to a second latent space representation of the uncompressed video content; and generate a bitstream for transmitting a compressed video content corresponding to the uncompressed video content based on the first latent space representation and the second latent space representation.

14. The system of claim 13, wherein the processing hardware is further configured to execute the ML model-based video compression encoder to: determine, using the first latent space representation and the second latent space representation, a latent space residual.

15. The system of claim 14, wherein the processing hardware is further configured to execute the ML model-based video compression encoder to: generate the bitstream for transmitting the compressed video content corresponding to the uncompressed video content using the latent space residual.

16. The system of claim 14, wherein the latent space residual is based on a difference between the first latent space representation and the second latent space representation.

17. The system of claim 13, wherein the transformation of the uncompressed video content to the first latent space representation, and the transformation of the motion compensated video content to the second latent space representation, are performed in parallel.

18. The system of claim 13, wherein the trained image compression ML model comprises a trained artificial neural network (NN).

19. The system of claim 13, wherein the trained NN is trained using an objective function including an adversarial loss.

20. The system of claim 13, wherein the trained NN comprises a generative adversarial network (GAN).
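Claims 13 to 16 describe a variant in which the residual is taken in latent space rather than image space. The sketch below illustrates that ordering under loud assumptions: `toy_image_model` (block averaging) is a hypothetical stand-in for the trained image compression ML model, and the quantize-to-bytes step in `latent_residual_bitstream` is an invented placeholder for bitstream generation. Neither comes from the patent text shown here.

```python
import numpy as np

def toy_image_model(frame, block=4):
    """Hypothetical stand-in for the trained image compression ML model:
    average each block x block patch of a 2-D frame."""
    h, w = frame.shape
    return frame.reshape(h // block, block, w // block, block).mean(axis=(1, 3))

def latent_residual_bitstream(uncompressed, motion_compensated, step=0.5):
    # Claim 13: transform both inputs with the same model.
    first_latent = toy_image_model(uncompressed)
    second_latent = toy_image_model(motion_compensated)
    # Claims 14 and 16: latent space residual as a difference of the two latents.
    latent_residual = first_latent - second_latent
    # Claim 15: build the bitstream from the latent residual.
    # Placeholder coding (an assumption): quantize, clip to int8, serialize.
    symbols = np.clip(np.round(latent_residual / step), -128, 127).astype(np.int8)
    return symbols.tobytes()

# Usage: a frame and a prediction that is uniformly off by 1.0.
frame = np.arange(64, dtype=np.float64).reshape(8, 8)
bitstream = latent_residual_bitstream(frame, frame - 1.0)
```

With a nonlinear trained network in place of the linear toy transform, the latent space residual would generally differ from the latent representation of the image space residual used in claim 1, which is why the two variants are claimed separately.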

Assignees

Inventors

Classifications

  • Supervised learning · CPC title

  • Adversarial learning · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Generative networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12120359B2 cover?
A system processing hardware executes a machine learning (ML) model-based video compression encoder to receive uncompressed video content and corresponding motion compensated video content, compare the uncompressed and motion compensated video content to identify an image space residual, transform the image space residual to a latent space representation of the uncompressed video content, and t…
Who is the assignee on this patent?
Disney Entpr Inc, Eth Zuerich Eidgenoessische Technische Hochschule Zuerich
What technology area does this patent fall under?
Primary CPC classification H04N19/89. Mapped technology areas include Electricity.
When was this patent published?
Publication date Oct 15, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).