Codec rate distortion compensating downsampler

US12278969B2 · US · B2

Patent metadata
Publication number: US-12278969-B2
Application number: US-202318230409-A
Country: US
Kind code: B2
Filing date: Aug 4, 2023
Priority date: Oct 13, 2021
Publication date: Apr 15, 2025
Grant date: Apr 15, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system includes a machine learning (ML) model-based video downsampler configured to receive an input video sequence having a first display resolution, and to map the input video sequence to a lower resolution video sequence having a second display resolution lower than the first display resolution. The system also includes a neural network-based (NN-based) proxy video codec configured to transform the lower resolution video sequence into a decoded proxy bitstream. In addition, the system includes an upsampler configured to produce an output video sequence using the decoded proxy bitstream.
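The data flow in the abstract (downsampler, then proxy codec, then upsampler) can be sketched in a few lines. This is only an illustration under loud assumptions: block averaging stands in for the ML model-based downsampler, uniform quantization stands in for the NN-based proxy codec, and pixel repetition stands in for the upsampler. All function names and the `factor` and `step` parameters are hypothetical and do not come from the patent.

```python
from typing import List

Frame = List[List[float]]  # one grayscale frame as rows of pixel values


def downsample(frame: Frame, factor: int = 2) -> Frame:
    """Stand-in for the ML downsampler: average each factor x factor block."""
    h, w = len(frame) // factor, len(frame[0]) // factor
    return [
        [
            sum(
                frame[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ) / factor ** 2
            for x in range(w)
        ]
        for y in range(h)
    ]


def proxy_codec(frame: Frame, step: float = 8.0) -> Frame:
    """Stand-in for the proxy codec: encode/decode modeled as uniform quantization."""
    return [[round(p / step) * step for p in row] for row in frame]


def upsample(frame: Frame, factor: int = 2) -> Frame:
    """Stand-in for the upsampler: repeat each pixel factor times in both axes."""
    out: Frame = []
    for row in frame:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def run_pipeline(frame: Frame, factor: int = 2, step: float = 8.0) -> Frame:
    """Input frame -> lower resolution -> decoded proxy output -> output frame."""
    low_res = downsample(frame, factor)
    decoded = proxy_codec(low_res, step)
    return upsample(decoded, factor)
```

Calling `run_pipeline` on a 4x4 frame returns a 4x4 frame again, having passed through a 2x2 intermediate. In the patented system each stage would be a learned or standard component rather than these fixed stand-ins.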

First claim

Opening claim text (preview).

What is claimed is:

1. A video processing system comprising: an upsampler; a video codec; a trained machine learning (ML) model-based video downsampler trained using a neural network-based (NN-based) proxy video codec; and a processing hardware configured to: receive an input video sequence having a first display resolution; extract a content sample of the input video sequence; map, using the trained ML model-based video downsampler, the content sample to a lower resolution sample; transform, using one of the video codec or the NN-based proxy video codec, the lower resolution sample into a decoded sample bitstream; predict, using the upsampler and the decoded sample bitstream, an output sample corresponding to the content sample; and modify, based on the predicted output sample, one or more parameters of the trained ML model-based video downsampler; wherein the ML model-based video downsampler is trained using the input video sequence, the output sample, and an objective function based on an estimated rate of the lower resolution sample and a plurality of perceptual loss functions.

2. The video processing system of claim 1, wherein the NN-based proxy video codec is differentiable.

3. The video processing system of claim 1, wherein modifying the one or more parameters of the trained ML model-based video downsampler renders the trained ML model-based video downsampler content adaptive.

4. The video processing system of claim 1, wherein the trained ML model-based video downsampler is configured to support arbitrary scaling factors.

5. The system of claim 1, wherein the objective function comprises the estimated rate of the lower resolution sample in combination with a weighted sum of the plurality of perceptual loss functions.

6. The system of claim 5, wherein the ML model-based video downsampler is further configured to receive a plurality of weighting factors included in the weighted sum of the plurality of perceptual loss functions, and wherein the ML model-based video downsampler is trained further using the plurality of weighting factors.

7. The video processing system of claim 1, further comprising a simulation module including the upsampler.

8. The video processing system of claim 1, wherein the upsampler comprises an ML model-based upsampler.

9. The video processing system of claim 8, wherein the ML model-based upsampler and the ML model-based video downsampler are trained concurrently.

10. A method for use by a video processing system including an upsampler, a video codec, and a trained machine learning (ML) model-based video downsampler trained using a neural network-based (NN-based) proxy video codec, the method comprising: receiving an input video sequence having a first display resolution; extracting a content sample of the input video sequence; mapping, using the trained ML model-based video downsampler, the content sample to a lower resolution sample; transforming, using one of the video codec or the NN-based proxy video codec, the lower resolution sample into a decoded sample bitstream; predicting, using the upsampler and the decoded sample bitstream, an output sample corresponding to the content sample; and modifying, based on the predicted output sample, one or more parameters of the trained ML model-based video downsampler; wherein the ML model-based video downsampler is trained using the input video sequence, the output sample, and an objective function based on an estimated rate of the lower resolution sample and a plurality of perceptual loss functions.

11. The method of claim 10, wherein the NN-based proxy video codec is differentiable.

12. The method of claim 10, wherein modifying the one or more parameters of the trained ML model-based video downsampler renders the trained ML model-based video downsampler content adaptive.

13. The method of claim 10, wherein the trained ML model-based video downsampler is configured to support arbitrary scaling factors.

14. The method of claim 10, wherein the objective function comprises the estimated rate of the lower resolution sample in combination with a weighted sum of the plurality of perceptual loss functions.

15. The method of claim 14, wherein the ML model-based video downsampler is further configured to receive a plurality of weighting factors included in the weighted sum of the plurality of perceptual loss functions, and wherein the ML model-based video downsampler is trained further using the plurality of weighting factors.

16. The method of claim 10, wherein the upsampler is included in a simulation module.

17. The method of claim 10, wherein the upsampler comprises an ML model-based upsampler.

18. The method of claim 17, wherein the ML model-based upsampler and the ML model-based video downsampler are trained concurrently.
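Claims 5 and 14 describe the objective function as the estimated rate of the lower resolution sample combined with a weighted sum of several perceptual loss functions. A minimal sketch of that combination follows. It rests on assumptions that are not in the claims: empirical entropy is used as a crude rate estimate, and mean squared error and mean absolute error are simple placeholders for the perceptual losses, which the claims do not name. All function names here are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def estimated_rate(symbols: Sequence[int]) -> float:
    """Assumed rate estimate: empirical entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Placeholder loss term: mean squared error."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mae(a: Sequence[float], b: Sequence[float]) -> float:
    """Placeholder loss term: mean absolute error."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def objective(rate: float, losses: Sequence[float], weights: Sequence[float]) -> float:
    """Rate plus a weighted sum of loss terms, one weight per loss."""
    if len(losses) != len(weights):
        raise ValueError("need exactly one weight per loss term")
    return rate + sum(w * l for w, l in zip(weights, losses))


# Example: quantized low-resolution symbols, plus input and output samples.
low_res_symbols = [0, 0, 1, 1]
input_sample = [10.0, 20.0, 30.0, 40.0]
output_sample = [12.0, 18.0, 30.0, 44.0]

total = objective(
    estimated_rate(low_res_symbols),
    [mse(input_sample, output_sample), mae(input_sample, output_sample)],
    [0.5, 0.25],
)
```

The weights play the role of the weighting factors in claims 6 and 15: raising a weight makes the corresponding loss term count more heavily against the rate term. In an actual training loop this scalar would be minimized by gradient descent, which is why claims 2 and 11 recite a differentiable proxy codec.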

Assignees

Disney Entpr Inc, Eth Zurich Eidgenoessische Technische Hochschule Zurich

Inventors

Classifications

  • using neural networks · CPC title

  • using neural networks · CPC title

  • Learning methods · CPC title

  • the unit being bits, e.g. of the compressed video stream · CPC title

  • H04N19/132 · Primary

    Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12278969B2 cover?
A system includes a machine learning (ML) model-based video downsampler configured to receive an input video sequence having a first display resolution, and to map the input video sequence to a lower resolution video sequence having a second display resolution lower than the first display resolution. The system also includes a neural network-based (NN-based) proxy video codec configured to tran…
Who is the assignee on this patent?
Disney Entpr Inc, Eth Zurich Eidgenoessische Technische Hochschule Zurich
What technology area does this patent fall under?
Primary CPC classification H04N19/132. Mapped technology areas include Electricity.
When was this patent published?
Publication date Apr 15, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).