Multivariate rate control for transcoding video content

US11924449B2 · US · B2

Patent metadata
Publication number: US-11924449-B2
Application number: US-202017908352-A
Country: US
Kind code: B2
Filing date: May 19, 2020
Priority date: May 19, 2020
Publication date: Mar 5, 2024
Grant date: Mar 5, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A learning model is trained for rate-distortion behavior prediction against a corpus of a video hosting platform and used to determine optimal bitrate allocations for video data given video content complexity across the corpus of the video hosting platform. Complexity features of the video data are processed using the learning model to determine a rate-distortion cluster prediction for the video data, and transcoding parameters for transcoding the video data are selected based on that prediction. The rate-distortion clusters are modeled during the training of the learning model, such as based on rate-distortion curves of video data of the corpus of the video hosting platform and based on classifications of such video data. This approach minimizes total corpus egress and/or storage while further maintaining uniformity in the delivered quality of videos by the video hosting platform.
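The flow described in the abstract (complexity features in, cluster prediction, then parameter selection) can be illustrated with a minimal Python sketch. It is not the patented implementation: the nearest-center lookup is a hypothetical stand-in for the trained learning model, and the names `RDCluster`, `predict_cluster`, `select_operating_point`, the two-element feature vectors, the 0-100 quality scale, and all numeric values are assumptions made for illustration.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class OperatingPoint:
    bitrate_kbps: int
    quality: float  # assumed 0-100 quality score; the metric is not specified here


@dataclass(frozen=True)
class RDCluster:
    name: str
    feature_center: tuple[float, ...]  # hypothetical stand-in for the learned model
    centroid_curve: tuple[OperatingPoint, ...]


def predict_cluster(features, clusters):
    """Pick the cluster whose feature center is closest to the chunk's features."""
    return min(clusters, key=lambda c: dist(features, c.feature_center))


def select_operating_point(cluster, target_quality):
    """Choose the cheapest point on the centroid curve that meets the target.

    If no point reaches the target, fall back to the highest-quality point.
    """
    meeting = [p for p in cluster.centroid_curve if p.quality >= target_quality]
    if meeting:
        return min(meeting, key=lambda p: p.bitrate_kbps)
    return max(cluster.centroid_curve, key=lambda p: p.quality)


# Illustrative clusters with made-up centroid curves.
CLUSTERS = [
    RDCluster(
        "low_complexity",
        (0.2, 0.1),
        (OperatingPoint(500, 80.0), OperatingPoint(1000, 90.0),
         OperatingPoint(2000, 95.0)),
    ),
    RDCluster(
        "high_complexity",
        (0.8, 0.9),
        (OperatingPoint(500, 60.0), OperatingPoint(1000, 75.0),
         OperatingPoint(2000, 88.0), OperatingPoint(4000, 94.0)),
    ),
]

chunk_features = (0.75, 0.8)  # hypothetical complexity features of one chunk
cluster = predict_cluster(chunk_features, CLUSTERS)
point = select_operating_point(cluster, target_quality=85.0)
print(cluster.name, point.bitrate_kbps, point.quality)
```

In this sketch a complex chunk is assigned a higher bitrate than a simple one for the same quality target, which is the uniform-quality behavior the abstract describes.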

First claim

Opening claim text (preview).

What is claimed is:

1. A method for transcoding an input video stream, the method comprising: receiving the input video stream at a server of a video hosting platform, the input video stream including a current video chunk; identifying one or more complexity features of the current video chunk; determining, using a learning model trained based on a corpus of the video hosting platform, a correspondence of the current video chunk to a rate-distortion cluster based on the one or more complexity features; selecting transcoding parameters for the current video chunk based on operating points of a centroid curve of the rate-distortion cluster; and transcoding the current video chunk according to the transcoding parameters, wherein the rate-distortion cluster is one of a plurality of rate-distortion clusters identifiable using the learning model, wherein each rate-distortion cluster of the plurality of rate-distortion clusters corresponds to a different rate-distortion classification of videos of the corpus of the video hosting platform, and wherein determining the correspondence of the current video chunk to the rate-distortion cluster based on the one or more complexity features comprises: predicting that rate-distortion behavior of the current video chunk is similar to rate-distortion behavior of videos used to produce the rate-distortion cluster based on a rate-distortion classification of the current video chunk and based on a rate-distortion classification of video content to which the rate-distortion cluster corresponds.

2. The method of claim 1, wherein determining the correspondence of the current video chunk to the rate-distortion cluster based on the one or more complexity features comprises: identifying the rate-distortion classification of the current video chunk.

3. The method of claim 1, further comprising: training the learning model to predict rate-distortion behavior of video data using videos within the corpus of the video hosting platform.

4. The method of claim 3, wherein training the learning model to predict the rate-distortion behavior of the video data using the videos within the corpus of the video hosting platform comprises: receiving a training data set including training video data from at least some of the videos within the corpus of the video hosting platform; determining rate-distortion curves for the training video data; and producing rate-distortion clusters by clustering the rate-distortion curves based on similarities of complexity features of the training video data.

5. The method of claim 4, further comprising: determining a centroid curve for each of the rate-distortion clusters, wherein the centroid curve determined for each of the rate-distortion clusters includes a number of operating points, wherein each operating point represents a bitrate available for transcoding and a quality resulting from using the bitrate.

6. The method of claim 5, wherein selecting the transcoding parameters for the current video chunk based on the operating points of the centroid curve of the rate-distortion cluster comprises: identifying, as an optimal operating point, one of the number of operating points of the centroid curve; and selecting, as the transcoding parameters, parameters corresponding to the optimal operating point.

7. The method of claim 1, wherein identifying the one or more complexity features of the current video chunk comprises: extracting the one or more complexity features of the current video chunk from a pass log of an encoder used for encoding the input video stream.

8. The method of claim 7, wherein the pass log is received after a first pass encoding by the encoder, the method further comprising: verifying the selection of the transcoding parameters before a second pass encoding by the encoder.

9. The method of claim 8, wherein verifying the selection of the transcoding parameters before the second pass encoding by the encoder comprises: determining whether the transcoding of the current video chunk using the transcoding parameters is in accordance with one or more transcoder constraints; and responsive to a determination that the transcoding of the current video chunk using the transcoding parameters is not in accordance with the one or more transcoder constraints, causing a selection of different transcoding parameters for transcoding the current video chunk.

10. An apparatus for transcoding an input video stream, the apparatus comprising: a server of a video hosting platform, the server including a memory and a processor, wherein the processor is configured to execute instructions stored in the memory to: determine one or more complexity features of video data of the input video stream; determine, using a learning model trained based on a corpus of the video hosting platform, a correspondence of the video data to a rate-distortion cluster based on the one or more complexity features of the video data; and transcode the video data according to transcoding parameters selected based on operating points of a centroid curve of the rate-distortion cluster, wherein the rate-distortion cluster is one of a plurality of rate-distortion clusters identifiable using the learning model, wherein each rate-distortion cluster of the plurality of rate-distortion clusters corresponds to a different rate-distortion classification of videos of the corpus of the video hosting platform, and wherein the instructions to determine the correspondence of the video data to the rate-distortion cluster based on the one or more complexity features of the video data include instructions to: predict that rate-distortion behavior of the video data is similar to rate-distortion behavior of videos used to produce the rate-distortion cluster based on a rate-distortion classification of the video data and based on a rate-distortion classification of video content to which the rate-distortion cluster corresponds.

11. The apparatus of claim 10, wherein the instructions include instructions to: train the learning model to predict rate-distortion behavior using a training data set including at least some videos within the corpus of the video hosting platform.

12. The apparatus of claim 11, wherein the instructions to train the learning model to predict the rate-distortion behavior using the training data set including the at least some videos within the corpus of the video hosting platform include instructions to: determine rate-distortion curves for video data of the training data set; produce rate-distortion clusters by clustering the rate-distortion curves based on similarities of complexity features of video data of the training data set; and determine a centroid curve for each of the rate-distortion clusters, wherein the centroid curve determined for each of the rate-distortion clusters includes a number of operating points, wherein each operating point represents a bitrate available for transcoding and a quality resulting from using the bitrate.

13. The apparatus of claim 12, wherein a number of the rate-distortion clusters produced is empirically determined based on variations in rate-distortion characteristics across the corpus of the video hosting platform.

14. The apparatus of claim 10, wherein the instructions to determine the one or more complexity features of the video data of the input video stream include instructions to: derive the one or more complexity features from an encoder pass log.

15. The apparatus of claim 10, wherein the instructions to determine the one or more complexity features of the video data of the input video stream include ins
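The training and verification steps recited in claims 4, 5, and 9 can be sketched roughly as follows. This is an illustrative reading, not the patented method: plain k-means over complexity feature vectors is an assumed clustering technique, averaging member curves per bitrate is an assumed way to form a centroid curve, a maximum bitrate is an assumed example of a transcoder constraint, and the names `kmeans`, `centroid_curves`, `select_verified`, and `BITRATES_KBPS`, along with all data values, are hypothetical.

```python
from math import dist
from statistics import fmean

BITRATES_KBPS = (500, 1000, 2000)  # hypothetical shared bitrate ladder


def kmeans(points, k, iterations=10):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centers = [points[i] for i in range(k)]  # deterministic init for the sketch
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(fmean(axis) for axis in zip(*members))
    return labels


def centroid_curves(curves, labels, k):
    """Average each cluster's quality values per bitrate into a centroid curve."""
    result = {}
    for j in range(k):
        members = [c for c, lab in zip(curves, labels) if lab == j]
        if members:
            means = (fmean(q) for q in zip(*members))
            result[j] = tuple(zip(BITRATES_KBPS, means))
    return result


def select_verified(curve, target_quality, max_bitrate_kbps):
    """Pick the cheapest point meeting the target, then check the constraint.

    If the choice exceeds the (assumed) bitrate cap, select a different
    operating point: the highest bitrate still allowed.
    """
    ordered = sorted(curve)  # ascending bitrate
    choice = next((p for p in ordered if p[1] >= target_quality), ordered[-1])
    if choice[0] > max_bitrate_kbps:
        allowed = [p for p in ordered if p[0] <= max_bitrate_kbps]
        if not allowed:
            raise ValueError("no operating point satisfies the constraint")
        choice = allowed[-1]
    return choice


# Made-up training data: complexity features and quality at each ladder bitrate.
features = [(0.1, 0.2), (0.9, 0.8), (0.15, 0.25), (0.85, 0.9)]
rd_curves = [(80, 90, 96), (60, 74, 86), (82, 92, 96), (62, 76, 90)]

labels = kmeans(features, k=2)
curves_by_cluster = centroid_curves(rd_curves, labels, k=2)
print(curves_by_cluster)
print(select_verified(curves_by_cluster[1], target_quality=85.0, max_bitrate_kbps=1500))
```

The number of clusters is fixed at two here only to keep the example small; claim 13 states that the count is determined empirically from the corpus.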

Assignees

Inventors

Classifications

  • H04N19/40 · Primary

    using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream · CPC title

  • Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks · CPC title

  • according to rate distortion criteria (rate-distortion as a criterion for motion estimation H04N19/567) · CPC title

  • the unit being bits, e.g. of the compressed video stream · CPC title

  • the adaptation method, adaptation tool or adaptation type being iterative or recursive · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

What does patent US11924449B2 cover?
A learning model is trained for rate-distortion behavior prediction against a corpus of a video hosting platform and used to determine optimal bitrate allocations for video data given video content complexity across the corpus of the video hosting platform. Complexity features of the video data are processed using the learning model to determine a rate-distortion cluster prediction for the vide…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification H04N19/40. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Mar 5, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).