Rate control machine learning models with feedback control for video encoding
US-2023336739-A1 · Oct 19, 2023 · US
US12335486B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12335486-B2 |
| Application number | US-202318096428-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 12, 2023 |
| Priority date | Jan 12, 2023 |
| Publication date | Jun 17, 2025 |
| Grant date | Jun 17, 2025 |
Official abstract text for this publication.
A system includes a processing device to receive video content, metadata related to the video content, and a target bit rate for encoding the video content. The processing device further detects a content type of the video content based on the metadata. The system also includes encoding hardware to perform frame encoding on the video content, and a controller coupled between the processing device and the encoding hardware. The controller is programmed with machine instructions to generate first quantization parameter (QP) values on a per-frame basis using a frame machine learning model with a first plurality of weights. The first plurality of weights depends at least in part on the content type and the target bit rate. The controller further provides the first QP values to the encoding hardware for rate control of the frame encoding.
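To make the abstract's data flow concrete, here is a minimal sketch of choosing a weight set from the content type and target bit rate, then producing a per-frame QP. Everything in it is an assumption for illustration: the weight table, the 4 Mbit/s tier threshold, the `complexity` input, and the linear formula standing in for the frame machine learning model are hypothetical, and the 0 to 51 QP range assumes an H.264/HEVC-style codec. The patent does not disclose these values.

```python
from typing import Tuple

# Hypothetical weight table keyed by (content type, bit-rate tier).
# The numbers are invented for illustration only.
WEIGHTS = {
    ("gaming", "low"): (40.0, 6.0),
    ("gaming", "high"): (30.0, 6.0),
    ("video_conference", "low"): (36.0, 4.0),
    ("video_conference", "high"): (26.0, 4.0),
}

QP_MIN, QP_MAX = 0, 51  # assumed H.264/HEVC-style QP range


def bitrate_tier(target_bps: int) -> str:
    """Bucket the target bit rate (assumed threshold: 4 Mbit/s)."""
    return "low" if target_bps < 4_000_000 else "high"


def select_weights(content_type: str, target_bps: int) -> Tuple[float, float]:
    """Pick the weight set that matches the detected content type and bit rate."""
    return WEIGHTS[(content_type, bitrate_tier(target_bps))]


def frame_qp(weights: Tuple[float, float], complexity: float) -> int:
    """Toy stand-in for the frame model: QP = bias + gain * complexity, clamped."""
    bias, gain = weights
    qp = round(bias + gain * complexity)
    return max(QP_MIN, min(QP_MAX, qp))


weights = select_weights("gaming", 2_000_000)
qp = frame_qp(weights, complexity=0.5)  # value handed to the encoder
```

In a real system the weights would parameterize a trained model rather than a two-term linear rule; the sketch only shows that the same frame can receive a different QP depending on which weight set is selected.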
Opening claim text (preview).
What is claimed is:

1. A system comprising: a processing device to: receive video content, metadata related to the video content, and a target bit rate for encoding the video content; and detect a content type of the video content based on one or more tags within the metadata, wherein the one or more tags are indicative of the content type received from a particular video streaming source device; encoding hardware to perform frame encoding on the video content and to generate frame statistics based on one or more encoded frames of the video content corresponding to a current frame; and a controller coupled between the processing device and the encoding hardware, the controller programmed with machine instructions to: receive the frame statistics from the encoding hardware; generate a first quantization parameter (QP) value of the current frame using a frame machine learning model with a first plurality of weights, wherein the first plurality of weights depends at least in part on the content type, the target bit rate, and the frame statistics; and provide the first QP value directly to the encoding hardware for rate control of the frame encoding.
2. The system of claim 1, wherein the statistics include one or more of block-related metadata, frame-related metadata, bit budget information, or complexity motion information.
3. The system of claim 1, wherein corresponding to the current frame comprises one of being adjacent, neighboring frames, frames within a same block, or frames within a same sub-block of the video content.
4. The system of claim 1, wherein the processing device is further to: retrieve a plurality of parameters related to the detected content type; and provide the plurality of parameters to the controller; and wherein the machine instructions are further to select the first plurality of weights corresponding to the plurality of parameters.
5. The system of claim 1, wherein the machine learning model is a frame reinforcement learning model that is instantiated in a neural network, wherein the neural network uses the first plurality of weights to maximize a reward function of the neural network while encoding a plurality of frames of the video content.
6. The system of claim 1, wherein the encoding hardware is further to perform sub-frame encoding and generate sub-frame statistics, and the machine instructions are further to: generate a second QP value of a current sub-frame using a sub-frame machine learning model with a second plurality of weights, wherein the second plurality of weights depends at least in part on the content type, the target bit rate, and the sub-frame statistics; and provide the second QP value to the encoding hardware for rate control of the sub-frame encoding.
7. The system of claim 6, wherein the encoding hardware is further to encode each respective sub-frame of a plurality of sub-frames using a respective one of a plurality of second QP values.
8. An integrated circuit comprising: encoding hardware to perform frame encoding on video content and to generate frame statistics based on one or more encoded frames of the video content corresponding to a current frame; and a processing device coupled to encoding hardware, wherein the processing device is to implement, using program code, a frame machine learning rate controller that is to: receive the video content, metadata related to the video content, a target bit rate, and the frame statistics for encoding the video content; detect a content type of the video content based on one or more tags within the metadata, wherein the one or more tags are indicative of the content type received from a particular video streaming source device; generate a first quantization parameter (QP) value of the current frame using a frame machine learning model with a first plurality of weights, wherein the first plurality of weights depends at least in part on the content type, the target bit rate, and the frame statistics; and provide the first QP value directly to the encoding hardware for rate control of the frame encoding.
9. The integrated circuit of claim 8, wherein the frame statistics include one or more of block-related metadata, frame-related metadata, bit budget information, or complexity motion information.
10. The integrated circuit of claim 8, wherein corresponding to the current frame comprises one of being adjacent, neighboring frames, frames within a same block, or frames within a same sub-block of the video content.
11. The integrated circuit of claim 8, wherein the processing device is further to: retrieve a plurality of parameters related to the detected content type; and select the first plurality of weights corresponding to the plurality of parameters.
12. The integrated circuit of claim 8, wherein the machine learning model is a frame reinforcement learning model that is instantiated in a neural network, wherein the neural network uses the first plurality of weights to maximize a reward function of the neural network while encoding a plurality of frames of the video content.
13. The integrated circuit of claim 8, wherein the encoding hardware is further to perform sub-frame encoding and generate sub-frame statistics, further comprising a controller coupled to the processing device and to execute machine instructions to: generate a second QP value on of a current sub-frame using a sub-frame machine learning model with a second plurality of weights, wherein the second plurality of weights depends at least in part on the content type, the target bit rate, and the sub-frame statistics; and provide the second QP value to the encoding hardware for rate control of the sub-frame encoding.
14. The integrated circuit of claim 8, wherein the encoding hardware is further to encode each respective sub-frame of a plurality of sub-frames using a respective one of a plurality of second QP values.
15. A method comprising: receiving video content, metadata related to the video content, and a target bit rate for encoding the video content; receiving, by a processing device, from encoding hardware that performs frame encoding, frame statistics based on one or more encoded frames of the video content corresponding to a current frame; detecting a content type of the video content based on one or more tags within the metadata, wherein the one or more tags are indicative of the content type received from a particular video streaming source device; generating, by the processing device, a first quantization parameter (QP) value of the current frame using a frame machine learning model with a first plurality of weights, wherein the first plurality of weights depends at least in part on the content type, the target bit rate, and the frame statistics; and providing, by the processing device, the first QP value directly to encoding hardware for rate control of frame encoding by the encoding hardware.
16. The method of claim 15, wherein the frame statistics include one or more of block-related metadata, frame-related metadata, bit budget information, or complexity motion information.
17. The method of claim 15, further comprising: retrieving a plurality of parameters related to the detected content type; and selecting the first plurality of weights corresponding to the plurality of parameters.
18. The method of claim 15, wherein the machine learning model is a frame reinforcement learning model that is instantiated in a neural network, wherein the neural network uses the first plurality of weights to maximize a reward function of the neural ne
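
The feedback path recited in claims 1, 8, and 15 (the encoder emits frame statistics, the controller turns them into the next QP, and the QP goes straight back to the encoder) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the patented method: `toy_encode` is a hypothetical encoder whose output halves for every +6 QP, the 2,000,000-bit scale factor is arbitrary, and a simple proportional rule in `next_qp` stands in for the trained frame machine learning model.

```python
import math

QP_MIN, QP_MAX = 0, 51  # assumed H.264/HEVC-style QP range


def toy_encode(complexity: float, qp: int) -> dict:
    """Hypothetical encoder: bits halve for every +6 QP (assumed toy relation)."""
    bits = complexity * 2_000_000 * 2 ** (-qp / 6)
    return {"bits": bits, "qp": qp}  # the "frame statistics" fed back


def next_qp(stats: dict, frame_budget_bits: float, gain: float = 6.0) -> int:
    """Stand-in for the frame model: shift QP by the log-ratio of bits used to budget."""
    error = math.log2(stats["bits"] / frame_budget_bits)
    qp = round(stats["qp"] + gain * error)
    return max(QP_MIN, min(QP_MAX, qp))


def run(target_bps: int, fps: int, complexities: list, start_qp: int = 30) -> list:
    """Encode frames in sequence, feeding each frame's statistics into the next QP."""
    budget = target_bps / fps  # per-frame bit budget
    qp, history = start_qp, []
    for c in complexities:
        stats = toy_encode(c, qp)    # encoder produces frame statistics
        history.append(stats)
        qp = next_qp(stats, budget)  # controller returns the QP for the next frame
    return history


history = run(target_bps=3_000_000, fps=30, complexities=[1.0] * 5)
```

With constant complexity the loop settles on a QP whose frame size sits close to the per-frame budget; the residual gap comes from rounding QP to an integer.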
the region being a picture, frame or field · CPC title
Quantisation · CPC title
according to rate distortion criteria (rate-distortion as a criterion for motion estimation H04N19/567) · CPC title
Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion (use of rate-distortion criteria H04N19/147) · CPC title