Optimizing video coding using a deep learning model

US12526430B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12526430-B2
Application number  US-202318235769-A
Country             US
Kind code           B2
Filing date         Aug 18, 2023
Priority date       Aug 18, 2023
Publication date    Jan 13, 2026
Grant date          Jan 13, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method for optimizing encoding video frames from a video is provided. In an embodiment, the method comprises receiving a video frame to be encoded. The method further comprises using one or more machine learning models to generate an encoding parameter value for encoding the video frame. The method further comprises comparing a first set of delta encoding values, based on the encoding parameter value, representing differences between groups of pixels of the video frame to a second set of delta encoding values, based on an alternative encoding parameter value, representing differences between the groups of pixels of the video frame, and in response to determining that the first set of delta encoding values is less than the second set of delta encoding values, selecting the encoding parameter value. The method further comprises based on the encoding parameter value, encoding the video frame to generate an encoded video frame.
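
The comparison step in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the encoding parameter is modelled as a prediction direction ("horizontal" or "vertical"), a delta encoding value is the sum of absolute prediction residuals over one 2x2 group of pixels, two sets are compared by their totals, and the alternative value is used as the fallback when the first set is not lower. The `predicted` argument stands in for the output of the machine learning model.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities


def delta_values(frame: Frame, mode: str, block: int = 2) -> List[int]:
    """Return one delta value per block x block group of pixels.

    Assumption: a delta value is the sum of |pixel - predictor| in the group,
    where the predictor is the left neighbour ("horizontal") or the pixel
    above ("vertical"); pixels with no such neighbour are predicted as 0.
    """
    if mode not in ("horizontal", "vertical"):
        raise ValueError(f"unknown mode: {mode}")
    height, width = len(frame), len(frame[0])
    deltas = []
    for by in range(0, height, block):
        for bx in range(0, width, block):
            total = 0
            for y in range(by, min(by + block, height)):
                for x in range(bx, min(bx + block, width)):
                    if mode == "horizontal":
                        pred = frame[y][x - 1] if x > 0 else 0
                    else:
                        pred = frame[y - 1][x] if y > 0 else 0
                    total += abs(frame[y][x] - pred)
            deltas.append(total)
    return deltas


def select_parameter(frame: Frame, predicted: str, alternative: str) -> str:
    """Keep the predicted value only if its delta set totals less.

    Assumption: sets are compared by their sums, and the alternative is the
    fallback; the abstract only states what happens when the first is lower.
    """
    first = delta_values(frame, predicted)
    second = delta_values(frame, alternative)
    return predicted if sum(first) < sum(second) else alternative


# Vertical stripes: every row is identical, so predicting from above is cheap.
frame = [[10, 200, 10, 200] for _ in range(4)]
chosen = select_parameter(frame, predicted="vertical", alternative="horizontal")
```

In this toy frame the "vertical" value is kept because its residuals vanish after the first row. A real encoder would work with actual codec parameters and cost measures rather than this simplified residual sum.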

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for optimizing encoding of video frames from a video, the method comprising: receiving a video frame to be encoded; using one or more machine learning models: generating an encoding parameter value for encoding the video frame; comparing a first set of delta encoding values, based on the encoding parameter value, representing differences between groups of pixels of the video frame to a second set of delta encoding values, based on an alternative encoding parameter value, representing differences between the groups of pixels of the video frame; in response to determining that the first set of delta encoding values is less than the second set of delta encoding values, selecting the encoding parameter value; wherein the one or more machine learning models are trained using a reference encoder; and based on the encoding parameter value, encoding the video frame to generate an encoded video frame.

2. The computer-implemented method of claim 1, wherein comparing the first set of delta encoding values to the second set of delta encoding values, comprises: generating the first set of delta encoding values based on the encoding parameter value; selecting the alternative encoding parameter value for encoding the video frame; generating the second set of delta encoding values based on the alternative encoding parameter value; and comparing the first set of delta encoding values to the second set of delta encoding values.

3. The computer-implemented method of claim 1, wherein generating the encoding parameter value for encoding the video frame, comprises: using the one or more machine learning models to generate a set of candidate encoding parameter values; wherein each candidate encoding parameter value of the set of candidate encoding parameter values represents one type of encoding method, wherein types of encoding methods comprise intra-block encoding, inter-block encoding, intra-block-copy encoding, and palette encoding; for each candidate encoding parameter value from the set of candidate encoding value, generating sets of delta encoding values; determining, from the sets of delta encoding values, a particular set of delta encoding values with lowest delta encoding values; and selecting the encoding parameter value that corresponds to the particular set of delta encoding values.

4. The computer-implemented method of claim 1, wherein the one or more machine learning models are convolutional neural network models.

5. The computer-implemented method of claim 1, wherein the one or more machine learning models are transformer neural network models.

6. The computer-implemented method of claim 1, wherein the encoding parameter value is a parameter value for at least one of a block prediction and block transformation.

7. The computer-implemented method of claim 1, wherein the one or more machine learning models are further trained training using bit-string lengths.

8. A non-transitory, computer-readable medium storing a set of instructions for optimizing encoding of video frames, that, when executed by a processor, cause: receiving a video frame to be encoded; using one or more machine learning models: generating an encoding parameter value for encoding the video frame; comparing a first set of delta encoding values, based on the encoding parameter value, representing differences between groups of pixels of the video frame to a second set of delta encoding values, based on an alternative encoding parameter value, representing differences between the groups of pixels of the video frame; in response to determining that the first set of delta encoding values is less than the second set of delta encoding values, selecting the encoding parameter value; wherein the one or more machine learning models are trained using a reference encoder; and based on the encoding parameter value, encoding the video frame to generate an encoded video frame.

9. The non-transitory, computer-readable medium of claim 8, wherein comparing the first set of delta encoding values to the second set of delta encoding values, comprises: generating the first set of delta encoding values based on the encoding parameter value; selecting the alternative encoding parameter value for encoding the video frame; generating the second set of delta encoding values based on the alternative encoding parameter value; and comparing the first set of delta encoding values to the second set of delta encoding values.

10. The non-transitory, computer-readable medium of claim 8, wherein generating the encoding parameter value for encoding the video frame, comprises: using the one or more machine learning models to generate a set of candidate encoding parameter values; wherein each candidate encoding parameter value of the set of candidate encoding parameter values represents one type of encoding method, wherein types of encoding methods comprise intra-block encoding, inter-block encoding, intra-block-copy encoding, and palette encoding; for each candidate encoding parameter value from the set of candidate encoding value, generating sets of delta encoding values; determining, from the sets of delta encoding values, a particular set of delta encoding values with lowest delta encoding values; and selecting the encoding parameter value that corresponds to the particular set of delta encoding values.

11. The non-transitory, computer-readable medium of claim 8, wherein the one or more machine learning models are convolutional neural network models.

12. The non-transitory, computer-readable medium of claim 8, wherein the one or more machine learning models are transformer neural network models.

13. The non-transitory, computer-readable medium of claim 8, wherein the encoding parameter value is a parameter value for at least one of a block prediction and block transformation.

14. The non-transitory, computer-readable medium of claim 8, wherein the one or more machine learning models are further trained using bit-string lengths.

15. A network-based system for optimizing encoding of video frames, the system comprising: a processor; a memory operatively connected to the processor and storing instructions that, when executed by the processor, cause: receiving a video frame to be encoded; using one or more machine learning models: generating an encoding parameter value for encoding the video frame; comparing a first set of delta encoding values, based on the encoding parameter value, representing differences between groups of pixels of the video frame to a second set of delta encoding values, based on an alternative encoding parameter value, representing differences between the groups of pixels of the video frame; in response to determining that the first set of delta encoding values is less than the second set of delta encoding values, selecting the encoding parameter value; wherein the one or more machine learning models are trained using a reference encoder; and based on the encoding parameter value, encoding the video frame to generate an encoded video frame.

16. The system of claim 15, wherein comparing the first set of delta encoding values to the second set of delta encoding values, comprises: generating the first set of delta encoding values based on the encoding parameter value; selecting the alternative encoding parameter value for encoding the video frame; generating the second set of delta encoding values based on the alternative encoding parameter value; and comparing the first set of delta encoding v
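
The candidate selection described in claim 3 (and mirrored in claim 10) can be sketched as follows. This is a hedged illustration, not the patented implementation: the `EncodingMethod` enum, the `select_lowest` helper, and the `toy_deltas` table are hypothetical names, the numbers are made-up inputs rather than measurements, and ranking a set by the sum of its values is an assumed reading of "lowest delta encoding values".

```python
from enum import Enum
from typing import Callable, Iterable, List, Optional


class EncodingMethod(Enum):
    """The four encoding method types named in claim 3."""
    INTRA_BLOCK = "intra-block"
    INTER_BLOCK = "inter-block"
    INTRA_BLOCK_COPY = "intra-block-copy"
    PALETTE = "palette"


def select_lowest(
    candidates: Iterable[EncodingMethod],
    delta_fn: Callable[[EncodingMethod], List[int]],
) -> EncodingMethod:
    """Return the candidate whose delta set has the smallest total.

    Assumptions: a set is ranked by the sum of its values, and the first
    candidate wins a tie. delta_fn is a hypothetical callable that produces
    the set of delta encoding values for one candidate.
    """
    best: Optional[EncodingMethod] = None
    best_cost: Optional[int] = None
    for candidate in candidates:
        cost = sum(delta_fn(candidate))
        if best_cost is None or cost < best_cost:
            best, best_cost = candidate, cost
    if best is None:
        raise ValueError("no candidate encoding parameter values supplied")
    return best


# Illustrative placeholder values only; a model-driven encoder would compute
# these from the frame for each candidate.
toy_deltas = {
    EncodingMethod.INTRA_BLOCK: [5, 7, 3],
    EncodingMethod.INTER_BLOCK: [2, 2, 9],
    EncodingMethod.INTRA_BLOCK_COPY: [4, 4, 4],
    EncodingMethod.PALETTE: [1, 6, 8],
}
winner = select_lowest(list(EncodingMethod), toy_deltas.__getitem__)
```

With these placeholder values the intra-block-copy candidate has the smallest total (12), so it is the one selected. Passing the delta computation in as a callable keeps the selection logic independent of how the values are produced.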

Assignees

Ringcentral Inc

Classifications

  • H04N19/136 (Primary)

    Incoming video signal characteristics or properties · CPC title

  • the region being a picture, frame or field · CPC title

  • H04N19/146 (Primary)

    Data rate or code amount at the encoder output · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12526430B2 cover?
It covers a computer-implemented method for optimizing the encoding of video frames. One or more machine learning models generate an encoding parameter value for a received video frame. Delta encoding values based on that value are compared with delta encoding values based on an alternative encoding parameter value, and if the first set is lower, the generated value is selected and used to encode the frame.
Who is the assignee on this patent?
Ringcentral Inc
What technology area does this patent fall under?
Primary CPC classification H04N19/136. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Jan 13, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).