Neural network-based intra prediction for video encoding or decoding

US12335539B2 · US · B2

Patent metadata
Publication number: US-12335539-B2
Application number: US-202117800002-A
Country: US
Kind code: B2
Filing date: Jan 29, 2021
Priority date: Feb 21, 2020
Publication date: Jun 17, 2025
Grant date: Jun 17, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A video coding system is provided that performs intra prediction in a mode using a neural network for blocks of only a specific set of block sizes. The signaling of this mode is designed to be efficient in terms of rate-distortion under this constraint. Different transformations of the context of a block and of the neural network prediction of this block are introduced in order to use one single neural network for predicting blocks of several sizes, as well as the corresponding signaling. The neural network-based prediction mode considers both luminance blocks and chrominance blocks. The video coding system comprises encoder and decoder apparatuses; encoding, decoding, and signal generation methods; and a signal carrying information corresponding to the described coding mode.
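
As a rough illustration of the context and prediction transformations mentioned in the abstract, the sketch below wraps a fixed-shape predictor so that it can serve blocks of other shapes. Everything in it is an assumption made for illustration: the names `predict_with_transforms`, `downsample2`, `upsample2`, and `copy_first_row` are hypothetical, the context is simplified to a rectangular array, 2x2 averaging and nearest-neighbour repetition stand in for whatever down-sampling and interpolation filters the patent actually uses, and `copy_first_row` is a toy stand-in for the neural network.

```python
from typing import Callable, List

Matrix = List[List[float]]


def transpose(m: Matrix) -> Matrix:
    """Swap rows and columns."""
    return [list(row) for row in zip(*m)]


def downsample2(m: Matrix) -> Matrix:
    """Average non-overlapping 2x2 cells (assumes even dimensions)."""
    return [
        [(m[r][c] + m[r][c + 1] + m[r + 1][c] + m[r + 1][c + 1]) / 4.0
         for c in range(0, len(m[0]), 2)]
        for r in range(0, len(m), 2)
    ]


def upsample2(m: Matrix) -> Matrix:
    """Nearest-neighbour interpolation by a factor of 2 in each direction."""
    out: Matrix = []
    for row in m:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def copy_first_row(m: Matrix) -> Matrix:
    """Toy stand-in for the neural network: repeat the first row downwards."""
    return [list(m[0]) for _ in m]


def predict_with_transforms(
    context: Matrix,
    predictor: Callable[[Matrix], Matrix],
    do_transpose: bool = False,
    do_downsample: bool = False,
) -> Matrix:
    """Transform the context, predict, then undo the transforms in reverse."""
    x = context
    if do_downsample:
        x = downsample2(x)
    if do_transpose:
        x = transpose(x)
    pred = predictor(x)
    if do_transpose:
        pred = transpose(pred)      # transpose back
    if do_downsample:
        pred = upsample2(pred)      # interpolate back to full size
    return pred
```

With the toy predictor, transposition turns "repeat the first row downwards" into "repeat the first column rightwards", which is the sense in which one predictor can be reused for a differently oriented block.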

First claim

Opening claim text (preview).

The invention claimed is:

1. A video encoding method comprising: performing intra prediction for at least one block in a picture or video using a neural network based intra prediction by feeding a block context into a neural network selected based on a size of the at least one block, the block context comprising pixels of blocks located at a top side, at a left side, at a diagonal top left side, at a diagonal top right side and at a diagonal bottom left side of the at least one block, wherein the size of the block context is based on the size of the at least one block, and wherein the block context comprises $n_l$ columns and $n_a$ rows, wherein $n_l$ and $n_a$ are selected as: $n_a = \alpha H$, $n_l = \beta W$, $\alpha \in \{\tfrac{1}{4}, \tfrac{1}{2}, \tfrac{3}{4}, 1, 2\}$, $\beta \in \{\tfrac{1}{4}, \tfrac{1}{2}, \tfrac{3}{4}, 1, 2\}$, where $H$ is a height of the block and $W$ is a width of the block; generating signaling information representative that the intra prediction mode is a neural network based intra prediction; and encoding at least information representative of the at least one block and the neural network-based intra prediction mode.

2. The method of claim 1, wherein the signaling information is encoded in a bitstream and comprises a flag indicating that a neural network-based intra prediction mode is selected for the at least one block, the flag being based on a set of flags representing a plurality of intra prediction modes arranged in a binary tree for being encoded in a bitstream and wherein the flag indicating that neural network-based intra prediction mode is selected is located at a first level of the tree and encoded with a single bit.

3. A video decoding method comprising: obtaining, for at least one block in a picture or video, at least information representative that an intra prediction is a neural network-based prediction and a block context, the block context comprising pixels of blocks located at a top side, at a left side, at a diagonal top left side, at a diagonal top right side and at a diagonal bottom left side of the at least one block, wherein the size of the block context is based on the size of the at least one block, and wherein the block context comprises $n_l$ columns and $n_a$ rows, wherein $n_l$ and $n_a$ are selected as: $n_a = \alpha H$, $n_l = \beta W$, $\alpha \in \{\tfrac{1}{4}, \tfrac{1}{2}, \tfrac{3}{4}, 1, 2\}$, $\beta \in \{\tfrac{1}{4}, \tfrac{1}{2}, \tfrac{3}{4}, 1, 2\}$, where $H$ is a height of the block and $W$ is a width of the block; and performing intra prediction for the at least one block in a picture or video by feeding the block context into a neural network based on the size of the at least one block.

4. The method of claim 3, wherein the neural network-based intra prediction is performed based on a position of the at least one block.

5. The method of claim 3, wherein the block context is down-sampled prior to performing the intra prediction and the at least one predicted block resulting from the neural network based intra prediction is interpolated after the intra prediction.

6. The method of claim 3, wherein the block context is transposed prior to performing the intra prediction and the at least one predicted block resulting from the neural network based intra prediction is transposed back after the intra prediction.

7. The method of claim 3, wherein the block context is down-sampled and transposed prior to performing the intra prediction and the at least one predicted block resulting from the neural network based intra prediction is transposed back and interpolated after the intra prediction.

8. The method of claim 3, wherein the neural network-based intra prediction is done in both luminance and chrominance of the at least one block.

9. An apparatus, comprising an encoder for encoding a current block in a picture or video wherein the encoder is configured to: perform intra prediction for at least one block in a picture or video using a neural network based intra prediction by feeding a block context into a neural network selected based on a size of the at least one block, the block context comprising pixels of blocks located at a top side, at a left side, at a diagonal top left side, at a diagonal top right side and at a diagonal bottom left side of the at least one block, wherein the size of the block context is based on the size of
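
The context-size rule in claims 1 and 3 (n_a = αH rows and n_l = βW columns, with α and β each drawn from {1/4, 1/2, 3/4, 1, 2}) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `context_dims` and the constant `ALLOWED_FACTORS` are hypothetical, and rejecting non-integer results is an assumption of this sketch, not something the claim text states.

```python
from fractions import Fraction

# Scaling factors permitted for both alpha and beta in the claim text.
ALLOWED_FACTORS = (
    Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1), Fraction(2),
)


def context_dims(height, width, alpha, beta):
    """Return (n_a, n_l): n_a = alpha * H rows and n_l = beta * W columns."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    if alpha not in ALLOWED_FACTORS or beta not in ALLOWED_FACTORS:
        raise ValueError("alpha and beta must be one of 1/4, 1/2, 3/4, 1, 2")
    n_a = alpha * height
    n_l = beta * width
    # Assumption of this sketch: a pixel count must be a whole number.
    if n_a.denominator != 1 or n_l.denominator != 1:
        raise ValueError("alpha * H and beta * W must be integers")
    return int(n_a), int(n_l)
```

For example, a block with H = 8 and W = 16, using α = 1/2 and β = 1/4, gives `context_dims(8, 16, Fraction(1, 2), Fraction(1, 4)) == (4, 4)`.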

Assignees

Inventors

Classifications

  • Embedding additional information in the video signal during the compression process (H04N19/517, H04N19/68, H04N19/70 take precedence) · CPC title

  • H04N19/176 (Primary)

    the region being a block, e.g. a macroblock · CPC title

  • Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction · CPC title

  • using neural networks · CPC title

  • in combination with predictive coding · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12335539B2 cover?
A video coding system is provided that performs intra prediction in a mode using a neural network for blocks of only a specific set of block sizes. The signaling of this mode is designed to be efficient in terms of rate-distortion under this constraint. Different transformations of the context of a block and the neural network prediction of this block are introduced in order to use one single ne…
Who is the assignee on this patent?
InterDigital Madison Patent Holdings SAS
What technology area does this patent fall under?
Primary CPC classification H04N19/176. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jun 17, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).