Content-adaptive online training method and apparatus for post-filtering
US-2022385896-A1 · Dec 1, 2022 · US
US12413721B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12413721-B2 |
| Application number | US-202418656413-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 6, 2024 |
| Priority date | May 27, 2021 |
| Publication date | Sep 9, 2025 |
| Grant date | Sep 9, 2025 |
Official abstract text for this publication.
Aspects of the disclosure provide methods and apparatuses for video decoding and encoding. The apparatus includes processing circuitry configured to receive an image/video comprising one or more blocks and metadata for a machine task associated with the image/video. The metadata specifies neural network post-filtering characteristics for machine consumption. The processing circuitry decodes a first post-filtering parameter in the image/video corresponding to the one or more blocks to be reconstructed. The first post-filtering parameter applies to a block in the one or more blocks and has been updated by a post-filtering module in a post-filtering neural network (NN) that is trained based on a training dataset and the metadata. The processing circuitry determines the post-filtering NN in a video decoder corresponding to the one or more blocks based on the first post-filtering parameter, and decodes the block based on the determined post-filtering NN corresponding to the block and the metadata.
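The decoder-side flow in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the post-filtering NN is reduced to a toy affine filter (`scale * x + bias`), the signaled "first post-filtering parameter" is modeled as a per-block replacement bias, and the names `PostFilterNN`, `Metadata`, `determine_post_filter`, and `decode_blocks` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

Block = List[List[float]]  # a block as rows of sample values (illustrative)


@dataclass
class PostFilterNN:
    """Toy stand-in for a post-filtering NN: out = scale * x + bias."""
    scale: float = 1.0
    bias: float = 0.0

    def apply(self, block: Block) -> Block:
        return [[self.scale * px + self.bias for px in row] for row in block]


@dataclass
class Metadata:
    """Hypothetical container for machine-task metadata."""
    machine_task: str            # e.g. "object_detection" (assumed label)
    filter_enabled: bool = True  # assumed activation flag


def determine_post_filter(pretrained: PostFilterNN, updated_bias: float) -> PostFilterNN:
    """Build the per-block filter by swapping in the signaled parameter."""
    return PostFilterNN(scale=pretrained.scale, bias=updated_bias)


def decode_blocks(
    reconstructed: List[Block],
    block_params: Dict[int, float],
    pretrained: PostFilterNN,
    metadata: Metadata,
) -> List[Block]:
    """Post-filter reconstructed blocks, using updated parameters where signaled."""
    if not metadata.filter_enabled:
        return reconstructed  # filtering disabled: pass blocks through unchanged
    output = []
    for index, block in enumerate(reconstructed):
        if index in block_params:
            nn = determine_post_filter(pretrained, block_params[index])
        else:
            nn = pretrained  # no update signaled for this block
        output.append(nn.apply(block))
    return output
```

In this sketch, a block with a signaled parameter is filtered with the updated filter, and every other block falls back to the pretrained one. A real post-filtering NN would carry many parameters, of which only a subset would be updated and signaled.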
Opening claim text (preview).
What is claimed is:
1. A method for video decoding in a video decoder, comprising: receiving one of an image and a video comprising one or more blocks and metadata for a machine task associated with the one of the image and the video, the metadata specifying neural network post-filtering characteristics for machine consumption; decoding a first post-filtering parameter in the one of the image and the video corresponding to the one or more blocks to be reconstructed, the first post-filtering parameter applying to at least one of the one or more blocks, and the first post-filtering parameter having been updated by a post-filtering module in a post-filtering neural network (NN) that is trained based on a training dataset and the metadata for the machine task associated with the one of the image and the video; determining the post-filtering NN in the video decoder corresponding to the one or more blocks based on the first post-filtering parameter; and decoding the one or more blocks based on the determined post-filtering NN corresponding to the one or more blocks and the metadata for the machine task associated with the one of the image and the video.
2. The method of claim 1, further comprising: receiving the metadata for the machine task associated with the one of the image and the video in a Supplemental Enhancement Information (SEI) message.
3. The method of claim 2, wherein the SEI message comprises a neural-network post-filter (NNPF) characteristics (NNPFC) SEI message, the NNPFC SEI message specifying characteristics of the post-filtering NN to be used to filter pictures after decoding for the machine task.
4. The method of claim 2, wherein the SEI message comprises a neural-network post-filter activation (NNPFA) SEI message, the NNPFA SEI message specifying whether to enable a post-filter referenced for the one of the image and the video.
5. The method of claim 2, wherein the SEI message comprises an annotated regions SEI message to signal parameters that identify annotated regions using bounding boxes representing a size and a location of a detected object in the one of the image and the video.
6. The method of claim 5, wherein the bounding boxes representing the size and the location of the detected object comprises a rectangular bounding box of the detected object in the one of the image and the video.
7. The method of claim 2, wherein the SEI message comprises an object mask information (OMI) SEI message to signal information indicating a shape of a detected object in the one of the image and the video.
8. The method of claim 2, wherein the SEI message comprises an encoder optimization information (EOI) SEI message to indicate whether the one of the image and the video has been optimized for machine analysis and which types of optimizations have been applied in pre-processing or encoding.
9. A method for video encoding in a video encoder, comprising: determining a first post-filtering parameter corresponding to one or more blocks to be encoded, the first post-filtering parameter applying to at least one of the one or more blocks, and the first post-filtering parameter having been updated by a post-filtering module in a post-filtering neural network (NN) that is trained based on a training dataset and metadata for a machine task associated with one of an image and a video that includes the one or more blocks, the metadata specifying neural network post-filtering characteristics for machine consumption; and encoding the first post-filtering parameter and the metadata for the machine task in the one of the image and the video.
10. The method of claim 9, further comprising signaling the metadata for the machine task associated with the one of the image and the video in a Supplemental Enhancement Information (SEI) message.
11. The method of claim 10, wherein the SEI message comprises a neural-network post-filter (NNPF) characteristics (NNPFC) SEI message, the NNPFC SEI message specifying characteristics of the post-filtering NN to be used to filter pictures after decoding for the machine task.
12. The method of claim 10, wherein the SEI message comprises a neural-network post-filter activation (NNPFA) SEI message, the NNPFA SEI message specifying whether to enable a post-filter referenced for the one of the image and the video.
13. The method of claim 10, wherein the SEI message comprises an annotated regions SEI message to signal parameters that identify annotated regions using bounding boxes representing a size and a location of a detected object in the one of the image and the video.
14. The method of claim 13, wherein the bounding boxes representing the size and the location of the detected object comprises a rectangular bounding box of the detected object in the one of the image and the video.
15. The method of claim 10, wherein the SEI message comprises an object mask information (OMI) SEI message to signal information indicating a shape of a detected object in the one of the image and the video.
16. The method of claim 10, wherein the SEI message comprises an encoder optimization information (EOI) SEI message to indicate whether the one of the image and the video has been optimized for machine analysis and which types of optimizations have been applied in pre-processing or encoding.
17. A method of processing visual media data, the method comprising: processing a bitstream that includes the visual media data according to a format rule, wherein the bitstream includes metadata and a first post-filtering parameter; and the format rule specifies that one of an image and a video includes one or more blocks and the metadata for a machine task associated with the one of the image and the video, the metadata specifies neural network post-filtering characteristics for machine consumption; a first post-filtering parameter in the one of the image and the video corresponding to the one or more blocks to be reconstructed is decoded, the first post-filtering parameter applies to at least one of the one or more blocks, the first post-filtering parameter is updated by a post-filtering module in a post-filtering neural network (NN) that is trained based on a training dataset and the metadata for the machine task associated with the one of the image and the video; the post-filtering NN in a video decoder corresponding to the one or more blocks is determined based on the first post-filtering parameter; and the one or more blocks is decoded based on the determined post-filtering NN corresponding to the one or more blocks and the metadata for the machine task associated with the one of the image and the video.
18. The method of claim 17, wherein the format rule specifies that: the metadata for the machine task associated with the one of the image and the video is received in a Supplemental Enhancement Information (SEI) message.
19. The method of claim 18, wherein the SEI message comprises a neural-network post-filter (NNPF) characteristics (NNPFC) SEI message, the NNPFC SEI message specifying characteristics of the post-filtering NN to be used to filter pictures after decoding for the machine task.
20. The method of claim 18, wherein the SEI message comprises a neural-network post-filter activation (NNPFA) SEI message, the NNPFA SEI message specifying whether to enable a post-filter referenced for the one of the image and the video.
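The dependent claims name five kinds of SEI message that may carry the machine-task metadata. The sketch below shows one hypothetical way a receiver could gather them into a single structure. It is not a bitstream parser and does not follow any standard's syntax: the payloads are plain dictionaries, and the field names (`enabled`, `boxes`, `mask`, `optimized`, `types`) as well as `SeiKind`, `MachineTaskMetadata`, and `collect_metadata` are assumptions introduced only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class SeiKind(Enum):
    """The SEI message kinds named in the claims (labels are illustrative)."""
    NNPFC = "nn_post_filter_characteristics"
    NNPFA = "nn_post_filter_activation"
    ANNOTATED_REGIONS = "annotated_regions"
    OMI = "object_mask_information"
    EOI = "encoder_optimization_information"


@dataclass
class MachineTaskMetadata:
    """Hypothetical aggregate of metadata carried by the SEI messages."""
    filter_characteristics: Optional[dict] = None
    filter_active: bool = False
    bounding_boxes: List[Tuple[int, int, int, int]] = field(default_factory=list)
    object_masks: List[list] = field(default_factory=list)
    optimized_for_machines: Optional[bool] = None
    optimizations: List[str] = field(default_factory=list)


def collect_metadata(messages: List[Tuple[SeiKind, dict]]) -> MachineTaskMetadata:
    """Fold already-parsed (kind, payload) pairs into one metadata record."""
    meta = MachineTaskMetadata()
    for kind, payload in messages:
        if kind is SeiKind.NNPFC:
            # Characteristics of the post-filtering NN for the machine task.
            meta.filter_characteristics = dict(payload)
        elif kind is SeiKind.NNPFA:
            # Whether the referenced post-filter is enabled.
            meta.filter_active = bool(payload["enabled"])
        elif kind is SeiKind.ANNOTATED_REGIONS:
            # Rectangular boxes giving the location and size of detected objects.
            for box in payload["boxes"]:
                meta.bounding_boxes.append((box["x"], box["y"], box["w"], box["h"]))
        elif kind is SeiKind.OMI:
            # Shape of a detected object, modeled here as a nested 0/1 list.
            meta.object_masks.append(payload["mask"])
        elif kind is SeiKind.EOI:
            # Whether, and how, the content was optimized for machine analysis.
            meta.optimized_for_machines = bool(payload["optimized"])
            meta.optimizations = list(payload.get("types", []))
    return meta
```

Keeping the message kinds in an enum and folding them into one record means the post-filter stage can consult a single object regardless of which messages were present.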
characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation (H04N19/635 takes precedence) · CPC title
the region being a block, e.g. a macroblock · CPC title
Incoming video signal characteristics or properties · CPC title
characterised by syntax aspects related to video coding, e.g. related to compression standards · CPC title
Embedding additional information in the video signal during the compression process (H04N19/517, H04N19/68, H04N19/70 take precedence) · CPC title