Feature-domain residual for video coding for machines

US11375204B2 · US · B2

Patent metadata
Publication number: US-11375204-B2
Application number: US-202117218967-A
Country: US
Kind code: B2
Filing date: Mar 31, 2021
Priority date: Apr 7, 2020
Publication date: Jun 28, 2022
Grant date: Jun 28, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: decode encoded data to generate decoded data, the encoded data having a bitrate lower than that of original data, and extract features from the decoded data; decode encoded residual features to generate decoded residual features; and generate enhanced decoded features as a result of combining the decoded residual features with the features extracted from the decoded data.
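
The decoder-side data flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the 1-D list representation, the toy feature extractor, the uniform dequantizer standing in for the residual decoder, and every function name are hypothetical. The combining step is assumed to be element-wise summation, which claim 14 names as one form of combining.

```python
from typing import List

Frame = List[float]     # hypothetical 1-D stand-in for decoded data
Features = List[float]  # hypothetical 1-D stand-in for a feature tensor


def extract_features(frame: Frame) -> Features:
    """Toy feature extractor (assumption): differences between neighbouring samples."""
    return [b - a for a, b in zip(frame, frame[1:])]


def decode_residual(symbols: List[int], step: float) -> Features:
    """Toy residual decoder (assumption): uniform dequantization of integer symbols."""
    return [s * step for s in symbols]


def enhance_features(decoded_frame: Frame, residual_symbols: List[int], step: float) -> Features:
    """Combine decoded residual features with features extracted from the decoded data."""
    base = extract_features(decoded_frame)
    residual = decode_residual(residual_symbols, step)
    if len(base) != len(residual):
        raise ValueError("residual and extracted features must have the same length")
    # Assumed combination: element-wise summation.
    return [f + r for f, r in zip(base, residual)]
```

For example, `enhance_features([1.0, 2.0, 4.0], [2, -1], 0.5)` extracts the features `[1.0, 2.0]`, decodes the residual `[1.0, -0.5]`, and returns their sum.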

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: downsample original data to a lower resolution prior to encoding the original data; encode the downsampled original data with a first codec to generate encoded data with a bitrate lower than that of the original data, and decode the encoded data to generate decoded data; encode the original data with at least one second learned codec to generate encoded residual features and decoded residual features; generate enhanced decoded features as a result of combining the decoded residual features with features extracted from the decoded data generated with the first codec; wherein the decoded data and the enhanced decoded features are configured to be processed or analyzed with at least one machine; generate enhanced decoded video resulting from combining the decoded data with the enhanced decoded features; wherein the at least one machine processes or analyzes the decoded data using the enhanced decoded video.

2. The apparatus of claim 1, wherein the at least one machine comprises at least one task neural network.

3. The apparatus of claim 1, wherein the enhanced decoded video is generated using a neural network.

4. The apparatus of claim 1, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: generate the enhanced decoded video resulting from combining the decoded data with the decoded residual features.

5. The apparatus of claim 4, wherein the enhanced decoded video is generated using a neural network.

6. The apparatus of claim 1, wherein the residual features are encoded using at least one neural network of the at least one second learned codec, and the encoded residual features are decoded using at least one neural network of the at least one second learned codec.

7. The apparatus of claim 1, wherein the features extracted from the decoded data generated with the first codec are extracted using a neural network.

8. The apparatus of claim 1, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: extract features from the original data; extract features from the decoded data; and generate the residual features, prior to being encoded, as a result of computing a difference between the features extracted from the decoded data and the features extracted from the original data.

9. An apparatus comprising: at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: decode encoded data to generate decoded data, the encoded data having a bitrate lower than that of original data, and extract features from the decoded data; wherein the encoded data is generated from downsampling the original data to a lower resolution prior to encoding the original data, and encoding the downsampled original data with a codec; decode encoded residual features to generate decoded residual features; generate enhanced decoded features as a result of combining the decoded residual features with the features extracted from the decoded data; wherein the enhanced decoded features are configured to be processed or analyzed with at least one machine; generate enhanced decoded video as a result of combining the decoded data with the enhanced decoded features; and process or analyze the enhanced decoded video using at least one machine task.

10. The apparatus of claim 9, wherein the at least one machine comprises at least one task neural network.

11. The apparatus of claim 9, wherein the combining of the decoded data with the enhanced decoded features to generate the enhanced decoded video is performed using a neural network; and wherein the at least one machine task used to process or analyze the enhanced decoded video comprises at least one task neural network.

12. The apparatus of claim 9, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: generate the enhanced decoded video as a result of combining the decoded data with the decoded residual features; wherein the combining of the decoded data with the decoded residual features to generate the enhanced decoded video is performed using a neural network; and wherein the at least one machine task used to process or analyze the enhanced decoded video comprises at least one task neural network.

13. The apparatus of claim 9, wherein the features are extracted from the decoded data using a neural network; and wherein the encoded residual features are decoded using a neural network of a learned codec.

14. The apparatus of claim 9, wherein the combining of the decoded residual features with the features extracted from the decoded data to generate the enhanced decoded features is a summation of the decoded residual features and the features extracted from the decoded data.

15. The apparatus of claim 9, wherein the encoded residual features are a difference between features extracted from the original data, and features extracted from preliminary decoded data or the features extracted from the decoded data.

16. The apparatus of claim 9, wherein the decoded residual features are decoded using entropy decoding and dequantization.

17. The apparatus of claim 9, wherein the decoded residual features are decoded using an image of a video decoder, the decoding of the residual features comprising converting decoded feature map images to the decoded residual features.

18. The apparatus of claim 9, wherein the original data is video data.

19. A method comprising: decoding encoded data to generate decoded data, the encoded data having a bitrate lower than that of original data, and extracting features from the decoded data; wherein the encoded data is generated from downsampling the original data to a lower resolution prior to encoding the original data, and encoding the downsampled original data with a codec; decoding encoded residual features to generate decoded residual features; generating enhanced decoded features as a result of combining the decoded residual features with the features extracted from the decoded data; wherein the enhanced decoded features are configured to be processed or analyzed with at least one machine; generate enhanced decoded video as a result of combining the decoded data with the enhanced decoded features; and process or analyze the enhanced decoded video using at least one machine task.

20. The method of claim 19, wherein the at least one machine comprises at least one task neural network.
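
The encoder-to-decoder round trip in claims 1, 8, 14 and 16 can be illustrated with a toy numeric sketch. Everything below is an assumption made for illustration: the 1-D signal, the coarse quantizer standing in for the first codec, the fixed-length feature extractor (mean and range), the uniform quantizer standing in for the learned residual codec, the step sizes, and all function names. Entropy coding is omitted, and the residual sign is chosen so that summation moves the decoded features toward the original features.

```python
from typing import List


def downsample(signal: List[float], factor: int = 2) -> List[float]:
    """Keep every `factor`-th sample (toy stand-in for lowering the resolution)."""
    return signal[::factor]


def first_codec_roundtrip(signal: List[float], step: float = 10.0) -> List[float]:
    """Toy first codec (assumption): coarse quantization, then reconstruction."""
    return [round(x / step) * step for x in signal]


def extract_features(signal: List[float]) -> List[float]:
    """Toy fixed-length features (assumption): mean and range of the signal."""
    return [sum(signal) / len(signal), max(signal) - min(signal)]


def encode_residual(original: List[float], decoded: List[float], step: float = 0.5) -> List[int]:
    """Quantize the difference between original-data and decoded-data features."""
    f_orig = extract_features(original)
    f_dec = extract_features(decoded)
    return [round((o - d) / step) for o, d in zip(f_orig, f_dec)]


def decode_residual(symbols: List[int], step: float = 0.5) -> List[float]:
    """Dequantize residual symbols back to residual features."""
    return [s * step for s in symbols]


def enhance(decoded: List[float], symbols: List[int], step: float = 0.5) -> List[float]:
    """Sum decoded residual features with features extracted from the decoded data."""
    base = extract_features(decoded)
    return [b + r for b, r in zip(base, decode_residual(symbols, step))]


original = [0.0, 3.0, 12.0, 14.0, 21.0, 27.0, 38.0, 41.0]
decoded = first_codec_roundtrip(downsample(original))  # low-bitrate base layer
symbols = encode_residual(original, decoded)           # feature-domain residual
enhanced = enhance(decoded, symbols)                   # decoder-side enhancement
```

In this toy run the features of the decoded base layer are `[17.5, 40.0]`, the original features are `[19.5, 41.0]`, and the residual symbols `[4, 2]` restore them exactly. With real data and a coarser residual quantizer the enhanced features would only approximate the original ones.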

Assignees

Nokia Technologies Oy

Inventors

Classifications

  • Selection of coding mode or of prediction mode · CPC title

  • using pre-processing or post-processing specially adapted for video compression · CPC title

  • using neural networks · CPC title

  • H04N19/146 (Primary)

    Data rate or code amount at the encoder output · CPC title

  • H04N19/50 (Primary)

    using predictive coding (H04N19/61 takes precedence) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11375204B2 cover?
An apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: decode encoded data to generate decoded data, the encoded data having a bitrate lower than that of original data, and extract featur…
Who is the assignee on this patent?
Nokia Technologies Oy
What technology area does this patent fall under?
Primary CPC classification H04N19/146. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jun 28, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).