Hybrid video and feature coding and decoding
US-2021203997-A1 · Jul 1, 2021 · US
US11375204B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11375204-B2 |
| Application number | US-202117218967-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 31, 2021 |
| Priority date | Apr 7, 2020 |
| Publication date | Jun 28, 2022 |
| Grant date | Jun 28, 2022 |
Official abstract text for this publication.
An apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: decode encoded data to generate decoded data, the encoded data having a bitrate lower than that of original data, and extract features from the decoded data; decode encoded residual features to generate decoded residual features; and generate enhanced decoded features as a result of combining the decoded residual features with the features extracted from the decoded data.
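The decoder-side flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `extract_features` is a hypothetical stand-in (an adjacent-pair average) for whatever feature extractor is actually used, the decoded data and decoded residual features are plain lists of floats, and "combining" is taken to be element-wise summation, which claim 14 names as one option.

```python
from typing import List


def extract_features(decoded: List[float]) -> List[float]:
    """Hypothetical feature extractor: mean of each adjacent pair of samples."""
    return [(a + b) / 2.0 for a, b in zip(decoded, decoded[1:])]


def enhance_features(decoded: List[float], decoded_residual: List[float]) -> List[float]:
    """Combine features of the decoded data with decoded residual features.

    Assumption: the combination is an element-wise sum.
    """
    features = extract_features(decoded)
    if len(features) != len(decoded_residual):
        raise ValueError("residual and feature lengths differ")
    return [f + r for f, r in zip(features, decoded_residual)]


# Toy inputs standing in for the outputs of the two decoders.
decoded_data = [1.0, 3.0, 5.0, 7.0]
decoded_residual = [0.5, -0.5, 0.25]

enhanced = enhance_features(decoded_data, decoded_residual)
print(enhanced)
```

The residual branch only has to carry what the low-bitrate decoded data is missing at the feature level, which is why the sketch adds a small correction to features computed from the decoded data rather than transmitting full features.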
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: downsample original data to a lower resolution prior to encoding the original data; encode the downsampled original data with a first codec to generate encoded data with a bitrate lower than that of the original data, and decode the encoded data to generate decoded data; encode the original data with at least one second learned codec to generate encoded residual features and decoded residual features; generate enhanced decoded features as a result of combining the decoded residual features with features extracted from the decoded data generated with the first codec; wherein the decoded data and the enhanced decoded features are configured to be processed or analyzed with at least one machine; generate enhanced decoded video resulting from combining the decoded data with the enhanced decoded features; wherein the at least one machine processes or analyzes the decoded data using the enhanced decoded video.
2. The apparatus of claim 1, wherein the at least one machine comprises at least one task neural network.
3. The apparatus of claim 1, wherein the enhanced decoded video is generated using a neural network.
4. The apparatus of claim 1, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: generate the enhanced decoded video resulting from combining the decoded data with the decoded residual features.
5. The apparatus of claim 4, wherein the enhanced decoded video is generated using a neural network.
6. The apparatus of claim 1, wherein the residual features are encoded using at least one neural network of the at least one second learned codec, and the encoded residual features are decoded using at least one neural network of the at least one second learned codec.
7. The apparatus of claim 1, wherein the features extracted from the decoded data generated with the first codec are extracted using a neural network.
8. The apparatus of claim 1, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: extract features from the original data; extract features from the decoded data; and generate the residual features, prior to being encoded, as a result of computing a difference between the features extracted from the decoded data and the features extracted from the original data.
9. An apparatus comprising: at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: decode encoded data to generate decoded data, the encoded data having a bitrate lower than that of original data, and extract features from the decoded data; wherein the encoded data is generated from downsampling the original data to a lower resolution prior to encoding the original data, and encoding the downsampled original data with a codec; decode encoded residual features to generate decoded residual features; generate enhanced decoded features as a result of combining the decoded residual features with the features extracted from the decoded data; wherein the enhanced decoded features are configured to be processed or analyzed with at least one machine; generate enhanced decoded video as a result of combining the decoded data with the enhanced decoded features; and process or analyze the enhanced decoded video using at least one machine task.
10. The apparatus of claim 9, wherein the at least one machine comprises at least one task neural network.
11. The apparatus of claim 9, wherein the combining of the decoded data with the enhanced decoded features to generate the enhanced decoded video is performed using a neural network; and wherein the at least one machine task used to process or analyze the enhanced decoded video comprises at least one task neural network.
12. The apparatus of claim 9, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: generate the enhanced decoded video as a result of combining the decoded data with the decoded residual features; wherein the combining of the decoded data with the decoded residual features to generate the enhanced decoded video is performed using a neural network; and wherein the at least one machine task used to process or analyze the enhanced decoded video comprises at least one task neural network.
13. The apparatus of claim 9, wherein the features are extracted from the decoded data using a neural network; and wherein the encoded residual features are decoded using a neural network of a learned codec.
14. The apparatus of claim 9, wherein the combining of the decoded residual features with the features extracted from the decoded data to generate the enhanced decoded features is a summation of the decoded residual features and the features extracted from the decoded data.
15. The apparatus of claim 9, wherein the encoded residual features are a difference between features extracted from the original data, and features extracted from preliminary decoded data or the features extracted from the decoded data.
16. The apparatus of claim 9, wherein the decoded residual features are decoded using entropy decoding and dequantization.
17. The apparatus of claim 9, wherein the decoded residual features are decoded using an image of a video decoder, the decoding of the residual features comprising converting decoded feature map images to the decoded residual features.
18. The apparatus of claim 9, wherein the original data is video data.
19. A method comprising: decoding encoded data to generate decoded data, the encoded data having a bitrate lower than that of original data, and extracting features from the decoded data; wherein the encoded data is generated from downsampling the original data to a lower resolution prior to encoding the original data, and encoding the downsampled original data with a codec; decoding encoded residual features to generate decoded residual features; generating enhanced decoded features as a result of combining the decoded residual features with the features extracted from the decoded data; wherein the enhanced decoded features are configured to be processed or analyzed with at least one machine; generate enhanced decoded video as a result of combining the decoded data with the enhanced decoded features; and process or analyze the enhanced decoded video using at least one machine task.
20. The method of claim 19, wherein the at least one machine comprises at least one task neural network.
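The encoder-side steps recited in claims 1, 8, 14 and 16 (downsample, encode and decode with a first codec, compute a feature residual, code it, then sum) can be walked through on a toy 1-D signal. Everything concrete here is an assumption made for illustration: integer rounding stands in for the lossy first codec, nearest-neighbour upsampling is added so both feature sets have the same length, a uniform quantizer with step `STEP` stands in for the learned residual codec, entropy coding is omitted, and the residual is taken as original-minus-decoded so that summation restores the original features.

```python
from typing import List

STEP = 0.5  # hypothetical quantization step for the residual branch


def downsample(x: List[float]) -> List[float]:
    return x[::2]  # keep every second sample


def first_codec_roundtrip(x: List[float]) -> List[float]:
    # Lossy stand-in for encode + decode with the first codec.
    return [float(round(v)) for v in x]


def upsample(x: List[float]) -> List[float]:
    return [v for v in x for _ in range(2)]  # nearest-neighbour repeat


def extract_features(x: List[float]) -> List[float]:
    # Hypothetical feature extractor: mean of each adjacent pair.
    return [(a + b) / 2.0 for a, b in zip(x, x[1:])]


def encode_residual(residual: List[float]) -> List[int]:
    return [round(v / STEP) for v in residual]  # quantize to integer symbols


def decode_residual(symbols: List[int]) -> List[float]:
    return [s * STEP for s in symbols]  # dequantize


original = [1.2, 2.9, 4.1, 6.8]

# Base branch: low-bitrate representation of the downsampled signal.
decoded = upsample(first_codec_roundtrip(downsample(original)))

# Residual branch: what the decoded features are missing.
feat_orig = extract_features(original)
feat_dec = extract_features(decoded)
residual = [o - d for o, d in zip(feat_orig, feat_dec)]
symbols = encode_residual(residual)

# Decoder side: dequantize and sum with features of the decoded data.
enhanced = [d + r for d, r in zip(feat_dec, decode_residual(symbols))]
print(symbols, enhanced)
```

With a uniform quantizer, each enhanced feature differs from the corresponding original feature by at most half of `STEP`, so in this sketch the step size is the knob that trades residual bitrate against feature fidelity.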
Selection of coding mode or of prediction mode · CPC title
using pre-processing or post-processing specially adapted for video compression · CPC title
using neural networks · CPC title
Data rate or code amount at the encoder output · CPC title
using predictive coding (H04N19/61 takes precedence) · CPC title