US9973779B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9973779-B2 |
| Application number | US-201314378955-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 6, 2013 |
| Priority date | Mar 12, 2012 |
| Publication date | May 15, 2018 |
| Grant date | May 15, 2018 |
Official abstract text for this publication.
A sequence of 3D VDR images and 3D SDR images are encoded using a monoscopic SDR base layer and one or more enhancement layers. A first VDR view and a first SDR view are encoded with a DVDL encoder to output first and second coded signals. A predicted 3D VDR signal is generated, which has first and second predicted VDR views. First and second VDR residuals are generated based on their respective VDR views and predicted VDR views. A DVDL encoder encodes the first and second VDR residuals to output third and fourth coded signals. A 3D VDR decoder, which has two DVDL decoders and SDR-to-VDR predictors, uses the four coded input signals to generate a single-view SDR, 3D SDR, single-view VDR, or 3D VDR signal. A corresponding decoder is also described, which is capable of decoding these encoded 3D VDR and SDR images.
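The data flow in the abstract can be illustrated with a minimal sketch, using the signal assignment spelled out in claim 1 (two SDR views coded first, then two VDR residuals). Everything named here is an assumption made for illustration: `tone_map`, `predict_vdr`, `lossy_codec`, the `PEAK` constant, and the dictionary keys are hypothetical, and uniform quantization merely stands in for the DVDL encoder and decoder, which the patent does not reduce to anything this simple.

```python
from typing import Dict, List

Frame = List[float]  # one view, flattened to a list of sample values

PEAK = 4.0  # hypothetical VDR peak relative to an SDR peak of 1.0


def tone_map(vdr: Frame) -> Frame:
    """Hypothetical VDR-to-SDR mapping: scale, then clip to [0, 1]."""
    return [min(max(x / PEAK, 0.0), 1.0) for x in vdr]


def predict_vdr(sdr: Frame) -> Frame:
    """Hypothetical SDR-to-VDR predictor: undo the scaling."""
    return [x * PEAK for x in sdr]


def lossy_codec(frame: Frame, step: float) -> Frame:
    """Stand-in for an encode/decode round trip: uniform quantization."""
    return [round(x / step) * step for x in frame]


def encode(vdr_view1: Frame, vdr_view2: Frame,
           sdr_step: float = 1 / 255, res_step: float = 1 / 64) -> Dict[str, Frame]:
    """Produce four coded signals: two SDR views and two VDR residuals."""
    coded: Dict[str, Frame] = {}
    for name, vdr in (("view1", vdr_view1), ("view2", vdr_view2)):
        # SDR view as the decoder will see it (after the lossy round trip).
        sdr = lossy_codec(tone_map(vdr), sdr_step)
        # The prediction is formed from the decoded SDR, so the encoder
        # and the decoder work from the same predicted VDR view.
        predicted = predict_vdr(sdr)
        residual = [v - p for v, p in zip(vdr, predicted)]
        coded["sdr_" + name] = sdr
        coded["residual_" + name] = lossy_codec(residual, res_step)
    return coded


def decode_vdr(coded: Dict[str, Frame], name: str) -> Frame:
    """Rebuild one VDR view as predicted view plus decoded residual."""
    predicted = predict_vdr(coded["sdr_" + name])
    return [p + r for p, r in zip(predicted, coded["residual_" + name])]
```

In this sketch the reconstruction error of a VDR view is bounded by the residual quantization step alone, because the same decoded SDR view drives the prediction on both sides. Samples above `PEAK` are clipped in the SDR view and carried entirely by the residual.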
Opening claim text (preview).
What is claimed is:
1. A method for generating a multi-layer encoded video bitstream, wherein the multi-layer encoded video bitstream comprises a base layer and one or more enhancement layers, the method comprising: receiving a first 3D video signal having a first dynamic range, the first 3D video signal comprising a first view having the first dynamic range and a second view having the first dynamic range; generating, based at least in part on the first 3D video signal, a second 3D video signal having a second dynamic range, the second 3D video signal comprising a third view having the second dynamic range and a fourth view having the second dynamic range, wherein the first dynamic range is a higher dynamic range than the second dynamic range; encoding the third view and the fourth view using a first encoder to output a first coded video signal having the second dynamic range and a second coded video signal having the second dynamic range; generating a predicted 3D video signal having the first dynamic range based on interpolated decoded images of the second 3D video signal, wherein the predicted 3D video signal comprises a first predicted view having the first dynamic range and a second predicted view having the first dynamic range, wherein an interpolation process selects an optimal filter to minimize the prediction error between the first 3D video signal and the interpolated decoded images; generating a first residual signal having the first dynamic range based on the first view and the first predicted view; generating a second residual signal having the first dynamic range based on the second view and the second predicted view; and encoding the first residual signal and the second residual signal using a second encoder to output a first coded residual signal and a second coded residual signal; inserting the first coded video signal into the base layer of the multi-layer encoded video bitstream; and inserting the second coded video signal, the first coded residual signal, and the second coded residual signal into one or more of the enhancement layers of the multi-layer encoded video bitstream; wherein the multi-layer encoded video bitstream is encoded in a multi-layer structure which (a) allows a monoscopic display device of the second dynamic range to retrieve single-view images of the second dynamic range from the multi-layer structure for displaying, (b) allows a stereoscope display device of the second dynamic range to retrieve both first-view and second-view images of the second dynamic range from the multi-layer structure for displaying, and (c) allows a stereoscopic display device of the first dynamic range to retrieve both first-view and second-view images of the first dynamic range from the multi-layer structure for displaying.
2. The method as recited in claim 1, wherein the second 3D video signal is derived from the 3D video signal having the first dynamic range using a mapping function from the first dynamic range to the second dynamic range.
3. The method as recited in claim 1, wherein generating the predicted 3D video signal comprises: decoding the first coded video signal and the second coded video signal using a decoder to output a first decoded view having the second dynamic range and a second decoded view having the second dynamic range; applying a first predictor from the second dynamic range to the first dynamic range to the first decoded view to generate the first predicted view; and applying a second predictor from the second dynamic range to the first dynamic range to the second decoded view to generate the second predicted view.
4. The method as recited in claim 1, wherein the first and/or second encoder comprises an H.264 multi-view encoder.
5. The method as recited in claim 1, wherein the steps of encoding comprise: encoding a first video signal using a base layer encoder to generate a first resulting video signal; generating a plurality of base layer reference frames based on the first resulting video signal; generating a plurality of enhancement layer reference frames based on the plurality of base layer reference frames; encoding a second video signal using an enhancement layer encoder to generate a second resulting video signal, wherein the enhancement layer encoder may use reference frames from both the second video signal or the plurality of enhancement layer frames; wherein, if the encoding is performed by the first encoder, the first video signal represents the third view, the second video signal represents the fourth view, the first resulting video signal is the first coded video signal, and the second resulting video signal is the second coded video signal; and wherein, if the encoding is performed by the second encoder, the first video signal is the first residual signal, the second video signal is the second residual signal, the first resulting video signal is the first coded residual signal, and the second resulting video signal is the second coded residual signal.
6. The method as recited in claim 5, wherein generating the plurality of enhancement layer reference frames based on the plurality of base layer reference frames comprises using a reference processing unit to process information from the base layer encoder before utilizing this information as a potential predictor for the enhancement layer in the enhancement layer encoder (230), and wherein the information related to the processing by the reference processing unit is inserted as metadata into the multi-layer encoded video bitstream.
7. An apparatus comprising a processor and configured to perform the method recited in claim 1.
8. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for executing a method in accordance with claim 1.
9. The method as recited in claim 1, wherein the first view and the second view represent a pair of left and right views, and wherein the third view and the fourth view represent a second pair of left and right views.
10. A method for decoding a coded 3D video signal having a first dynamic range, the method comprising, for a first coded video signal having a second dynamic range, wherein the first coded video signal represents a base layer of the coded 3D video signal, and wherein the first dynamic range is a higher dynamic range than the second dynamic range, and for a second coded video signal having the second dynamic range, a first coded residual signal, and a second coded residual signal representing one or more enhancement layers of the coded 3D video signal: decoding the first coded video signal and the second coded video signal using a first decoder to generate a first view video signal having the second dynamic range and a second view video signal having the second dynamic range; generating a first view base layer video signal having the first dynamic range based on the first view video signal; generating a second view base layer video signal having the first dynamic range based on the second view video signal; decoding the first coded residual signal and the second coded residual signal using a second decoder to generate a first view residual signal having the first dynamic range and a second view residual signal having the first dynamic range; wherein the first coded residual signal is generated by a video encoder based in part on a first predicted view in a predicted 3D video signal having the first dynamic range, wherein the second coded residual signal is generated by the video encoder based in part on a second predicted view in the predicted 3D video signal having the first dynamic range; wherein the predicted 3D video signal having the first dynamic range is generated by the video encoder bas
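The layer structure recited at the end of claim 1 can be read as a lookup from display capability to the coded signals that display needs. The sketch below is one such reading and not the patent's bitstream syntax: the `Display` enum, the layer names, and the `extract` helper are all hypothetical. The single-view VDR case comes from the abstract and does not appear in items (a) to (c) of claim 1, and pairing it with only the first residual is an assumption.

```python
from enum import Enum
from typing import Dict, Tuple


class Display(Enum):
    MONO_SDR = "single-view SDR"
    STEREO_SDR = "3D SDR"
    MONO_VDR = "single-view VDR"
    STEREO_VDR = "3D VDR"


# Hypothetical layer names. Per claim 1, the first coded video signal sits
# in the base layer; the other three sit in one or more enhancement layers.
BASE = "coded_sdr_view1"
ENH_SDR2 = "coded_sdr_view2"
ENH_RES1 = "coded_residual_view1"
ENH_RES2 = "coded_residual_view2"

LAYERS_FOR_DISPLAY: Dict[Display, Tuple[str, ...]] = {
    Display.MONO_SDR: (BASE,),                               # claim 1, item (a)
    Display.STEREO_SDR: (BASE, ENH_SDR2),                    # claim 1, item (b)
    Display.MONO_VDR: (BASE, ENH_RES1),                      # assumption, from the abstract
    Display.STEREO_VDR: (BASE, ENH_SDR2, ENH_RES1, ENH_RES2),  # claim 1, item (c)
}


def extract(bitstream: Dict[str, bytes], display: Display) -> Dict[str, bytes]:
    """Return only the coded signals that the given display type needs."""
    needed = LAYERS_FOR_DISPLAY[display]
    missing = [name for name in needed if name not in bitstream]
    if missing:
        raise KeyError(f"bitstream lacks layers: {missing}")
    return {name: bitstream[name] for name in needed}
```

A monoscopic SDR device would call `extract(bitstream, Display.MONO_SDR)` and ignore every enhancement layer, which is the backward-compatibility property the claim describes.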
specially adapted for multi-view video sequence encoding · CPC title
using pre-processing or post-processing specially adapted for video compression · CPC title
involving reduction of coding artifacts, e.g. of blockiness · CPC title
using hierarchical techniques, e.g. scalability (H04N19/63 takes precedence) · CPC title
Aspects relating to the "2D+depth" image format · CPC title