Techniques of multi-hypothesis motion compensation
US-2023007272-A1 · Jan 5, 2023 · US
US12341971B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12341971-B2 |
| Application number | US-202318157360-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 20, 2023 |
| Priority date | Jan 31, 2022 |
| Publication date | Jun 24, 2025 |
| Grant date | Jun 24, 2025 |
Official abstract text for this publication.
Techniques are disclosed for generating virtual reference frames that may be used for prediction of input video frames. The virtual reference frames may be derived from already-coded reference frames and thereby incur reduced signaling overhead. Moreover, signaling of virtual reference frames may be avoided until an encoder selects the virtual reference frame as a prediction reference for a current frame. In this manner, the techniques proposed herein contribute to improved coding efficiencies.
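The deferred-signaling idea in the abstract can be illustrated with a minimal sketch. It assumes frames are small 2-D lists of luma samples, that the prediction search minimizes a whole-frame sum of absolute differences, and that the virtual reference is a plain per-sample average of two stored references. None of these choices come from the patent; the names `sad`, `average_frames`, `Decision`, and `prediction_search` are hypothetical, and the averaging is only a stand-in for the motion-based derivation described in the claims.

```python
from dataclasses import dataclass
from typing import List, Sequence

Frame = List[List[int]]  # rows of luma samples (hypothetical representation)


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences between two equally sized frames."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def average_frames(a: Frame, b: Frame) -> Frame:
    """Stand-in virtual reference: rounded per-sample average of two frames."""
    return [[(x + y + 1) // 2 for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


@dataclass
class Decision:
    ref_index: int        # index into the list [stored refs..., virtual ref]
    uses_virtual: bool    # True when the virtual reference won the search
    signal_virtual: bool  # virtual-frame data is sent only in that case
    cost: int             # SAD of the winning candidate


def prediction_search(current: Frame, refs: Sequence[Frame], virtual: Frame) -> Decision:
    """Search stored and virtual references; defer signaling until selected."""
    candidates = list(refs) + [virtual]
    costs = [sad(current, c) for c in candidates]
    # min() keeps the first minimum, so stored references win ties and the
    # virtual frame is signaled only when it is strictly better.
    best = min(range(len(candidates)), key=costs.__getitem__)
    is_virtual = best == len(refs)
    return Decision(best, is_virtual, is_virtual, costs[best])


ref0 = [[0, 0], [0, 0]]
ref1 = [[10, 10], [10, 10]]
virtual = average_frames(ref0, ref1)

halfway = prediction_search([[5, 5], [5, 5]], [ref0, ref1], virtual)
static = prediction_search([[0, 0], [0, 0]], [ref0, ref1], virtual)
print(halfway.signal_virtual, static.signal_virtual)
```

In this toy case the frame that lies halfway between the two references selects the virtual reference, so its data would be signaled, while the frame identical to `ref0` selects a stored reference and nothing about the virtual frame needs to be sent.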
Opening claim text (preview).
We claim:
1. A video coding method, comprising: predictively coding input frames, when a coded input frame is designated as a reference frame, decoding the coded data of the reference frame; storing the decoded reference frame data for use as a prediction reference of subsequently-coded input frame; and generating data of a virtual reference frame from a pair of stored reference frames, the generating comprising: for a first spatial portion of the virtual reference frame that is traversed by a first motion vector associated with one of the pair of stored reference frames, predicting content for the first spatial portion from content referenced by the motion vector, and for a second spatial portion of the virtual reference frame that is not traversed by any motion vector associated with the pair of stored reference frames, predicting content for the second spatial portion from content identified by a predicted motion vector for the second spatial portion; wherein the predictive coding of an input frame includes a prediction search from among the reference frame data and virtual reference frame data.
2. The method of claim 1, further comprising: when the prediction search selects the virtual reference frame, outputting to a decoder data representing the virtual reference frame and data representing the input frame predictively coded with reference to the virtual reference frame.
3. The method of claim 1, wherein when a virtual reference frame is selected by no prediction search, data representing the virtual reference frame is not output to a decoder.
4. The method of claim 1, wherein a first reference frame of the pair has a temporal position on a first side of the temporal position of the virtual reference frame, and a second reference frame of the pair has a temporal position on a second side of the temporal position of the virtual reference frame.
5. The method of claim 1, wherein, when a current pixel block of an input frame is coded predictively with respect to the virtual reference frame, a motion vector of the pixel block is derived with reference to motion vectors of other pixel blocks neighboring the current pixel block that use a common set of reference frames for prediction as the current pixel block.
6. The method of claim 1, further comprising providing data representing the virtual reference frame to a channel, including temporal interpolated mode identifier indicating a decoder usage of the virtual reference frame.
7. The method of claim 6 wherein the temporal interpolated mode identifier takes one of the following states: a first state indicating that the decoder shall use the virtual reference frame as a reference frame; a second state indicating that the decoder shall output the virtual reference frame for display, and a third state indicating that the decoder shall output the virtual reference frame for display enhanced by additional information supplied by an encoder.
8. The method of claim 1, further comprising providing data identifying a mode of the predictive coding of the input frame.
9. The method of claim 8, wherein the predictive coding mode information takes one of the following states: a No_Skip state indicating that the predictive coding generates block level motion information and residual information of coded input frame content, a Full_Skip state indicating that the predictive coding uses direct motion vector interpolation without use of supplementary coding data, and a Semi_Skip state indicating that the predictive coding uses direct motion vector interpolation and includes supplementary coding data.
10. The method of claim 1, further comprising, when a pixel block of the input frame is predictively coded with reference to the virtual reference frame and motion vectors obtained from the predictive coding are smaller than a threshold value, transmitting coded data of the pixel block with a syntax element identifying the motion vectors as having zero values.
11. The method of claim 1, further comprising, wherein the prediction search of a pixel block of the input frame is constrained to a predetermined search window about a collocated location of the virtual reference frame.
12. The method of claim 1, wherein content of the virtual reference frame is generated an optical flow motion vector refinement technique.
13. The method of claim 1, wherein the predicted motion vector is predicted from motion of a content element from a first one of the pair of reference frames to another reference frame, and motion of the content element from the other reference frame to a second one of the pair of reference frames.
14. The method of claim 1, wherein the prediction motion vector is predicted from a motion vector of a third spatial portion of the virtual reference frame proximate to the second spatial portion of the reference frame.
15. Non-transitory computer readable medium having program instruction stored thereon that, when executed by a processing device, causes the processing device to: predictively code input frames, when a coded input frame is designated as a reference frame, decode the coded data of the reference frame, store the decoded reference frame data for use as a prediction reference of subsequently-coded input frame, generate data of a virtual reference frame from a pair of stored reference frames according to: for a first spatial portion of the virtual reference frame that is traversed by a first motion vector associated with one of the pair of stored reference frames, predicting content for the first spatial portion from content referenced by the motion vector, and for a second spatial portion of the virtual reference frame that is not traversed by any motion vector associated with the pair of stored reference frames, predicting content for the second spatial portion from content identified by a predicted motion vector for the second spatial portion; wherein the predictive coding of an input frame includes a prediction search from among the reference frame data and virtual reference frame data.
16. An encoding terminal, comprising: a video encoder having an input for source video; a video decoder having an input for coded video from the video encoder; a reference picture buffer to store decoded reference frames output from the video decoder; a virtual reference picture generator having an input for a pair of reference frames from the reference picture buffer and having on output for data of a virtual reference frame, wherein: for a first spatial portion of the virtual reference frame that is traversed by a first motion vector associated with one of the pair of stored reference frames, the first spatial portion is predicted from content referenced by the motion vector, and for a second spatial portion of the virtual reference frame that is not traversed by any motion vector associated with the pair of stored reference frames, the second spatial portion is predicted from content identified by a predicted motion vector for the second spatial portion; and a virtual reference picture buffer having an input for virtual reference frames output by the virtual reference picture generator.
17. The terminal of claim 16, further comprising a predictor having inputs for reference frames from the reference picture buffer and for virtual reference frames from the virtual reference picture buffer.
18. The terminal of claim 16, wherein a first reference frame of the pair has a temporal position on a first side of the temporal position of the virtual reference frame, and a second r
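The two-case generation step recited in claims 1, 15, and 16 can be sketched in one dimension. This is an illustrative simplification, not the patented procedure: it assumes each frame is a list of block values, that the second stored reference carries one block-unit motion vector per block pointing into the first stored reference (`None` for blocks without a vector), that the virtual frame sits at the fraction `num/den` of the way along each vector, and that a hole borrows the vector of its nearest traversed neighbor, which is one possible reading of claim 14. The function name `generate_virtual_reference` and all parameter names are hypothetical.

```python
from typing import List, Optional, Tuple


def generate_virtual_reference(
    ref0: List[int],
    ref1_mvs: List[Optional[int]],
    num: int = 1,
    den: int = 2,
) -> Tuple[List[int], List[bool]]:
    """Build a 1-D virtual reference frame from ref0 and ref1's motion vectors.

    ref1_mvs[x] = d means block x of the second reference was predicted from
    block x + d of ref0. Returns (virtual frame, per-block traversed flags).
    """
    n = len(ref0)

    def clamp(i: int) -> int:
        return max(0, min(n - 1, i))

    # Offset from each virtual block to the ref0 block it should copy.
    to_ref0: List[Optional[int]] = [None] * n

    # Case 1: blocks traversed by a motion vector. Project each vector
    # through the virtual frame and record where it lands in ref0.
    for x, d in enumerate(ref1_mvs):
        if d is None:
            continue
        p = x + (d * num) // den  # block of the virtual frame that is crossed
        if 0 <= p < n and to_ref0[p] is None:  # first vector to arrive wins
            to_ref0[p] = clamp(x + d) - p

    traversed = [v is not None for v in to_ref0]
    filled = [q for q in range(n) if traversed[q]]

    # Case 2: blocks no vector traverses. Predict a vector from the nearest
    # traversed block, or fall back to zero motion when none exists.
    for p in range(n):
        if traversed[p]:
            continue
        if filled:
            nearest = min(filled, key=lambda q: abs(q - p))
            to_ref0[p] = to_ref0[nearest]
        else:
            to_ref0[p] = 0

    virtual = [ref0[clamp(p + to_ref0[p])] for p in range(n)]
    return virtual, traversed


ref0 = [10, 20, 30, 40, 50, 60]
mvs = [0, 0, -2, -2, None, None]
virtual, traversed = generate_virtual_reference(ref0, mvs)
print(virtual)    # [10, 20, 20, 30, 40, 50]
print(traversed)  # [True, True, True, False, False, False]
```

Blocks 0 to 2 are crossed by a projected vector and copy the content that vector references. Blocks 3 to 5 are holes, so they inherit the offset of block 2 and copy from one block to the left in `ref0`.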
the region being a block, e.g. a macroblock · CPC title
characterised by syntax aspects related to video coding, e.g. related to compression standards · CPC title
in combination with predictive coding · CPC title
Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability · CPC title
Motion compensation with bidirectional frame interpolation, i.e. using B-pictures · CPC title