Methods and apparatus of motion vector rounding, clipping and storage for inter prediction
US-2024333960-A1 · Oct 3, 2024 · US
US9648346B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9648346-B2 |
| Application number | US-49177509-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 25, 2009 |
| Priority date | Jun 25, 2009 |
| Publication date | May 9, 2017 |
| Grant date | May 9, 2017 |
Official abstract text for this publication.
Multi-view video that is being streamed to a remote device in real time may be encoded. Frames of a real-world scene captured by respective video cameras are received for compression. A virtual viewpoint, positioned relative to the video cameras, is used to determine expected contributions of individual portions of the frames to a synthesized image of the scene from the viewpoint position using the frames. For each frame, compression rates for individual blocks of a frame are computed based on the determined contributions of the individual portions of the frame. The frames are compressed by compressing the blocks of the frames according to their respective determined compression rates. The frames are transmitted in compressed form via a network to a remote device, which is configured to render the scene using the compressed frames.
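The abstract's step of turning per-block contributions into per-block compression rates can be illustrated with a minimal sketch. It assumes each block already has a contribution weight between 0 and 1, and that the "compression rate" is expressed as a quantization parameter (QP), where a higher QP means coarser compression. The function name `block_qps`, the linear mapping, and the default QP bounds are hypothetical choices for illustration; the patent text shown here does not specify them.

```python
def block_qps(contributions, qp_min=22, qp_max=40):
    """Map per-block contribution weights (0..1) to quantization parameters.

    Assumption: blocks that contribute more to the synthesized view get a
    lower QP (finer quantization); blocks that contribute little get a
    higher QP (coarser quantization). The mapping is linear by choice.
    """
    qps = []
    for row in contributions:
        qp_row = []
        for w in row:
            w = min(max(w, 0.0), 1.0)  # clamp out-of-range weights
            qp_row.append(round(qp_max - w * (qp_max - qp_min)))
        qps.append(qp_row)
    return qps


# A 2x2 grid of blocks: one unseen, one fully visible, one half-weighted,
# and one with an out-of-range weight that is clamped to 1.0.
frame_weights = [[0.0, 1.0], [0.5, 2.0]]
frame_qps = block_qps(frame_weights)
```

A real encoder would feed values like these into its rate-control interface; the linear mapping here only shows the direction of the relationship.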
Opening claim text (preview).
The invention claimed is:

1. Computer-readable storage hardware storing information to enable one or more devices that process and transmit multi-view video to perform a method, the method comprising: receiving video streams of a real world scene, the video streams concurrently captured by respective cameras comprising a first camera and a second camera, the video streams comprising a first video stream captured by the first camera and a second video stream captured by the second camera; receiving different viewpoints corresponding to positions of viewing a rendering of the multi-view video at respective times; computing first compression rates for the first video stream and second compression rates for the second video stream by modelling in at least two dimensions the viewpoints relative to the positions and directions of the first and second cameras and modelling the positions and directions of the first and second cameras relative to each other, the first compression rates corresponding to the viewpoints, respectively, and the second compression rates corresponding to the viewpoints, respectively, wherein the viewpoints correspond to positions of a remote viewer; compressing the first video stream according to the first compression rates, and compressing the second video stream according to the second compression rates, respectively, wherein the multi-view video comprises the compressed first video stream and the compressed second video stream; and transmitting the compressed first video stream and the compressed second video stream, which comprise the multi-view video, via a network to a remote terminal that receives the video streams, synthesizes the video streams into a synthetic video stream, and displays the synthetic video, wherein the displayed synthetic video stream comprises the rendering of the multi-view video.

2. Computer-readable storage hardware according to claim 1, wherein the viewpoints are computed based on averages of positions of the remote viewer relative to a display of the remote terminal.

3. Computer-readable storage hardware according to claim 1, wherein the positions of the remote viewer comprise predicted positions of the remote viewer.

4. Computer-readable storage hardware according to claim 3, wherein the predicted positions are based on indicia of past positions of the remote viewer.

5. Computer-readable storage hardware according to claim 1, wherein the viewpoints are based on average viewer positions near a display of the remote terminal that is displaying the synthetic video stream.

6. Computer-readable storage hardware according to claim 1, wherein the viewpoints are based on geometry of a physical area where the multi-view video is being displayed.

7. Computer-readable storage hardware according to claim 1, wherein the viewpoints are computed based on received information about sensed locations of the remote viewer at the remote terminal and based on information about time that will pass before displaying the synthetic video stream by the remote terminal.

8. Computer-readable storage hardware according to claim 7, wherein the information about the locations of the remote viewer includes information indicating movement of the remote viewer.

9. Computer-readable storage hardware according to claim 1, wherein the method further comprises computing a mesh from a depth map of the real world scene, using the depth map to identify potential occlusions, and computing weights of such portions accordingly, the weights being used in the computing of the compression rates.

10. A computer-implemented method comprising: receiving from video cameras respective video streams of a subject being captured by the video cameras, the video streams comprising a first video stream and a second video stream, the first video stream comprised of first frames captured by a first of the video cameras, the second video stream comprised of second frames captured by a second of the video cameras, the first video stream not comprised of the second frames, and the second video stream not comprised of the first video frames; receiving an indication of a viewpoint from a remote terminal, the viewpoint corresponding to a position or direction of a user at the remote terminal relative to the remote terminal or a display thereof; computing a first weight for the first video stream, and computing a second weight for the second video stream, wherein the computing the first and second weights is based on: (i) a direction or position corresponding to the first video camera, (ii) a direction or position corresponding to the second video camera, and (iii) a position or direction of the viewpoint relative to the directions and positions of the first and second video cameras, wherein the position or direction of the viewpoint corresponds to a position or direction within view areas of the first and second video cameras, respectively; compressing the first video stream according to the first weight, and compressing the second video stream according to the second weight; and transmitting the compressed first video stream and the compressed second video stream via a network to the remote device, wherein the remote device synthesizes the first and second video streams into a synthetic video stream displayed by the remote device.

11. A computer-implemented method according to claim 10, wherein the viewpoint corresponds to one or more of: a past position of a remote viewer at the remote device, a layout of a room where the multi-view video is being displayed, or a latency value comprising a network latency.

12. A computer-implemented method according to claim 10, wherein the computing further computes: first portion-specific compression rates for respective individual portions of a first video frame of the first video stream, and second portion-specific compression rates for respective individual portions of a second video frame of the second video stream.

13. A computer-implemented method according to claim 10 wherein the modelling further comprises forming a three-dimensional model of the scene relative to the viewpoints and relative to the positions and directions of the cameras, and wherein the first compression rate and the second compression rate are computed based at least in part on the three-dimensional model.

14. A computer-implemented method according to claim 10, further comprising identifying occlusions in the video streams and computing the compression rates according to the occlusions.

15. A computer-implemented method according to claim 10, wherein the viewpoint is computed by estimating a future position of the user.

16. A computer-implemented method according to claim 10, wherein the computing the first weight is further based on a model of the subject, and wherein computing the second weight is further based on the model of the subject.

17. A computer-implemented method according to claim 16, further comprising computing the first weight and the second weight by finding an intersection of the model of the subject with a first ray projected from a point positioned according to the position of the viewpoint, computing the first weight by projecting a second ray from the intersection to a second point positioned according the position or direction corresponding to the first video camera, and computing the second weight by projecting a third ray from the intersection to a third point positioned according to the position or direction corresponding to the second video camera.

18. A computer-implemented method according to claim 17, wherein the
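The ray construction recited in claim 17 can be sketched in a few lines. This is an illustration under stated assumptions, not the patented method: the subject model is reduced to a flat plane at `z = plane_z`, and each camera's weight is taken as the cosine of the angle between the ray back to the viewpoint and the ray to that camera, normalized so the weights sum to 1. The claim text does not say how a weight is derived from the rays, so the cosine rule and all function names (`intersect_plane`, `camera_weights`) are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _unit(v):
    n = math.sqrt(_dot(v, v))
    return tuple(x / n for x in v)


def intersect_plane(origin, direction, plane_z=0.0):
    """Intersect a ray with the plane z = plane_z (a stand-in subject model).

    Returns None if the ray is parallel to the plane or points away from it.
    """
    if abs(direction[2]) < 1e-12:
        return None
    t = (plane_z - origin[2]) / direction[2]
    if t <= 0:
        return None
    return tuple(o + t * d for o, d in zip(origin, direction))


def camera_weights(viewpoint, view_dir, cam_positions, plane_z=0.0):
    """Weight each camera by how closely it sees the hit point from the
    viewer's direction (assumed cosine rule, normalized to sum to 1)."""
    equal = [1.0 / len(cam_positions)] * len(cam_positions)
    hit = intersect_plane(viewpoint, view_dir, plane_z)  # first ray
    if hit is None:
        return equal
    to_view = _unit(_sub(viewpoint, hit))
    # Second, third, ... rays: from the intersection to each camera.
    raw = [max(0.0, _dot(to_view, _unit(_sub(c, hit)))) for c in cam_positions]
    total = sum(raw)
    if total == 0:
        return equal
    return [r / total for r in raw]


# Viewer directly in front of the first camera, looking at the plane.
weights = camera_weights((0, 0, 2), (0, 0, -1), [(0, 0, 2), (2, 0, 2)])
```

In this example the first camera, which sits on the viewer's line of sight, receives the larger weight; a viewer midway between the two cameras would produce equal weights. The weights could then drive per-stream compression as in claim 10.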
specially adapted for multi-view video sequence encoding · CPC title
the unit being a scalable video layer · CPC title
using feature points or meshes · CPC title
Selection of the code volume for a coding unit prior to coding · CPC title