Multi-view video compression and streaming based on viewpoints of remote viewer

US9648346B2 · US · B2

Patent metadata
Publication number: US-9648346-B2
Application number: US-49177509-A
Country: US
Kind code: B2
Filing date: Jun 25, 2009
Priority date: Jun 25, 2009
Publication date: May 9, 2017
Grant date: May 9, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Multi-view video that is being streamed to a remote device in real time may be encoded. Frames of a real-world scene captured by respective video cameras are received for compression. A virtual viewpoint, positioned relative to the video cameras, is used to determine expected contributions of individual portions of the frames to a synthesized image of the scene from the viewpoint position using the frames. For each frame, compression rates for individual blocks of a frame are computed based on the determined contributions of the individual portions of the frame. The frames are compressed by compressing the blocks of the frames according to their respective determined compression rates. The frames are transmitted in compressed form via a network to a remote device, which is configured to render the scene using the compressed frames.
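The per-block step described in the abstract can be illustrated with a minimal sketch. It assumes the expected contributions are already available as a 2D grid of per-pixel weights in [0, 1], and it expresses the "compression rate" as an H.264-style quantization parameter (QP); the function name `block_qps`, the block size, and the QP range are all hypothetical choices for illustration and are not taken from the patent.

```python
def block_qps(weights, block=2, qp_min=20, qp_max=44):
    """Map per-pixel contribution weights to one QP per block.

    weights: list of rows, each a list of floats in [0, 1] (assumed input).
    Blocks that contribute more to the synthesized view get a lower QP
    (finer quantization); blocks that contribute little get a higher QP.
    Trailing rows/columns that do not fill a whole block are ignored.
    """
    rows = len(weights) // block
    cols = len(weights[0]) // block
    qps = []
    for r in range(rows):
        row_qps = []
        for c in range(cols):
            # Collect the weights of all pixels inside this block.
            cells = [
                weights[r * block + i][c * block + j]
                for i in range(block)
                for j in range(block)
            ]
            mean = sum(cells) / len(cells)
            mean = min(1.0, max(0.0, mean))  # clamp to [0, 1]
            # Linear mapping: weight 1 -> qp_min, weight 0 -> qp_max.
            row_qps.append(round(qp_max - mean * (qp_max - qp_min)))
        qps.append(row_qps)
    return qps


# A 4x4 weight map: the top-left block matters most, the bottom row least.
example = [
    [1.0, 1.0, 0.5, 0.5],
    [1.0, 1.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
qp_grid = block_qps(example)
```

The linear weight-to-QP mapping is only one plausible choice; the abstract states that rates are computed from the contributions but does not fix a formula.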

First claim

Opening claim text (preview).

The invention claimed is:

1. Computer-readable storage hardware storing information to enable one or more devices that process and transmit multi-view video to perform a method, the method comprising: receiving video streams of a real world scene, the video streams concurrently captured by respective cameras comprising a first camera and a second camera, the video streams comprising a first video stream captured by the first camera and a second video stream captured by the second camera; receiving different viewpoints corresponding to positions of viewing a rendering of the multi-view video at respective times; computing first compression rates for the first video stream and second compression rates for the second video stream by modelling in at least two dimensions the viewpoints relative to the positions and directions of the first and second cameras and modelling the positions and directions of the first and second cameras relative to each other, the first compression rates corresponding to the viewpoints, respectively, and the second compression rates corresponding to the viewpoints, respectively, wherein the viewpoints correspond to positions of a remote viewer; compressing the first video stream according to the first compression rates, and compressing the second video stream according to the second compression rates, respectively, wherein the multi-view video comprises the compressed first video stream and the compressed second video stream; and transmitting the compressed first video stream and the compressed second video stream, which comprise the multi-view video, via a network to a remote terminal that receives the video streams, synthesizes the video streams into a synthetic video stream, and displays the synthetic video, wherein the displayed synthetic video stream comprises the rendering of the multi-view video.

2. Computer-readable storage hardware according to claim 1, wherein the viewpoints are computed based on averages of positions of the remote viewer relative to a display of the remote terminal.

3. Computer-readable storage hardware according to claim 1, wherein the positions of the remote viewer comprise predicted positions of the remote viewer.

4. Computer-readable storage hardware according to claim 3, wherein the predicted positions are based on indicia of past positions of the remote viewer.

5. Computer-readable storage hardware according to claim 1, wherein the viewpoints are based on average viewer positions near a display of the remote terminal that is displaying the synthetic video stream.

6. Computer-readable storage hardware according to claim 1, wherein the viewpoints are based on geometry of a physical area where the multi-view video is being displayed.

7. Computer-readable storage hardware according to claim 1, wherein the viewpoints are computed based on received information about sensed locations of the remote viewer at the remote terminal and based on information about time that will pass before displaying the synthetic video stream by the remote terminal.

8. Computer-readable storage hardware according to claim 7, wherein the information about the locations of the remote viewer includes information indicating movement of the remote viewer.

9. Computer-readable storage hardware according to claim 1, wherein the method further comprises computing a mesh from a depth map of the real world scene, using the depth map to identify potential occlusions, and computing weights of such portions accordingly, the weights being used in the computing of the compression rates.

10. A computer-implemented method comprising: receiving from video cameras respective video streams of a subject being captured by the video cameras, the video streams comprising a first video stream and a second video stream, the first video stream comprised of first frames captured by a first of the video cameras, the second video stream comprised of second frames captured by a second of the video cameras, the first video stream not comprised of the second frames, and the second video stream not comprised of the first video frames; receiving an indication of a viewpoint from a remote terminal, the viewpoint corresponding to a position or direction of a user at the remote terminal relative to the remote terminal or a display thereof; computing a first weight for the first video stream, and computing a second weight for the second video stream, wherein the computing the first and second weights is based on: (i) a direction or position corresponding to the first video camera, (ii) a direction or position corresponding to the second video camera, and (iii) a position or direction of the viewpoint relative to the directions and positions of the first and second video cameras, wherein the position or direction of the viewpoint corresponds to a position or direction within view areas of the first and second video cameras, respectively; compressing the first video stream according to the first weight, and compressing the second video stream according to the second weight; and transmitting the compressed first video stream and the compressed second video stream via a network to the remote device, wherein the remote device synthesizes the first and second video streams into a synthetic video stream displayed by the remote device.

11. A computer-implemented method according to claim 10, wherein the viewpoint corresponds to one or more of: a past position of a remote viewer at the remote device, a layout of a room where the multi-view video is being displayed, or a latency value comprising a network latency.

12. A computer-implemented method according to claim 10, wherein the computing further computes: first portion-specific compression rates for respective individual portions of a first video frame of the first video stream, and second portion-specific compression rates for respective individual portions of a second video frame of the second video stream.

13. A computer-implemented method according to claim 10 wherein the modelling further comprises forming a three-dimensional model of the scene relative to the viewpoints and relative to the positions and directions of the cameras, and wherein the first compression rate and the second compression rate are computed based at least in part on the three-dimensional model.

14. A computer-implemented method according to claim 10, further comprising identifying occlusions in the video streams and computing the compression rates according to the occlusions.

15. A computer-implemented method according to claim 10, wherein the viewpoint is computed by estimating a future position of the user.

16. A computer-implemented method according to claim 10, wherein the computing the first weight is further based on a model of the subject, and wherein computing the second weight is further based on the model of the subject.

17. A computer-implemented method according to claim 16, further comprising computing the first weight and the second weight by finding an intersection of the model of the subject with a first ray projected from a point positioned according to the position of the viewpoint, computing the first weight by projecting a second ray from the intersection to a second point positioned according the position or direction corresponding to the first video camera, and computing the second weight by projecting a third ray from the intersection to a third point positioned according to the position or direction corresponding to the second video camera.

18. A computer-implemented method according to claim 17, wherein the
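The ray construction recited in claim 17 can be sketched as follows. This is an illustration under stated assumptions, not the patented computation: the subject model is simplified to a plane at `z = plane_z`, and each camera's weight is taken as the cosine of the angle between the ray back to the viewpoint and the ray to that camera, normalized to sum to 1. The claim text shown here does not say how the rays are turned into weights, so the cosine rule and every name below (`intersect_plane`, `camera_weights`) are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _unit(v):
    n = math.sqrt(_dot(v, v))
    return v if n == 0 else tuple(x / n for x in v)


def intersect_plane(origin, direction, plane_z):
    """First ray: from the viewpoint into the scene, hit the plane z = plane_z.

    Returns the intersection point, or None if the ray is parallel to the
    plane or the plane lies behind the ray origin.
    """
    if direction[2] == 0:
        return None
    t = (plane_z - origin[2]) / direction[2]
    if t <= 0:
        return None
    return tuple(o + t * d for o, d in zip(origin, direction))


def camera_weights(viewpoint, view_dir, cameras, plane_z):
    """Weight each camera by how closely it sees the hit point from the
    same direction as the viewpoint does (assumed cosine rule)."""
    hit = intersect_plane(viewpoint, view_dir, plane_z)
    if hit is None:
        return [0.0] * len(cameras)
    to_view = _unit(_sub(viewpoint, hit))
    raw = []
    for cam in cameras:
        # Second/third rays: from the intersection to each camera position.
        to_cam = _unit(_sub(cam, hit))
        raw.append(max(0.0, _dot(to_view, to_cam)))
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else [0.0] * len(cameras)


# Viewer aligned with the first camera; the second camera is off to the side.
w = camera_weights((0, 0, 0), (0, 0, 1), [(0, 0, 0), (2, 0, 0)], plane_z=2)
```

In this example the first camera receives the larger weight because its ray to the intersection coincides with the viewer's ray. A stream with a higher weight would then be compressed less aggressively, in line with claim 10.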

Assignees

Inventors

Classifications

  • specially adapted for multi-view video sequence encoding · CPC title

  • the unit being a scalable video layer · CPC title

  • H04N19/54 · Primary

    using feature points or meshes · CPC title

  • Selection of the code volume for a coding unit prior to coding · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9648346B2 cover?
Multi-view video that is being streamed to a remote device in real time may be encoded. Frames of a real-world scene captured by respective video cameras are received for compression. A virtual viewpoint, positioned relative to the video cameras, is used to determine expected contributions of individual portions of the frames to a synthesized image of the scene from the viewpoint position using…
Who is the assignee on this patent?
Zhang Cha, Florencio Dinei A, Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification H04N19/54. Mapped technology areas include Electricity.
When was this patent published?
Publication date: May 9, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).