Autonomous video conferencing system with virtual director assistance
US-2024414437-A1 · Dec 12, 2024 · US
US9762856B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9762856-B2 |
| Application number | US-201314648057-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 25, 2013 |
| Priority date | Nov 29, 2012 |
| Publication date | Sep 12, 2017 |
| Grant date | Sep 12, 2017 |
Official abstract text for this publication.
A video conferencing server (100) receives and combines video streams captured by cameras of plural video clients (101) and generates immersive video streams (124, 125) for delivery to and play-out by these video clients (101). A cut-out module (102) in the video conferencing server (100) generates a foreground mask (122) for a video frame (121) received from a conferencing client (101). A camera shake detector (103) determines a displacement vector (123) for a subset of features in the video frame (121). The displacement vector (123) represents a two-dimensional motion of the subset of features between a background mask and a previous background mask for a previous video frame received from the same conferencing client (101). A camera shake correcting module (102, 104) applies a displacement opposite to the displacement vector (123) to the foreground mask (122) before use thereof in the immersive video streams (124, 125) for conferencing clients (101), and a signalling unit (104) generates a shake indication (311, 312) into the immersive video stream (124) delivered to the conferencing client (101) whose camera is shaking.
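The correction step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes masks are small binary grids (nested Python lists), that the displacement vector is given in whole or fractional pixels, and that pixels shifted in from outside the frame count as background. The names `invert_mask`, `shift_mask`, and `correct_shake` are hypothetical.

```python
from typing import List, Tuple

Mask = List[List[int]]  # assumed encoding: 1 = foreground pixel, 0 = background pixel


def invert_mask(mask: Mask) -> Mask:
    """Derive a background mask by inverting a foreground mask."""
    return [[1 - v for v in row] for row in mask]


def shift_mask(mask: Mask, dx: int, dy: int) -> Mask:
    """Translate a mask by (dx, dy) pixels; uncovered pixels become 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x - dx, y - dy  # source pixel that lands on (x, y)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = mask[sy][sx]
    return out


def correct_shake(foreground: Mask, displacement: Tuple[float, float]) -> Mask:
    """Apply the displacement opposite to the measured shake vector."""
    dx, dy = displacement
    # Rounding to whole pixels is an assumption of this sketch.
    return shift_mask(foreground, -round(dx), -round(dy))
```

For example, if the detector reports a shake of one pixel to the right, `correct_shake(mask, (1.0, 0.0))` moves the foreground mask one pixel to the left.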
Opening claim text (preview).
The invention claimed is:
1. A video conferencing server for immersive video conferencing, said video conferencing server being adapted to receive and combine video streams captured by cameras of plural video clients and to generate immersive video streams for delivery to and play-out by said plural video clients, wherein said video conferencing server comprises: a cut-out module adapted to generate a foreground mask for a video frame received from a conferencing client; a camera shake detector adapted to determine a displacement vector for a subset of features in said video frame, said displacement vector representing a two-dimensional motion of said subset of features between a background mask obtained by inverting said foreground mask and a previous background mask generated for a previous video frame received from said conferencing client; a camera shake correcting module adapted to apply a displacement opposite to said displacement vector to said foreground mask before use thereof in said immersive video streams to thereby correct camera shake effects of said conferencing client; and a signaling unit adapted to generate a shake indication into an immersive video stream delivered to said conferencing client.
2. A video conferencing server according to claim 1, wherein said camera shake detector comprises: a video stream processor for selecting a set of features in said video frame and said previous video frame; filtering logic for filtering said set of features to obtain a subset of features that belongs to said background mask of said video frame and to said previous background mask of said previous frame; and processing logic for computing a sparse optic flow for said subset of features through the Pyramidal Lukas-Kanade algorithm.
3. A video conferencing server according to claim 2, wherein said camera shake detector further comprises: statistical logic for calculating from said sparse optical flow for said subset of features through statistical averaging a motion magnitude and motion direction that form said displacement vector.
4. A video conferencing server according to claim 2, wherein said camera shake detector further comprises: processing logic configured to compare for each feature in said subset of features the magnitude of said sparse optical flow to a predetermined threshold, and configured to discard said feature from calculating said displacement vector when said magnitude of said sparse optical flow is below said predetermined threshold.
5. A video conferencing server according to claim 2, wherein said camera shake detector further comprises: processing logic configured to assign each feature in said subset of features according to a direction of its sparse optical flow to a first bin out of a first set of n histogram bins each covering a range of 360 degrees/n and to a second bin out of a second set of n histogram bins each covering a range of 360 degrees/n, n being a positive integer value, and said second set of n bins being rotated over 180 degrees/n with respect to said first set of n bins; processing logic configured to select a dominant bin amongst said first set of n bins and said second set of n bins, containing the highest amount of features from said subset of features; and processing logic configured to discard all features that do not belong to said dominant bin from calculating said displacement vector.
6. A method for camera shake detection in a video conferencing server being adapted to receive and combine video streams captured by cameras of plural video clients and to generate immersive video streams for delivery to and play-out by said plural video clients, wherein said method comprises: generating a foreground mask for a video frame received from a conferencing client; determining a displacement vector for a subset of features in said video frame, said displacement vector representing a two-dimensional motion of said subset of features between a background mask obtained by inverting said foreground mask and a previous background mask generated for a previous video frame received from said conferencing client; applying a displacement opposite to said displacement vector to said foreground mask before use thereof in said immersive video streams to thereby correct camera shake effects of said conferencing client; and generating a shake indication into an immersive video stream delivered to said conferencing client.
7. A data processing system comprising means for carrying out the method of claim 6.
8. A computer program comprising software code adapted to perform the method of claim 6.
9. A computer non-transitory computer-readable storage medium comprising the computer program of claim 8.
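The filtering and averaging steps recited in claims 3 to 5 can be sketched as follows. This is an illustrative reading, not the patented implementation. It assumes the sparse optical flow is already available as one `(dx, dy)` vector per background feature. It also assumes "statistical averaging" means a component-wise mean, from which magnitude and direction follow. The function name `estimate_displacement` and the defaults `min_mag=0.5` and `n=8` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Flow = Tuple[float, float]  # (dx, dy) of one tracked background feature


def estimate_displacement(flows: List[Flow],
                          min_mag: float = 0.5,
                          n: int = 8) -> Tuple[float, float]:
    """Estimate a shake displacement vector from per-feature flow vectors."""
    # Claim 4: discard features whose flow magnitude is below the threshold.
    moving = [f for f in flows if math.hypot(f[0], f[1]) >= min_mag]
    if not moving:
        return (0.0, 0.0)

    width = 360.0 / n    # angular range of each histogram bin
    offset = 180.0 / n   # rotation of the second bin set (half a bin)

    # Claim 5: assign each feature to one bin in each of the two bin sets.
    bins: Dict[Tuple[int, int], List[Flow]] = {}
    for f in moving:
        angle = math.degrees(math.atan2(f[1], f[0])) % 360.0
        first = int(angle // width) % n
        second = int(((angle - offset) % 360.0) // width) % n
        bins.setdefault((0, first), []).append(f)
        bins.setdefault((1, second), []).append(f)

    # Keep only the features of the dominant bin across both sets.
    dominant = max(bins.values(), key=len)

    # Claim 3 (assumed reading): average the surviving flow vectors.
    mean_dx = sum(f[0] for f in dominant) / len(dominant)
    mean_dy = sum(f[1] for f in dominant) / len(dominant)
    return (mean_dx, mean_dy)
```

The rotated second bin set matters when the dominant direction sits on a bin boundary of the first set. Flows pointing almost exactly along 0 degrees split between the first and last bins of the first set, but fall together into a single bin of the second set. Motion magnitude and direction can be recovered from the result with `math.hypot` and `math.atan2`.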
based on the image signal · CPC title
performed by a processor, e.g. controlling the readout of an image memory · CPC title
Conference systems · CPC title
Constructional details of the terminal equipment, e.g. arrangements of the camera and the display · CPC title
Electricity · mapped topic