US9723223B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9723223-B1 |
| Application number | US-201314040435-A |
| Country | US |
| Kind code | B1 |
| Filing date | Sep 27, 2013 |
| Priority date | Dec 2, 2011 |
| Publication date | Aug 1, 2017 |
| Grant date | Aug 1, 2017 |
Official abstract text for this publication.
A server includes an input node to receive video streams forming a panoramic video. The server also receives audio tracks corresponding to the video streams. A module forms an audio track based upon a combination of at least two of the audio tracks and directional viewing data. The audio track may be a stereo, mixed or surround sound audio track with volume modulation based upon the directional viewing data. An output node sends the audio track to a client device.
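The abstract's idea of combining audio tracks with volume modulation driven by the directional viewing data can be illustrated with a minimal sketch. The patent text shown here does not specify a gain law, so the raised-cosine gain below, along with the names `angular_offset`, `direction_gain`, and `mix_tracks`, are assumptions made purely for illustration.

```python
import math

def angular_offset(a_deg, b_deg):
    """Smallest absolute angle between two directions, in degrees (0..180)."""
    d = (a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def direction_gain(view_deg, mic_deg):
    """Hypothetical gain law (an assumption, not from the patent):
    1.0 when the view faces the microphone, 0.0 when it faces away."""
    offset = math.radians(angular_offset(view_deg, mic_deg))
    return 0.5 * (1.0 + math.cos(offset))

def mix_tracks(tracks, mic_dirs_deg, view_deg):
    """Combine several audio tracks into one, weighting each track by how
    closely its microphone direction matches the view direction."""
    gains = [direction_gain(view_deg, d) for d in mic_dirs_deg]
    total = sum(gains) or 1.0          # avoid dividing by zero
    gains = [g / total for g in gains]  # normalize so the gains sum to 1
    n = min(len(t) for t in tracks)     # mix only the overlapping samples
    return [sum(g * t[i] for g, t in zip(gains, tracks)) for i in range(n)]

# Two microphones facing 0 and 180 degrees; the viewer looks toward 0.
mixed = mix_tracks([[1.0, 1.0], [0.0, 0.0]], [0.0, 180.0], view_deg=0.0)
```

With this gain law, a viewer facing the first microphone hears only that track, and the second track fades in as the view direction rotates toward it.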
Opening claim text (preview).
The invention claimed is:
1. A server, programmed to: receive first video data comprising a first frame captured by a first camera; receive second video data comprising a second frame captured by a second camera; receive a first audio track including audio data captured by a first microphone sensitive to sound received from a first microphone direction; receive a second audio track including audio data captured by a second microphone sensitive to sound received from a second microphone direction different than the first microphone direction; generate a first panoramic frame based at least in part on the first frame and the second frame; receive first view direction data describing a first view direction offset from the first microphone direction and offset from the second microphone direction; form a first stereo audio track based at least in part upon the first view direction, wherein the first stereo audio track comprises: a first left channel track comprising a first weighted combination of the first audio track and the second audio track, wherein the first weighted combination is generated by applying a first weight to the first audio track based at least in part on the offset of the first view direction from the first microphone direction and applying a second weight to the second audio track based at least in part on the offset of the first view direction from the second microphone direction; and a first right channel track comprising a second weighted combination of the first audio track and the second audio track, wherein the second weighted combination is generated by applying a third weight to the first audio track based at least in part on the offset of the first view direction from the first microphone direction and applying a fourth weight to the second audio track based at least in part on the offset of the first view direction from the second microphone direction; send the first stereo audio track to a client device; receive second view direction data describing a second view direction different than the first view direction; form a second stereo audio track based at least in part upon the second view direction, wherein the second stereo audio track comprises: a second left channel track comprising the second weighted combination of the first audio track and the second audio track; and a second right channel track comprising the first weighted combination of the first audio track and the second audio track; and send the second stereo audio track to the client device.
2. The server of claim 1 wherein the first stereo audio track is a surround sound audio track.
3. The server of claim 2 wherein the surround sound audio track is modulated based upon the first view direction.
4. The server of claim 1 wherein a volume of the first stereo audio track is modulated based upon an item of interest in a video file.
5. The server of claim 1 wherein the server is further programmed to send to the client device a portion of the first panoramic frame corresponding to a field of view of a user, and wherein the first stereo audio track is modulated to include an aural clue corresponding to an event of potential interest outside of the field of view of the user.
6. The server of claim 1 wherein the second view direction is offset from the first view direction by about one hundred and eighty degrees.
7. A computer-implemented method of generating a panoramic video, the method comprising: receiving, by a server, first video data comprising a first frame captured by a first camera; receiving, by the server, second video data comprising a second frame captured by a second camera; receiving, by the server, a first audio track including audio data captured by a first microphone sensitive to sound received from a first microphone direction; receiving, by the server, a second audio track including audio data captured by a second microphone sensitive to sound received from a second microphone direction different than the first microphone direction; generating, by the server, a first panoramic frame based at least in part on the first frame and the second frame; receiving, by the server, first view direction data describing a first view direction offset from the first microphone direction and offset from the second microphone direction; forming, by the server, a first stereo audio track based at least in part upon the first view direction, wherein the first stereo audio track comprises: a first left channel track comprising a first weighted combination of the first audio track and the second audio track, wherein the first weighted combination is generated by applying a first weight to the first audio track based at least in part on the offset of the first view direction from the first microphone direction and applying a second weight to the second audio track based at least in part on the offset of the first view direction from the second microphone direction; and a first right channel track comprising a second weighted combination of the first audio track and the second audio track, wherein the second weighted combination is generated by applying a third weight to the first audio track based at least in part on the offset of the first view direction from the first microphone direction and applying a fourth weight to the second audio track based at least in part on the offset of the first view direction from the second microphone direction; sending, by the server, the first stereo audio track to a client device; receiving, by the server, second view direction data describing a second view direction different than the first view direction; forming, by the server, a second stereo audio track based at least in part on the second view direction, wherein the second stereo audio track comprises: a second left channel track comprising the second weighted combination of the first audio track and the second audio track; and a second right channel track comprising the first weighted combination of the first audio track and the second audio track; and sending, by the server, the second stereo audio track to the client device.
8. The method of claim 7 wherein the second view direction is offset from the first view direction by about one hundred and eighty degrees.
9. The method of claim 7 wherein the first stereo audio track is a surround sound audio track.
10. The method of claim 9 wherein the surround sound audio track is modulated based upon the first view direction.
11. The method of claim 7 wherein a volume of the first stereo audio track is modulated based upon an item of interest in a video file.
12. The method of claim 7 wherein the server is further programmed to send to the client device a portion of the first panoramic frame corresponding to a field of view of a user, and wherein the first stereo audio track is modulated to include an aural clue corresponding to an event of potential interest outside of a field of view of the user.
13. A server comprising a non-transitory computer readable storage medium including computer code for performing a method comprising: receiving first video data comprising a first frame captured by a first camera; receiving second video data comprising a second frame captured by a second camera; receiving a first audio track including audio data captured by a first microphone sensitive to sound received from a first microphone direction; receiving a second audio track including audio data captured by a second microphone sensitive to sound received from a second microphone direction different than the first microphone direction; generating a first panoramic frame based at least in part on the
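Claims 1 and 7 recite left and right channels built as weighted combinations of the two audio tracks, and claims 6 and 8 cover a second view direction about 180 degrees away, where the two combinations trade channels. The sketch below is one way this could look, not the patented implementation. It assumes angles increase clockwise when seen from above, places hypothetical "ear" directions 90 degrees to either side of the view direction, and uses a raised-cosine weight. None of these choices comes from the claim text, and the names `stereo_weights` and `form_stereo` are illustrative only.

```python
import math

def _gain(ear_deg, mic_deg):
    """Hypothetical raised-cosine weight: 1.0 when the ear direction
    matches the microphone direction, 0.0 when it is opposite."""
    return 0.5 * (1.0 + math.cos(math.radians(ear_deg - mic_deg)))

def stereo_weights(view_deg, mic1_deg, mic2_deg):
    """Return ((w1, w2), (w3, w4)): the left-channel and right-channel
    weights for tracks 1 and 2. Each weight depends only on the offset
    between the view direction and a microphone direction."""
    left_ear = view_deg - 90.0    # assumption: left is counterclockwise
    right_ear = view_deg + 90.0
    left = (_gain(left_ear, mic1_deg), _gain(left_ear, mic2_deg))
    right = (_gain(right_ear, mic1_deg), _gain(right_ear, mic2_deg))
    return left, right

def form_stereo(track1, track2, view_deg, mic1_deg, mic2_deg):
    """Build (left, right) sample lists as weighted combinations of the
    two input tracks for the given view direction."""
    (w1, w2), (w3, w4) = stereo_weights(view_deg, mic1_deg, mic2_deg)
    left = [w1 * a + w2 * b for a, b in zip(track1, track2)]
    right = [w3 * a + w4 * b for a, b in zip(track1, track2)]
    return left, right

# Microphones at 0 and 120 degrees; the viewer looks toward 30, then 210.
t1, t2 = [1.0, 0.5], [0.2, -0.4]
left_a, right_a = form_stereo(t1, t2, 30.0, 0.0, 120.0)
left_b, right_b = form_stereo(t1, t2, 210.0, 0.0, 120.0)
```

Under these assumptions, rotating the view by 180 degrees makes the new left channel equal to the old right channel and vice versa (up to floating-point rounding), which mirrors the swap of the first and second weighted combinations recited for the second stereo audio track.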
for achieving an enlarged field of view, e.g. panoramic image capture · CPC title
Electronic editing of digitised analogue information signals, e.g. audio or video signals · CPC title
Mixing · CPC title
involving special audio data, e.g. different tracks for different languages · CPC title
Video hosting of uploaded data from client · CPC title