Audio data synthesis terminal, audio data recording terminal, audio data synthesis method, audio output method, and program
US-2015112686-A1 · Apr 23, 2015 · US
US10356393B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10356393-B1 |
| Application number | US-201514623417-A |
| Country | US |
| Kind code | B1 |
| Filing date | Feb 16, 2015 |
| Priority date | Feb 16, 2015 |
| Publication date | Jul 16, 2019 |
| Grant date | Jul 16, 2019 |
Official abstract text for this publication.
Image and audio data can be captured over a period of time. The image data can be captured by a plurality of cameras positioned to capture images that sufficiently represent an environment (e.g., a movie set, scene, or office setting). The audio data can be captured over the period of time by a plurality of microphones spatially arranged throughout the environment. The images can be stitched or otherwise combined to generate a three-dimensional representation of the environment and objects in the environment (e.g., people or furniture), where the three-dimensional representation reflects changes (e.g., object movement or changes in lighting) that occurred in the environment over the period. For each period of time, audio data can be mapped to a corresponding region of the environment. Information representing a virtual environment of the three-dimensional representation of the environment can be encoded for device playback and stored.
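The mapping of audio to regions described in the abstract can be illustrated with a minimal sketch. It is not the patented method. It assumes each microphone is assigned to the region whose centre is nearest to it, and that audio is cut into fixed-length time windows. The names `Microphone`, `nearest_region`, `map_audio_to_regions`, and the region-centre dictionary are hypothetical and do not appear in the publication.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Microphone:
    # Hypothetical record: an identifier plus a position in the environment's frame.
    mic_id: str
    position: tuple  # (x, y, z) in metres


def nearest_region(position, region_centers):
    """Return the name of the region whose centre is closest to `position`."""
    return min(region_centers, key=lambda name: dist(position, region_centers[name]))


def map_audio_to_regions(mics, region_centers, samples, sample_rate, window_s):
    """Group each microphone's samples by (time-window index, region name).

    samples: dict mapping mic_id to a list of samples, all starting at t = 0.
    Returns a dict mapping (window_index, region_name) to a list of
    (mic_id, chunk) pairs. Nearest-centre assignment is an assumption made
    for illustration only.
    """
    window_len = int(sample_rate * window_s)
    if window_len <= 0:
        raise ValueError("window must cover at least one sample")
    mapping = {}
    for mic in mics:
        region = nearest_region(mic.position, region_centers)
        track = samples[mic.mic_id]
        for start in range(0, len(track), window_len):
            key = (start // window_len, region)
            chunk = track[start:start + window_len]
            mapping.setdefault(key, []).append((mic.mic_id, chunk))
    return mapping
```

With two microphones, two regions, and one-second windows, each window of each microphone's track lands under the region nearest that microphone. A real system would likely weight several microphones per region rather than use a single nearest one.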
Opening claim text (preview).
What is claimed is:
1. A computing system, comprising: a processor; and a memory having computer-executable instructions that, when executed by the processor, cause the processor to: receive image data captured over a period of time, the image data being captured by a plurality of cameras positioned to capture images that sufficiently represent an environment; receive second image data captured over the period of time, the second image data being captured by an image sensor, including depth data and low resolution image data; receive audio data captured over the period of time by a plurality of microphones spatially arranged throughout the environment; stitch the images to generate a three-dimensional representation of the environment, the three-dimensional representation reflecting changes that occurred in the environment over the period of time; map, based at least in part on respective locations of the plurality of microphones and for a number of time periods during the period of time, portions of the audio data to a region of the environment; and provide access to a virtual representation of the environment by a user, wherein the user can navigate the virtual representation of the environment from a particular view and region at a particular time, and wherein respective portions of the audio data are presented based at least in part on the particular view, a relative position of the user with respect to the particular view corresponding to a view direction, and the region at the particular time in the virtual representation of the environment.
2. The computing system of claim 1, wherein the instructions when executed further enable the computing system to: receive a request indicating a first location and a first view direction in the virtual representation environment; determine a first view to provide to the user based at least in part the first location and the first view direction; determine audio to be provided based at least in part on the first view; and provide information indicative of the first view and the audio to the user.
3. The computing system of claim 2, wherein the instructions when executed further enable the computing system to: determine a change from the first location to a second location; and update the first view and the audio data based at least in part on the second location.
4. The computing system of claim 3, wherein the instructions when executed further cause the computing system to: emit at a designated period an infrared pulse and audio pulse; detect the infrared pulse and the audio pulse; synchronize image capture and audio capture.
5. A computing system, comprising: a processor; and a memory having computer-executable instructions that, when executed by the processor, cause the processor to: receive image data captured over a period of time, the image data being captured by a plurality of cameras positioned to capture images that sufficiently represent an environment; receive second image data captured over the period of time, the second image data being captured by an image sensor, including depth data and low resolution image data; receive audio data captured over the period of time by a plurality of microphones spatially arranged throughout the environment; generate a three-dimensional representation of the environment; map, based at least in part on respective locations of the plurality of microphones and for a number of time periods during the period of time, portions of the audio data to a region of the environment; receive a view direction and position; and provide a virtual representation of the environment as represented over the period of time based at least in part on the view direction and the position, wherein respective portions of the audio are presented based at least in part on the view direction, a relative position of the user with respect to the particular view corresponding to a viewing direction, and particular region for a particular time in the virtual representation of the environment.
6. The computing system of claim 5, wherein the instructions, when executed further enable the computing system to: decompose the image data into a plurality of images, each image of the plurality of images being annotated with timing data and positional data; generate a first set of images that are spatially organized based at least in part on the positional data; generate a second set of images that are temporally organized based at least in part on the timing data; and stitch the first set of images and the second set of images to generate a three-dimensional representation of the environment, the three-dimensional representation reflecting changes that occurred in the environment over the period of time.
7. The computing system of claim 5, wherein the instructions when executed to generate the three-dimensional representation further enable the computing system to: determine a representation of an object in the image data; perform motion tracking on the object; and generate motion data and positional data for the object.
8. The computing system of claim 7, wherein the instructions when executed further enable the computing system to: determine a texture map of the representation of the object.
9. The computing system of claim 8, wherein the instructions when executed further enable the computing system to: generate a three-dimensional model and textures for the environment and the object based at least in part on the motion data, positional data, and texture map.
10. The computing system of claim 5, wherein the instructions when executed further enable the computing system to: receive a request indicating a first location and a first view direction in the virtual representation of the environment; determine a first view to provide to a user of a computing device based at least in part the first location and the first view direction; determine first audio to be provided based at least in part on the first view; and provide information indicative of the first view and the audio data to the user.
11. The computing system of claim 10, wherein the instructions when executed further enable the computing system to: determine a change from the first location to a second location; determine a second view based at least in part on the change; determine a second audio based at least in part on the second view; and provide a second representation of the virtual representation of the environment based at least in part on the second view and the second audio.
12. The computing system of claim 5, wherein the instructions when executed further enable the computing system to: emit at a designated period an infrared pulse; detect the infrared pulse; and synchronize image capture for the plurality of cameras.
13. The computing system of claim 5, wherein the instructions when executed, further enable the computing system to: emit at a designated period an audio notification; detect the audio notification; and synchronize audio captured by a plurality of microphones spatially arranged throughout the environment.
14. The computing system of claim 5, wherein a user controls a viewpoint of the virtual representation of the environment based at least on one of movement of a computing device displaying the virtual representation of the environment, voice commands, interaction with virtual controls displayed on a display screen of the computing device, interaction with physical controls coupled to the computing device, interaction with physical controls remote the computing device, or interaction with virtual controls remot
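The claims recite that portions of the audio are presented based on the view direction and the user's position relative to a region. One way this could look is sketched below, purely as an assumption: each region gets a gain that falls off with distance and is reduced when the region lies behind the listener. The function `region_gains`, the `rear_gain` parameter, and the `1 / (1 + distance)` attenuation are hypothetical choices for illustration and are not taken from the publication.

```python
import math


def region_gains(listener_pos, view_dir, region_centers, rear_gain=0.3):
    """Return a normalised playback gain per region.

    Gain is a distance attenuation multiplied by a directional weight that
    runs from `rear_gain` (region directly behind the listener) to 1.0
    (region straight ahead). Both terms are illustrative assumptions.
    """
    norm = math.hypot(*view_dir)
    if norm == 0:
        raise ValueError("view direction must be non-zero")
    forward = [c / norm for c in view_dir]

    raw = {}
    for name, center in region_centers.items():
        offset = [c - p for c, p in zip(center, listener_pos)]
        distance = math.hypot(*offset)
        if distance == 0:
            cos_angle = 1.0  # listener stands at the region centre
        else:
            cos_angle = sum(f * o for f, o in zip(forward, offset)) / distance
        directional = rear_gain + (1.0 - rear_gain) * (cos_angle + 1.0) / 2.0
        raw[name] = directional / (1.0 + distance)

    total = sum(raw.values())
    if total == 0:
        return raw  # no regions, or every gain is zero: nothing to normalise
    return {name: gain / total for name, gain in raw.items()}
```

Calling `region_gains` again whenever the location or view direction changes corresponds loosely to the view and audio update recited in claims 3 and 11. A production renderer would more likely use a spatial audio technique such as head-related transfer functions than a simple gain table.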
Control of parameters via user interfaces · CPC title
for achieving an enlarged field of view, e.g. panoramic image capture · CPC title
for combining the signals of two or more microphones (specially adapted for hearing aids H04R25/407) · CPC title
Electronic adaptation of stereophonic sound system to listener position or orientation (H04S7/301 takes precedence) · CPC title
Positioning of individual sound objects, e.g. moving airplane, within a sound field (H04S2420/13 takes precedence) · CPC title