Spatialized audio output based on predicted position data
US-2017295446-A1 · Oct 12, 2017 · US
US9973874B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9973874-B2 |
| Application number | US-201715625927-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 16, 2017 |
| Priority date | Jun 17, 2016 |
| Publication date | May 15, 2018 |
| Grant date | May 15, 2018 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
The methods and apparatus described herein optimally represent full 3D audio mixes (e.g., azimuth, elevation, and depth) as “sound scenes” in which the decoding process facilitates head tracking. Sound scene rendering can be performed for the listener's orientation (e.g., yaw, pitch, roll) and 3D position (e.g., x, y, z), and can be modified for a change in the listener's orientation or 3D position. As described below, the ability to render an audio object in both the near-field and far-field enables the ability to fully render depth of not just objects, but any spatial audio mix decoded with active steering/panning, such as Ambisonics, matrix encoding, etc., thereby enabling full translational head tracking (e.g., user movement) beyond simple rotation in the horizontal plane, or 6-degrees-of-freedom (6-DOF) tracking and rendering.
Opening claim text (preview).
What is claimed is: 1. A six-degrees-of-freedom sound source tracking method comprising: receiving a spatial audio signal, the spatial audio signal representing at least one sound source, the spatial audio signal including a reference orientation; receiving a 3-D motion input, the 3-D motion input representing a physical movement of a listener with respect to the at least one spatial audio signal reference orientation; generating a spatial analysis output based on the spatial audio signal; generating a signal forming output based on the spatial audio signal and the spatial analysis output; generating an active steering output based on the signal forming output, the spatial analysis output, and the 3-D motion input, the active steering output representing an updated apparent direction and distance of the at least one sound source caused by the physical movement of the listener with respect to the spatial audio signal reference orientation; and transducing an audio output signal based on the active steering output. 2. The method of claim 1 , wherein the physical movement of a listener includes at least one of a rotation and a translation. 3. The method of claim 2 , wherein receiving the 3-D motion input includes receiving the 3-D motion input from at least one of a head tracking device and a user input device. 4. The method of claim 1 , further including generating a plurality of quantized channels based on the active steering output, each of the plurality of quantized channels corresponding to a predetermined quantized depth. 5. The method of claim 1 , wherein the motion input includes a head-tracker motion. 6. The method of claim 1 , wherein the spatial audio signal includes the at least one Ambisonic soundfield. 7. The method of claim 6 , wherein: applying the spatial soundfield decoding includes analyzing the at least one Ambisonic soundfield based on a time-frequency soundfield analysis; and wherein the updated apparent direction of the at least one sound source is based on the time-frequency soundfield analysis. 8. The method of claim 7 , wherein applying the spatial soundfield decoding preserves height information. 9. A six-degrees-of-freedom sound source tracking system comprising: a processor configured to: receive a spatial audio signal, the spatial audio signal representing at least one sound source, the spatial audio signal including a reference orientation; receive a 3-D motion input from a motion input device, the 3-D motion input representing a physical movement of a listener with respect to the at least one spatial audio signal reference orientation; generate a spatial analysis output based on the spatial audio signal; generate a signal forming output based on the spatial audio signal and the spatial analysis output; and generate an active steering output based on the signal forming output, the spatial analysis output, and the 3-D motion input, the active steering output representing an updated apparent direction and distance of the at least one sound source caused by the physical movement of the listener with respect to the spatial audio signal reference orientation; and a transducer to transduce the audio output signal into an audible binaural output based on the active steering output. 10. The system of claim 9 , wherein the physical movement of a listener includes at least one of a rotation and a translation. 11. The system of claim 9 , wherein at least one of the plurality of spatial audio signal subsets includes an Ambisonic soundfield encoded audio signal. 12. The system of claim 11 , wherein the spatial audio signal includes at least one of a first order ambisonic audio signal, a higher order ambisonic audio signal, and a hybrid ambisonic audio signal. 13. The system of claim 11 , wherein the motion input device includes at least one of a head tracking device and a user input device. 14. The system of claim 9 , the processor further configured to generate a plurality of quantized channels based on the active steering output, each of the plurality of quantized channels corresponding to a predetermined quantized depth. 15. The system of claim 14 , wherein the transducer includes a headphone, wherein the processor is further configured to generate a binaural audio signal suitable for headphone reproduction from the plurality of quantized channels. 16. The system of claim 15 , wherein the transducer includes a loudspeaker, wherein the processor is further configured to generate a transaural audio signal suitable for loudspeaker reproduction by applying cross-talk cancellation. 17. The system of claim 9 , wherein the transducer includes a headphone, wherein the processor is further configured to generate a binaural audio signal suitable for headphone reproduction from the formed audio signal and the updated apparent direction. 18. At least one non-transitory machine-readable storage medium, comprising a plurality of instructions that, responsive to being executed with processor circuitry of a computer-controlled six-degrees- of-freedom sound source tracking device, cause the device to: receive a spatial audio signal, the spatial audio signal representing at least one sound source, the spatial audio signal including a reference orientation; receive a 3-D motion input, the 3-D motion input representing a physical movement of a listener with respect to the at least one spatial audio signal reference orientation; generate a spatial analysis output based on the spatial audio signal; generate a signal forming output based on the spatial audio signal and the spatial analysis output; generate an active steering output based on the signal forming output, the spatial analysis output, and the 3-D motion input, the active steering output representing an updated apparent direction and distance of the at least one sound source caused by the physical movement of the listener with respect to the spatial audio signal reference orientation; and transduce an audio output signal based on the active steering output. 19. The non-transitory machine-readable storage medium of claim 18 , wherein the physical movement of a listener includes at least one of a rotation and a translation. 20. The non-transitory machine-readable storage medium of claim 18 , the instructions further causing the device to generate a plurality of quantized channels based on the active steering output, each of the plurality of quantized channels corresponding to a predetermined quantized depth.
Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes · CPC title
Electronic adaptation of stereophonic audio signals to reverberation of the listening space (H04S7/301 takes precedence) · CPC title
Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved · CPC title
in which the audio signals are in digital form, i.e. employing more than two discrete digital channels (data reduction aspects thereof based on psychoacoustics G10L19/02) · CPC title
Application of parametric coding in stereophonic audio systems · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.