US9622014B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9622014-B2 |
| Application number | US-201314409440-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 17, 2013 |
| Priority date | Jun 19, 2012 |
| Publication date | Apr 11, 2017 |
| Grant date | Apr 11, 2017 |
Official abstract text for this publication.
Embodiments are described for a method and system of rendering and playing back spatial audio content using a channel-based format. Spatial audio content that is played back through legacy channel-based equipment is transformed into the appropriate channel-based format resulting in the loss of certain positional information within the audio objects and positional metadata comprising the spatial audio content. To retain this information for use in spatial audio equipment even after the audio content is rendered as channel-based audio, certain metadata generated by the spatial audio processor is incorporated into the channel-based data. The channel-based audio can then be sent to a channel-based audio decoder or a spatial audio decoder. The spatial audio decoder processes the metadata to recover at least some positional information that was lost during the down-mix operation by upmixing the channel-based audio content back to the spatial audio content for optimal playback in a spatial audio environment.
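The two decoding paths in the abstract can be illustrated with a minimal sketch. It assumes a frame that carries two height channels plus an optional up-mix matrix as side data. The class and function names (`UpmixMetadata`, `ChannelFrame`, `legacy_decode`, `spatial_decode`) are hypothetical and are not taken from the patent; the sketch only shows that a channel-based decoder can ignore the metadata while a spatial decoder can use it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UpmixMetadata:
    """Hypothetical side data carried with the channel-based stream."""
    num_height_out: int        # M: number of height channels after up-mixing
    matrix: List[List[float]]  # M rows x 2 columns of mixing gains


@dataclass
class ChannelFrame:
    """One block of channel-based height audio plus optional metadata."""
    height_left: List[float]
    height_right: List[float]
    metadata: Optional[UpmixMetadata] = None


def legacy_decode(frame: ChannelFrame) -> List[List[float]]:
    # A channel-based decoder plays the two height channels unchanged
    # and never looks at the metadata.
    return [frame.height_left, frame.height_right]


def spatial_decode(frame: ChannelFrame) -> List[List[float]]:
    # A spatial decoder applies the matrix, when present, to derive
    # M height feeds from the two transmitted height channels.
    if frame.metadata is None:
        return legacy_decode(frame)
    feeds = []
    for gain_left, gain_right in frame.metadata.matrix:
        feeds.append([
            gain_left * left + gain_right * right
            for left, right in zip(frame.height_left, frame.height_right)
        ])
    return feeds
```

Both decoders accept the same frame, which mirrors the abstract's statement that the channel-based audio can be sent to either kind of decoder.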
Opening claim text (preview).
What is claimed is:

1. A method of recovering spatial audio information rendered into a channel-based format for playback in a first spatial audio environment, the channel-based format comprising a surround-sound format which includes a plurality of height channels, the first spatial audio environment including a plurality of height speakers and a plurality of additional height speakers, the method comprising: deriving metadata defining positional information of audio elements in a spatial audio processor that generates both channel-based and object-based information of the audio elements, wherein the metadata includes a matrix suitable for up-mixing a first set of channels to a second set of channels for playback by the plurality of height speakers and the plurality of additional height speakers in the first spatial audio environment, wherein the first set of channels comprises the plurality of height channels and the second set of channels comprises the plurality of height channels and the a plurality of additional height channels, and wherein the matrix is also suitable for down-mixing the first set of channels to a third set of channels for playback in a second spatial audio environment, wherein the second spatial audio environment includes no height speakers; incorporating the metadata in a channel-based format; combining the metadata and channel-based information in a spatial audio decoder to facilitate playback of the audio elements in the first spatial audio environment; and wherein the up-mixing matrix comprises a time-varying matrix of size M×2, and wherein the matrix is incorporated into the channel-based format with data specifying the number M corresponding to a total number of height channels of the second set of channels, and an assumed position of the M height channels.
2. The method of claim 1 wherein the channel-based format comprises a 7.1 surround-sound format.
3. The method of claim 1, wherein the audio elements comprise audio objects that are transmitted to respective speakers whose positions correspond to those specified in the metadata.
4. The method of claim 1 wherein the up-mixing matrix is selected to minimize a defined cost function that is defined relative to a plurality of reference signals.
5. The method of claim 1 wherein the metadata supplements a first metadata set that includes metadata elements associated with an object-based stream of the spatial audio information, the metadata elements for each object-based stream specifying spatial parameters controlling the playback of a corresponding object-based sound, and comprising one or more of: sound position, sound width, and sound velocity; and further wherein the first metadata set includes metadata elements associated with a channel-based stream of the spatial audio information, and wherein the metadata elements associated with each channel-based stream comprises designations of surround-sound channels of the speakers in a speaker array in accordance with a defined surround-sound configuration.
6. The method of claim 5 wherein the first metadata set includes metadata to enable upmixing or downmixing of at least one of the channel-based audio streams and the object-based audio streams in accordance with a change from a first configuration of the speaker array to a second configuration of the speaker array.
7. The method of claim 6 wherein the speakers of the speaker array are placed at specific positions within the playback environment, and wherein metadata elements associated with each respective object-based stream specify that one or more sound components are rendered to a speaker feed for playback through a speaker nearest an intended playback location of the sound component, as indicated by the position metadata.
8. The method of claim 1 further comprising computing a plurality of height channel signals as a weighted sum of a corresponding plurality of audio objects defined by the spatial audio information.
9. The method of claim 8 wherein the positions associated with the plurality of height channels are static.
10. The method of claim 8 wherein the height channels are dynamic and the audio objects have a time-varying trajectory in a height plane.
11. The method of claim 10 further comprising deriving mixing coefficients corresponding to right and left front speaker heights, respectively as a function of trajectories relative to assumed speaker positions of two channels in the height plane.
12. The method of claim 11 further comprising deriving a weighted sum of the object trajectories, wherein the weights are a function of the mixing coefficients along with a loudness measure of each audio object.
13. The method of claim 12 further comprising defining the metadata elements using the mixing coefficients and weighted sum of the object trajectories.
14. The method of claim 1 further comprising identifying an inflection point along a front height axis to define a panning point at which sound is switched to or from front height speakers to rear surround speakers.
15. The method of claim 14 wherein any sound element located between the front height speakers and the inflection point will be collapsed to the front height speakers, and any sound element located between the inflection point and the rear height speakers will be stretched between the front height speakers and the rear surround speakers.
16. The method of claim 15 wherein the metadata comprises elements defining a position of the inflection point.
17. The method of claim 16 wherein the position of the inflection point is expressed as coordinates of an enclosure defined within the first spatial audio environment.
18. An apparatus for recovering spatial audio information rendered into a channel-based format for playback in a first spatial audio environment, the channel-based format comprising a surround-sound format which includes a plurality of height channels, the first spatial audio environment including a plurality of height speakers and a plurality of additional height speakers, the apparatus comprising a processor configured to: derive metadata defining positional information of audio elements in a spatial audio processor that generates both channel-based and object-based information of the audio elements, wherein the metadata includes a matrix suitable for up-mixing a first set of channels to a second set of channels for playback by the plurality of height speakers and the plurality of additional height speaker in the first spatial environment, wherein the first set of channels comprises the plurality of height channels and the second set of channels comprises the plurality of height channels and a plurality of additional height channels, and wherein the matrix is also suitable for down-mixing the first set of channels to a third set of channels for playback in a second spatial audio environment, wherein the second spatial audio environment includes no height speakers; incorporate the metadata in a channel-based format; combine the metadata and channel-based information in a spatial audio decoder to facilitate playback of the audio elements in the first spatial audio environment; and wherein the up-mixing matrix comprises a time-varying matrix of size M×2, and wherein the matrix is incorporated into the channel-based format with data specifying the number M corresponding to a total number of height channels of the second set of channels, and an assumed position of the M height channels.
19. A non-transitory storage medium recording a program of instructions that is executable by
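The weighted-sum rendering of claim 8 and the cost-minimizing matrix of claim 4 can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the claim text shown here does not name a cost function, so ordinary least squares against the reference signals is assumed, and the function names and gain values are hypothetical. For a time-varying matrix, the fit would be repeated for each block of samples.

```python
from typing import List, Sequence


def render_height_channels(objects: Sequence[Sequence[float]],
                           weights: Sequence[Sequence[float]]) -> List[List[float]]:
    """Each height channel is a weighted sum of the object signals.

    objects: N object signals of equal length.
    weights: one row of N gains per height channel (hypothetical values).
    """
    num_samples = len(objects[0])
    return [
        [sum(gain * obj[t] for gain, obj in zip(row, objects))
         for t in range(num_samples)]
        for row in weights
    ]


def fit_upmix_matrix(x0: Sequence[float], x1: Sequence[float],
                     references: Sequence[Sequence[float]]) -> List[List[float]]:
    """Fit an M x 2 matrix by least squares (assumed cost function).

    Row m holds (a, b) minimizing sum_t (ref_m[t] - a*x0[t] - b*x1[t])**2,
    found by solving the 2 x 2 normal equations with Cramer's rule.
    """
    s00 = sum(v * v for v in x0)
    s11 = sum(v * v for v in x1)
    s01 = sum(p * q for p, q in zip(x0, x1))
    det = s00 * s11 - s01 * s01
    if abs(det) < 1e-12:
        raise ValueError("height channels are linearly dependent; fit is not unique")
    matrix = []
    for ref in references:
        c0 = sum(r * v for r, v in zip(ref, x0))
        c1 = sum(r * v for r, v in zip(ref, x1))
        matrix.append([(c0 * s11 - c1 * s01) / det,
                       (c1 * s00 - c0 * s01) / det])
    return matrix


def apply_upmix(matrix: Sequence[Sequence[float]],
                x0: Sequence[float], x1: Sequence[float]) -> List[List[float]]:
    """Apply an M x 2 matrix to two height channels, giving M height feeds."""
    return [[a * p + b * q for p, q in zip(x0, x1)] for a, b in matrix]
```

In this sketch the encoder side would call `fit_upmix_matrix` with the reference height signals it wants the decoder to approximate, and the decoder side would call `apply_upmix` with the matrix recovered from the metadata.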
Aspects of down-mixing multi-channel audio to configurations with lower numbers of playback channels, e.g. 7.1 -> 5.1 (H04S2400/01 takes precedence) · CPC title
in which the audio signals are in digital form, i.e. employing more than two discrete digital channels (data reduction aspects thereof based on psychoacoustics G10L19/02) · CPC title
Electronic adaptation of stereophonic audio signals to reverberation of the listening space (H04S7/301 takes precedence) · CPC title
Application of parametric coding in stereophonic audio systems · CPC title