Rendering and playback of spatial audio using channel-based audio systems

US9622014B2 · US · B2

Patent metadata
Publication number: US-9622014-B2
Application number: US-201314409440-A
Country: US
Kind code: B2
Filing date: Jun 17, 2013
Priority date: Jun 19, 2012
Publication date: Apr 11, 2017
Grant date: Apr 11, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments are described for a method and system of rendering and playing back spatial audio content using a channel-based format. Spatial audio content that is played back through legacy channel-based equipment is transformed into the appropriate channel-based format resulting in the loss of certain positional information within the audio objects and positional metadata comprising the spatial audio content. To retain this information for use in spatial audio equipment even after the audio content is rendered as channel-based audio, certain metadata generated by the spatial audio processor is incorporated into the channel-based data. The channel-based audio can then be sent to a channel-based audio decoder or a spatial audio decoder. The spatial audio decoder processes the metadata to recover at least some positional information that was lost during the down-mix operation by upmixing the channel-based audio content back to the spatial audio content for optimal playback in a spatial audio environment.
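
The two decoding paths described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `UpmixMetadata`, `ChannelFrame`, `legacy_decode`, and `spatial_decode` are hypothetical, and the sketch assumes the embedded metadata is an M×2 matrix applied to a pair of transmitted height channels, one sample instant at a time.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UpmixMetadata:
    """Hypothetical container for metadata embedded in the channel-based stream."""
    num_height_channels: int                              # M, the number of output height channels
    assumed_positions: List[Tuple[float, float, float]]   # assumed position of each of the M channels
    matrix: List[List[float]]                             # M rows, 2 columns


@dataclass
class ChannelFrame:
    """One sample instant of the two transmitted height channels (assumed layout)."""
    height_pair: List[float]                    # [left height, right height]
    metadata: Optional[UpmixMetadata] = None    # absent in a purely legacy stream


def legacy_decode(frame: ChannelFrame) -> List[float]:
    # A channel-based decoder plays the two height channels and ignores the metadata.
    return list(frame.height_pair)


def spatial_decode(frame: ChannelFrame) -> List[float]:
    # A spatial decoder applies the M x 2 matrix to recover M height feeds.
    if frame.metadata is None:
        return list(frame.height_pair)
    meta = frame.metadata
    if len(meta.matrix) != meta.num_height_channels:
        raise ValueError("matrix row count must equal M")
    left, right = frame.height_pair
    return [row[0] * left + row[1] * right for row in meta.matrix]
```

In this sketch the same frame serves both decoders: the legacy path returns the two height samples unchanged, while the spatial path expands them to M feeds. How the metadata is actually packed into a real channel-based bitstream is outside the scope of the example.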

First claim

Opening claim text (preview).

What is claimed is:

1. A method of recovering spatial audio information rendered into a channel-based format for playback in a first spatial audio environment, the channel-based format comprising a surround-sound format which includes a plurality of height channels, the first spatial audio environment including a plurality of height speakers and a plurality of additional height speakers, the method comprising: deriving metadata defining positional information of audio elements in a spatial audio processor that generates both channel-based and object-based information of the audio elements, wherein the metadata includes a matrix suitable for up-mixing a first set of channels to a second set of channels for playback by the plurality of height speakers and the plurality of additional height speakers in the first spatial audio environment, wherein the first set of channels comprises the plurality of height channels and the second set of channels comprises the plurality of height channels and the a plurality of additional height channels, and wherein the matrix is also suitable for down-mixing the first set of channels to a third set of channels for playback in a second spatial audio environment, wherein the second spatial audio environment includes no height speakers; incorporating the metadata in a channel-based format; combining the metadata and channel-based information in a spatial audio decoder to facilitate playback of the audio elements in the first spatial audio environment; and wherein the up-mixing matrix comprises a time-varying matrix of size M×2, and wherein the matrix is incorporated into the channel-based format with data specifying the number M corresponding to a total number of height channels of the second set of channels, and an assumed position of the M height channels.

2. The method of claim 1 wherein the channel-based format comprises a 7.1 surround-sound format.

3. The method of claim 1, wherein the audio elements comprise audio objects that are transmitted to respective speakers whose positions correspond to those specified in the metadata.

4. The method of claim 1 wherein the up-mixing matrix is selected to minimize a defined cost function that is defined relative to a plurality of reference signals.

5. The method of claim 1 wherein the metadata supplements a first metadata set that includes metadata elements associated with an object-based stream of the spatial audio information, the metadata elements for each object-based stream specifying spatial parameters controlling the playback of a corresponding object-based sound, and comprising one or more of: sound position, sound width, and sound velocity; and further wherein the first metadata set includes metadata elements associated with a channel-based stream of the spatial audio information, and wherein the metadata elements associated with each channel-based stream comprises designations of surround-sound channels of the speakers in a speaker array in accordance with a defined surround-sound configuration.

6. The method of claim 5 wherein the first metadata set includes metadata to enable upmixing or downmixing of at least one of the channel-based audio streams and the object-based audio streams in accordance with a change from a first configuration of the speaker array to a second configuration of the speaker array.

7. The method of claim 6 wherein the speakers of the speaker array are placed at specific positions within the playback environment, and wherein metadata elements associated with each respective object-based stream specify that one or more sound components are rendered to a speaker feed for playback through a speaker nearest an intended playback location of the sound component, as indicated by the position metadata.

8. The method of claim 1 further comprising computing a plurality of height channel signals as a weighted sum of a corresponding plurality of audio objects defined by the spatial audio information.

9. The method of claim 8 wherein the positions associated with the plurality of height channels are static.

10. The method of claim 8 wherein the height channels are dynamic and the audio objects have a time-varying trajectory in a height plane.

11. The method of claim 10 further comprising deriving mixing coefficients corresponding to right and left front speaker heights, respectively as a function of trajectories relative to assumed speaker positions of two channels in the height plane.

12. The method of claim 11 further comprising deriving a weighted sum of the object trajectories, wherein the weights are a function of the mixing coefficients along with a loudness measure of each audio object.

13. The method of claim 12 further comprising defining the metadata elements using the mixing coefficients and weighted sum of the object trajectories.

14. The method of claim 1 further comprising identifying an inflection point along a front height axis to define a panning point at which sound is switched to or from front height speakers to rear surround speakers.

15. The method of claim 14 wherein any sound element located between the front height speakers and the inflection point will be collapsed to the front height speakers, and any sound element located between the inflection point and the rear height speakers will be stretched between the front height speakers and the rear surround speakers.

16. The method of claim 15 wherein the metadata comprises elements defining a position of the inflection point.

17. The method of claim 16 wherein the position of the inflection point is expressed as coordinates of an enclosure defined within the first spatial audio environment.

18. An apparatus for recovering spatial audio information rendered into a channel-based format for playback in a first spatial audio environment, the channel-based format comprising a surround-sound format which includes a plurality of height channels, the first spatial audio environment including a plurality of height speakers and a plurality of additional height speakers, the apparatus comprising a processor configured to: derive metadata defining positional information of audio elements in a spatial audio processor that generates both channel-based and object-based information of the audio elements, wherein the metadata includes a matrix suitable for up-mixing a first set of channels to a second set of channels for playback by the plurality of height speakers and the plurality of additional height speaker in the first spatial environment, wherein the first set of channels comprises the plurality of height channels and the second set of channels comprises the plurality of height channels and a plurality of additional height channels, and wherein the matrix is also suitable for down-mixing the first set of channels to a third set of channels for playback in a second spatial audio environment, wherein the second spatial audio environment includes no height speakers; incorporate the metadata in a channel-based format; combine the metadata and channel-based information in a spatial audio decoder to facilitate playback of the audio elements in the first spatial audio environment; and wherein the up-mixing matrix comprises a time-varying matrix of size M×2, and wherein the matrix is incorporated into the channel-based format with data specifying the number M corresponding to a total number of height channels of the second set of channels, and an assumed position of the M height channels.

19. A non-transitory storage medium recording a program of instructions that is executable by
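
Claims 1 and 4 describe a time-varying M×2 up-mixing matrix selected to minimize a cost function defined relative to reference signals. The Python sketch below shows one way such a matrix could be fitted. It is an illustration under stated assumptions, not the patented procedure: the claims do not name the cost function, so squared error is assumed here, and the function names `fit_upmix_matrix` and `fit_time_varying`, along with the fixed block size, are hypothetical.

```python
from typing import List, Sequence


def fit_upmix_matrix(x: Sequence[Sequence[float]],
                     refs: Sequence[Sequence[float]]) -> List[List[float]]:
    """Least-squares fit of an M x 2 matrix U so that U @ x approximates refs.

    x    : the two transmitted height channels, each a list of samples
    refs : M reference signals (assumed: the ideal feeds for M height speakers)
    Assumption: the cost is the summed squared error, which the claims do not specify.
    """
    a, b = x
    # Entries of the 2 x 2 Gram matrix of the transmitted channels.
    saa = sum(v * v for v in a)
    sbb = sum(v * v for v in b)
    sab = sum(p * q for p, q in zip(a, b))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("transmitted channels are linearly dependent")
    matrix = []
    for r in refs:
        # Cross-correlations of this reference with each transmitted channel.
        sra = sum(p * q for p, q in zip(r, a))
        srb = sum(p * q for p, q in zip(r, b))
        # Solve the 2 x 2 normal equations in closed form.
        matrix.append([(sra * sbb - srb * sab) / det,
                       (srb * saa - sra * sab) / det])
    return matrix


def fit_time_varying(x: Sequence[Sequence[float]],
                     refs: Sequence[Sequence[float]],
                     block_size: int) -> List[List[List[float]]]:
    """Fit one M x 2 matrix per block of samples (block-wise time variation is assumed)."""
    n = len(x[0])
    return [
        fit_upmix_matrix([c[s:s + block_size] for c in x],
                         [r[s:s + block_size] for r in refs])
        for s in range(0, n, block_size)
    ]
```

Each row of the fitted matrix holds the two gains that reconstruct one height feed from the transmitted pair. An encoder following this sketch would embed each block's matrix, together with M and the assumed channel positions, as metadata alongside the channel-based audio.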

Assignees

Dolby Laboratories Licensing Corp

Inventors

Classifications

  • Aspects of down-mixing multi-channel audio to configurations with lower numbers of playback channels, e.g. 7.1 -> 5.1 (H04S2400/01 takes precedence) · CPC title

  • H04S3/008 · Primary

    in which the audio signals are in digital form, i.e. employing more than two discrete digital channels (data reduction aspects thereof based on psychoacoustics G10L19/02) · CPC title

  • H04S7/305 · Primary

    Electronic adaptation of stereophonic audio signals to reverberation of the listening space (H04S7/301 takes precedence) · CPC title

  • Application of parametric coding in stereophonic audio systems · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9622014B2 cover?
Embodiments are described for a method and system of rendering and playing back spatial audio content using a channel-based format. Spatial audio content that is played back through legacy channel-based equipment is transformed into the appropriate channel-based format resulting in the loss of certain positional information within the audio objects and positional metadata comprising the spatial…
Who is the assignee on this patent?
Dolby Laboratories Licensing Corp
What technology area does this patent fall under?
Primary CPC classification H04S3/008. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Apr 11, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).