Processing spatially diffuse or large audio objects

US9654895B2 · US · B2

Patent metadata
Publication number: US-9654895-B2
Application number: US-201414909058-A
Country: US
Kind code: B2
Filing date: Jul 24, 2014
Priority date: Jul 31, 2013
Publication date: May 16, 2017
Grant date: May 16, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Diffuse or spatially large audio objects may be identified for special processing. A decorrelation process may be performed on audio signals corresponding to the large audio objects to produce decorrelated large audio object audio signals. These decorrelated large audio object audio signals may be associated with object locations, which may be stationary or time-varying locations. For example, the decorrelated large audio object audio signals may be rendered to virtual or actual speaker locations. The output of such a rendering process may be input to a scene simplification process. The decorrelation, associating and/or scene simplification processes may be performed prior to a process of encoding the audio data.
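
The first two steps of the abstract (identifying large objects, then decorrelating their signals) can be illustrated with a minimal sketch. Everything named here is hypothetical: the `AudioObject` type, the `SIZE_THRESHOLD` value, and the delay values are assumptions chosen for illustration, and a plain per-output delay merely stands in for whatever decorrelation process an actual implementation would use.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class AudioObject:
    name: str
    samples: List[float]  # mono object signal
    size: float           # size metadata, assumed normalised to 0..1


# Hypothetical threshold; the patent text on this page does not give a value.
SIZE_THRESHOLD = 0.5


def is_large(obj: AudioObject, threshold: float = SIZE_THRESHOLD) -> bool:
    """Flag an object as 'large' when its size metadata exceeds the threshold."""
    return obj.size > threshold


def delay_decorrelate(samples: Sequence[float], delays: Sequence[int]) -> List[List[float]]:
    """Return one delayed copy per delay, zero-padded at the start, same length as the input."""
    n = len(samples)
    return [([0.0] * d + list(samples))[:n] for d in delays]


def process(
    objects: Sequence[AudioObject], delays: Sequence[int] = (0, 7, 13, 23)
) -> Tuple[List[AudioObject], Dict[str, List[List[float]]]]:
    """Split objects into untouched small objects and decorrelated large-object signals."""
    passthrough: List[AudioObject] = []
    decorrelated: Dict[str, List[List[float]]] = {}
    for obj in objects:
        if is_large(obj):
            decorrelated[obj.name] = delay_decorrelate(obj.samples, delays)
        else:
            passthrough.append(obj)
    return passthrough, decorrelated
```

In this sketch each delayed copy would then be associated with a stationary or time-varying location and handed to a renderer, which is outside the scope of the example.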

First claim

Opening claim text (preview).

The invention claimed is:
1. A method, comprising: receiving, in an input interface to an encoder component of an audio rendering system, audio data comprising audio objects, the audio objects comprising audio object signals and associated metadata, the associated metadata including at least audio object size data; determining, by a large object detection component based on the audio object size data, a large audio object having an audio object size that is greater than a threshold size, wherein the large audio object is spatially diffuse and requires a plurality of speakers to reproduce the large audio object; and performing, in a decorrelator component coupled to the input interface, a decorrelation process on audio signals of the large audio object to produce decorrelated large audio object audio signals that are dependent on a defined location of the large audio object and other information, wherein the decorrelated large audio object signals are mutually independent of one another, and the decorrelation process comprises adjusting a level of each of the audio signals by adjusting a respective audio gain for each of the audio signals to generate the decorrelated large audio object audio signals corresponding to a speaker feed to each speaker of the plurality of speakers, and further wherein the plurality of speakers covers a large spatial area.
2. The method of claim 1, further comprising receiving decorrelation metadata for the large audio object, wherein the decorrelation metadata comprises an indicator that the audio object size is greater than the threshold size.
3. The method of claim 1, wherein the large audio object has a plurality of object locations, wherein at least some of the plurality of object locations are one of: stationary locations or locations that vary over time.
4. The method of claim 1, wherein the decorrelation process is performed upstream prior to a process of rendering the audio data for reproduction in a playback environment comprising a home theatre system.
5. The method of claim 1, wherein the decorrelation process comprises one of: a delay process, an all-pass filter process, a pseudo-random filter process, and a reverberation process.
6. The method of claim 1, wherein the plurality of speakers have a plurality of speaker locations, wherein the plurality of speaker locations comprise speaker zones defining virtual speaker locations arranged into one or more speaker zones.
7. The method of claim 6, further comprising using a rendering tool to map the speaker feed to respective speaker zones.
8. The method of claim 1, wherein the audio data comprise one or more audio bed signals corresponding to original speaker locations, the method further comprising outputting the decorrelated large audio object audio signals as additional audio bed signals or audio object signals for playback through the plurality of speakers.
9. The method of claim 1 wherein the respective audio gain for each of the audio signals comprises a gain factor determined according to an amplitude panning method.
10. The method of claim 1, further comprising attenuating or deleting the audio signals of the large audio object after the decorrelation process is performed.
11. The method of claim 1, further comprising retaining audio signals corresponding to a point source contribution of the large audio object after the decorrelation process is performed.
12. The method of claim 1, wherein the large audio object comprises metadata including audio object position metadata, the method further comprising: computing contributions from virtual sources within an audio object area or volume defined by the audio object position metadata of the large audio object and the audio object size data; and determining a set of audio object gain values for each of a plurality of output channels based, at least in part, on the computed contributions.
13. The method of claim 1, further comprising performing an audio object clustering process after the decorrelation process.
14. The method of claim 1, further comprising evaluating the audio data to determine content type, wherein the decorrelation process is selectively performed according to the content type.
15. The method of claim 14, wherein an amount of decorrelation to be performed depends on the content type.
16. The method of claim 1, wherein the decorrelation process involves a complex, time-variant filter algorithm.
17. The method of claim 1, wherein the large audio object comprises metadata including audio object position metadata, the method further comprising mixing the decorrelated large audio object audio signals with audio signals of audio objects that are spatially separated by a threshold amount of distance from the large audio object.
18. An apparatus including an audio rendering system, the apparatus comprising: an input interface of the audio rendering system receiving audio data comprising audio objects, the audio objects comprising audio object signals and associated metadata, the associated metadata including at least audio object size data; a processing component determining, based on the audio object size data, a large audio object having an audio object size that is greater than a threshold size, wherein the large audio object is spatially diffuse and requires a plurality of speakers to reproduce the large audio object; and a decorrelator component coupled to the input interface, performing a decorrelation process on audio signals of the large audio object to produce decorrelated large audio object audio signals that are dependent on a defined location of the large audio object and other information, wherein the decorrelated large audio object signals are mutually independent of one another, and the decorrelation process comprises adjusting a level of each of the audio signals by adjusting a respective audio gain for each of the audio signals to generate the decorrelated large audio object audio signals corresponding to a speaker feed to each speaker of the plurality of speakers, and further wherein the plurality of speakers covers a large spatial area.
19. A non-transitory medium having stored thereon programming instructions, which when executed by a processing component in an audio rendering system cause the audio rendering system to: receive, in an input interface to an encoder component of the audio rendering system, audio data comprising audio objects, the audio objects comprising audio object signals and associated metadata, the associated metadata including at least audio object size data; determine, by a large object detection component based on the audio object size data, a large audio object having an audio object size that is greater than a threshold size, wherein the large audio object is spatially diffuse and requires a plurality of speakers to reproduce the large audio object; and perform, in a decorrelator component coupled to the input interface, a decorrelation process on audio signals of the large audio object to produce decorrelated large audio object audio signals that are dependent on a defined location of the large audio object and other information, wherein the decorrelated large audio object signals are mutually independent of one another, and the decorrelation process comprises adjusting a level of each of the audio signals by adjusting a respective audio gain for each of the audio signals to generate the decorrelated large audio object audio signals corresponding to a speaker feed to each speaker of the plurality of speakers, and further wherein the p
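
Claims 1, 5 and 9 combine a decorrelation process (for example an all-pass filter process) with a per-signal gain that yields one feed per speaker. The sketch below shows one way those two pieces could fit together, under stated assumptions: a textbook first-order all-pass filter with a different coefficient per speaker stands in for the decorrelator, and a simple constant-power normalisation stands in for the amplitude panning method. The function names, coefficients and weights are hypothetical, and such a simple filter does not by itself guarantee the mutual independence the claims recite.

```python
import math
from typing import List, Sequence


def allpass(samples: Sequence[float], a: float) -> List[float]:
    """First-order all-pass filter: y[n] = -a*x[n] + x[n-1] + a*y[n-1]."""
    out: List[float] = []
    x_prev = 0.0
    y_prev = 0.0
    for x in samples:
        y = -a * x + x_prev + a * y_prev
        out.append(y)
        x_prev, y_prev = x, y
    return out


def constant_power_gains(weights: Sequence[float]) -> List[float]:
    """Scale non-negative weights so that the squared gains sum to 1."""
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights]


def speaker_feeds(
    samples: Sequence[float], coeffs: Sequence[float], weights: Sequence[float]
) -> List[List[float]]:
    """Produce one filtered, gain-scaled feed per speaker.

    coeffs:  hypothetical all-pass coefficient per speaker (|a| < 1 for stability)
    weights: hypothetical panning weight per speaker, normalised to constant power
    """
    gains = constant_power_gains(weights)
    return [[g * y for y in allpass(samples, a)] for a, g in zip(coeffs, gains)]
```

For example, `speaker_feeds(signal, coeffs=[0.3, -0.3, 0.6, -0.6], weights=[1, 1, 1, 1])` would give four differently filtered feeds, each scaled by a gain of 0.5 so that total power is preserved.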

Assignees

Dolby Laboratories Licensing Corp, Dolby Int Ab

Inventors

Classifications

  • H04S7/308 (Primary)

    Electronic adaptation dependent on speaker or headphone connection · CPC title

  • Application of parametric coding in stereophonic audio systems · CPC title

  • Audio watermarking, i.e. embedding inaudible data in the audio signal · CPC title

  • using sound class specific coding, hybrid encoders or object based coding · CPC title

  • Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution (control circuits for electronic adaptation of the sound field H04S7/30) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Dolby Laboratories Licensing Corp, Dolby Int Ab
What technology area does this patent fall under?
Primary CPC classification H04S7/308. Mapped technology areas include Electricity.
When was this patent published?
Publication date: May 16, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).