Method and apparatus for encoding and decoding successive frames of an ambisonics representation of a 2- or 3-dimensional sound field

US9397771B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9397771-B2
Application number: US-201113333461-A
Country: US
Kind code: B2
Filing date: Dec 21, 2011
Priority date: Dec 21, 2010
Publication date: Jul 19, 2016
Grant date: Jul 19, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Representations of spatial audio scenes using higher-order Ambisonics (HOA) technology typically require a large number of coefficients per time instant. This data rate is too high for most practical applications that require real-time transmission of audio signals. According to the invention, the compression is carried out in spatial domain instead of HOA domain. The (N+1)² input HOA coefficients are transformed into (N+1)² equivalent signals in spatial domain, and the resulting (N+1)² time-domain signals are input to a bank of parallel perceptual codecs. At decoder side, the individual spatial-domain signals are decoded, and the spatial-domain coefficients are transformed back into HOA domain in order to recover the original HOA representation.
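To make the data-rate concern in the abstract concrete, the following is a minimal sketch of the arithmetic. The coefficient count (N+1)² comes from the abstract; the 48 kHz sample rate and 16-bit sample size are illustrative assumptions, not values stated on this page, and the function names are hypothetical.

```python
def hoa_coefficient_count(order):
    """Number of HOA coefficients per time instant for a 3D field of order N."""
    return (order + 1) ** 2


def uncompressed_bit_rate(order, sample_rate_hz=48_000, bits_per_sample=16):
    """Raw bit rate in bit/s.

    sample_rate_hz and bits_per_sample are assumed defaults for
    illustration only; they are not taken from the patent page.
    """
    return hoa_coefficient_count(order) * sample_rate_hz * bits_per_sample


if __name__ == "__main__":
    for order in (3, 4, 6):
        count = hoa_coefficient_count(order)
        rate_mbit = uncompressed_bit_rate(order) / 1e6
        print(f"N={order}: {count} coefficients, {rate_mbit:.3f} Mbit/s uncompressed")
```

Under these assumptions, order N=3 already needs 16 parallel signals, which is why each signal is passed through its own perceptual codec.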

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for carrying out an encoding on received successive frames of a higher-order Ambisonics representation of a 2- or 3-dimensional sound field, denoted as HOA coefficients, said method comprising: transforming a number of O=(N+1)² input HOA coefficients of a frame into a number of O spatial domain signals representing a regular distribution of reference points on a sphere, wherein N is an order of said input HOA coefficients and is greater or equal to 3, and each one of said O spatial domain signals represents a set of plane waves which come from associated directions in space; encoding each one of said O spatial domain signals using perceptual compression encoding steps or stages, thereby using encoding parameters selected such that a coding error is inaudible; and multiplexing the encoded spatial domain signals of the frame into a joint bit stream for providing improved lossy compression of HOA representations of audio scenes.

2. The method according to claim 1, wherein a masking used in said perceptual compression encoding is a psycho-acoustic masking and is a combination of time-frequency masking and spatial masking.

3. The method according to claim 1, wherein said transforming into O spatial domain signals is plane wave decomposition.

4. The method according to claim 1, wherein said encoding of each of said O spatial domain signals corresponds to the MPEG-1 Audio Layer III or AAC or Dolby AC-3 standard.

5. An apparatus for carrying out an encoding on received successive frames of a higher order Ambisonics representation of a 2- or 3-dimensional sound field, denoted as HOA coefficients, said apparatus comprising: a transformer configured to transform a number O=(N+1)² input HOA coefficients of a frame into a number of O spatial domain signals representing a regular distribution of reference points on a sphere, wherein N is an order of said input HOA coefficients and is greater or equal to 3, and each one of said spatial domain signals represents a set of plane waves which come from associated directions in space; encoders configured to encode each one of said O spatial domain signals using perceptual compression encoding steps or stages, thereby using encoding parameters selected such that a coding error is inaudible; and a hardware multiplexer configured to multiplex the encoded spatial domain signals of the frame into a joint bit stream for providing improved lossy compression of HOA representations of audio scenes.

6. The apparatus according to claim 5, wherein a masking used in said perceptual compression encoding is a psycho-acoustic masking and is a combination of time-frequency masking and spatial masking.

7. The apparatus according to claim 5, wherein said transformation is a plane wave decomposition.

8. The apparatus according to claim 5, wherein said perceptual encoding corresponds to the MPEG-1 Audio Layer III or AAC or Dolby AC-3 standard.

9. A method for decoding received successive frames of a perceptual compression encoded higher-order Ambisonics representation of a 2- or 3-dimensional sound field, which was encoded according to claim 1, said decoding comprising: de-multiplexing a received joint bit stream into a number of O=(N+1)² perceptual compression encoded spatial domain signals; decoding each one of said O encoded spatial domain signals into a corresponding decoded spatial domain signal using perceptual compression decoding steps or stages corresponding to a selected encoding type and using decoding parameters matching the encoding parameters, wherein said O decoded spatial domain signals represent a regular distribution of reference points on a sphere; and transforming said O decoded spatial domain signals into O output HOA coefficients of a frame, wherein N is an order of said output HOA coefficients for providing improved lossy compression of HOA representations of audio scenes.

10. The method according to claim 9, wherein said decoding of each one of said O encoded spatial domain signals corresponds to the MPEG-1 Audio Layer III or AAC or Dolby AC-3 standard.

11. An apparatus for decoding received successive frames of a perceptual compression encoded higher-order Ambisonics representation of a 2- or 3-dimensional sound field, which was encoded according to claim 1, said apparatus comprising: a hardware demultiplexer which demultiplexes a received joint bit stream into O=(N+1)² perceptual compression encoded spatial domain signals; decoders which decode each one of said O encoded spatial domain signals into a corresponding decoded spatial domain signal using perceptual compression decoding steps or stages corresponding to a selected encoding type and using decoding parameters matching the encoding parameters, wherein said O decoded spatial domain signals represent a regular distribution of reference points on a sphere; and a transformer transforming said O decoded spatial domain signals into O output HOA coefficients of a frame, wherein N is an order of said output HOA coefficients for providing improved lossy compression of HOA representations of audio scenes.

12. The apparatus according to claim 11, wherein said decoding of each one of said O encoded spatial domain signals corresponds to the MPEG-1 Audio Layer III or AAC or Dolby AC-3 standard.

13. An apparatus for carrying out an encoding on received successive frames of a higher order Ambisonics representation of a 2- or 3-dimensional sound field, denoted as HOA coefficients, said apparatus comprising: a means for transforming a number O=(N+1)² input HOA coefficients of a frame into a number of O spatial domain signals representing a regular distribution of reference points on a sphere, wherein N is an order of said input HOA coefficients and is greater or equal to 3, and each one of said spatial domain signals represents a set of plane waves which come from associated directions in space; a means for encoding each one of said O spatial domain signals using perceptual compression encoding steps or stages, thereby using encoding parameters selected such that a coding error is inaudible; and a means for multiplexing the encoded spatial domain signals of the frame into a joint bit stream for providing improved lossy compression of HOA representations of audio scenes.

14. The apparatus according to claim 13, wherein a means for masking used in said perceptual compression encoding is a psycho-acoustic masking and is a combination of time-frequency masking and spatial masking.

15. The apparatus according to claim 13, wherein said means for transforming is a plane wave decomposition.

16. The apparatus according to claim 13, wherein a means for said perceptual compression encoding corresponds to the MPEG-1 Audio Layer III or AAC or Dolby AC-3 standard.

17. An apparatus for decoding received successive frames of a perceptual compression encoded higher-order Ambisonics representation of a 2- or 3-dimensional sound field, which was encoded according to claim 1, said apparatus comprising: a means for demultiplexing a received joint bit stream into O=(N+1)² perceptual compression encoded spatial domain signals; a means for decoding each one of said O encoded spatial domain signals into a corresponding decoded spatial domain signal using perceptual compression decoding steps or stages corresponding to a selected encoding type and using decoding parameters matching the encoding parameters, wherein said O decoded spatial domain signals represent a regular distribution of reference points on a sphere; and a means for transforming said O decod…
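The transform, encode, multiplex chain recited in the claims, and its decoding mirror image, can be sketched roughly as below. This is an illustration under loudly stated assumptions, not the patented implementation: it uses the 2-dimensional case with 2N+1 real circular harmonics sampled at equally spaced angles on a circle (an assumed simplification; the claims recite O=(N+1)² signals for reference points on a sphere), and a plain uniform quantizer stands in for the perceptual codec (MP3, AAC, or AC-3 in the claims). All function names are hypothetical.

```python
import math


def basis(order, phi):
    """Real circular harmonics up to `order`, evaluated at angle phi."""
    row = [1.0]
    for n in range(1, order + 1):
        row += [math.cos(n * phi), math.sin(n * phi)]
    return row


def sample_matrix(order):
    """One basis row per reference direction, equally spaced on a circle."""
    count = 2 * order + 1  # assumed 2D signal count, not the 3D (N+1)**2
    return [basis(order, 2 * math.pi * j / count) for j in range(count)]


def to_spatial(hoa, order):
    """HOA coefficients -> spatial domain signals (one per direction)."""
    return [sum(r * a for r, a in zip(row, hoa)) for row in sample_matrix(order)]


def to_hoa(spatial, order):
    """Inverse transform, exact because the sampled basis is orthogonal."""
    rows = sample_matrix(order)
    count = len(rows)
    return [
        (1.0 if k == 0 else 2.0) / count
        * sum(rows[j][k] * spatial[j] for j in range(count))
        for k in range(count)
    ]


def encode_frame(frame, order, step):
    """Transform each time sample, quantize per channel, then multiplex."""
    count = 2 * order + 1
    channels = [[] for _ in range(count)]
    for hoa in frame:
        for j, value in enumerate(to_spatial(hoa, order)):
            # Placeholder for a perceptual encoder: uniform quantization.
            channels[j].append(round(value / step))
    # Joint stream: channel 0 samples, then channel 1 samples, and so on.
    return [q for channel in channels for q in channel]


def decode_frame(stream, order, step):
    """De-multiplex, dequantize each channel, transform back to HOA."""
    count = 2 * order + 1
    length = len(stream) // count
    channels = [stream[j * length:(j + 1) * length] for j in range(count)]
    return [
        to_hoa([channels[j][t] * step for j in range(count)], order)
        for t in range(length)
    ]


if __name__ == "__main__":
    order = 3
    frame = [
        [0.50, 0.20, -0.10, 0.05, 0.00, -0.30, 0.10],
        [0.40, 0.25, -0.15, 0.00, 0.05, -0.20, 0.15],
    ]
    stream = encode_frame(frame, order, step=0.01)
    decoded = decode_frame(stream, order, step=0.01)
    worst = max(abs(a - b) for x, y in zip(frame, decoded) for a, b in zip(x, y))
    print(f"{len(stream)} stream values, worst coefficient error {worst:.4f}")
```

Because the sampled basis is orthogonal on this grid, the forward and inverse transforms add no error of their own in exact arithmetic; in this sketch all loss comes from the per-channel quantizer, which is the stage a real perceptual codec would replace.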

Classifications

  • Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title

  • H04H20/89 · Primary

    using three or more audio channels, e.g. triphonic or quadraphonic · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9397771B2 cover?
Representations of spatial audio scenes using higher-order Ambisonics (HOA) technology typically require a large number of coefficients per time instant. This data rate is too high for most practical applications that require real-time transmission of audio signals. According to the invention, the compression is carried out in spatial domain instead of HOA domain. The (N+1)² input HOA coefficie…
Who is the assignee on this patent?
Jax Peter, Batke Johann-Markus, Boehm Johannes, and 2 more
What technology area does this patent fall under?
Primary CPC classification H04H20/89. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Jul 19, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).