Methods for Parametric Multi-Channel Encoding

US2016005407A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2016005407-A1
Application number: US-201414767883-A
Country: US
Kind code: A1
Filing date: Feb 21, 2014
Priority date: Feb 21, 2013
Publication date: Jan 7, 2016
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present document relates to audio coding systems. In particular, the present document relates to efficient methods and systems for parametric multi-channel audio coding. An audio encoding system (500) configured to generate a bitstream (564) indicative of a downmix signal and spatial metadata for generating a multi-channel upmix signal from the downmix signal is described. The system (500) comprises a downmix processing unit (510) configured to generate the downmix signal from a multi-channel input signal (561); wherein the downmix signal comprises m channels and wherein the multi-channel input signal (561) comprises n channels; n, m being integers with m<n. Furthermore, the system (500) comprises a parameter processing unit (520) configured to determine the spatial metadata from the multi-channel input signal (561). In addition, the system (500) comprises a configuration unit (540) configured to determine one or more control settings for the parameter processing unit (520) based on one or more external settings; wherein the one or more external settings comprise a target data-rate for the bitstream (564) and wherein the one or more control settings comprise a maximum data-rate for the spatial metadata.
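The role of the configuration unit (540) can be illustrated with a minimal sketch. It assumes, purely for illustration, that the maximum metadata data-rate is a fixed share of the target data-rate and that it is expressed as a bit cap per frame. The names `ExternalSettings`, `ControlSettings`, `configure` and `metadata_share` are hypothetical and do not come from the patent, which does not state how the cap is derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExternalSettings:
    target_bitrate_bps: int  # target data-rate for the whole bitstream
    sample_rate_hz: int      # sampling rate of the multi-channel input signal
    frame_length: int        # pre-determined number of samples per frame


@dataclass(frozen=True)
class ControlSettings:
    max_metadata_bits_per_frame: int  # cap handed to the parameter processing unit


def configure(ext: ExternalSettings, metadata_share: float = 0.1) -> ControlSettings:
    """Derive a per-frame metadata bit cap from the target data-rate.

    metadata_share is an assumed tuning constant, not a value from the patent.
    """
    if not 0.0 < metadata_share < 1.0:
        raise ValueError("metadata_share must lie strictly between 0 and 1")
    # bits per second for metadata, times the duration of one frame in seconds
    bits = ext.target_bitrate_bps * metadata_share * ext.frame_length / ext.sample_rate_hz
    return ControlSettings(max_metadata_bits_per_frame=int(bits))


if __name__ == "__main__":
    settings = configure(ExternalSettings(192_000, 48_000, 1536), metadata_share=0.25)
    print(settings.max_metadata_bits_per_frame)
```

A fixed share is only one possible policy; an actual encoder might use a lookup table keyed on the target data-rate and channel configuration instead.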

First claim

Opening claim text (preview).

1-44. (canceled)

45. An audio encoding system configured to generate a bitstream indicative of a downmix signal and spatial metadata for generating a multi-channel upmix signal from the downmix signal; the system comprising a downmix processing unit configured to generate the downmix signal from a multi-channel input signal; wherein the downmix signal comprises m channels and wherein the multi-channel input signal comprises n channels; n, m being integers with m<n; a parameter processing unit configured to determine the spatial metadata from the multi-channel input signal; and a configuration unit configured to determine one or more control settings for the parameter processing unit based on one or more external settings; wherein the one or more external settings comprise a target data-rate for the bitstream and wherein the one or more control settings comprise a maximum data-rate for the spatial metadata.

46. The audio encoding system of claim 45, wherein the parameter processing unit is configured to determine spatial metadata for a frame of the multi-channel input signal, referred to as a spatial metadata frame; a frame of the multi-channel input signal comprises a pre-determined number of samples of the multi-channel input signal; and the maximum data-rate for the spatial metadata is indicative of a maximum number of metadata bits for a spatial metadata frame.

47. The audio encoding system of claim 46, wherein the parameter processing unit is configured to determine whether the number of bits of a spatial metadata frame which has been determined based on the one or more control settings exceeds the maximum number of metadata bits.

48. The audio encoding system of claim 46, wherein a spatial metadata frame comprises one or more sets of spatial parameters; the one or more control settings comprise a temporal resolution setting indicative of a number of sets of spatial parameters per spatial metadata frame to be determined by the parameter processing unit; the parameter processing unit is configured to discard a set of spatial parameters from a current spatial metadata frame, if the current spatial metadata frame comprises a plurality of sets of spatial parameters and if it is determined that the number of bits of the current spatial metadata frame exceeds the maximum number of metadata bits.

49. The audio encoding system of claim 48, wherein the one or more sets of spatial parameters are associated with corresponding one or more sampling points; the one or more sampling points are indicative of corresponding one or more time instants; the parameter processing unit is configured to discard a first set of spatial parameters from the current spatial metadata frame, wherein the first set of spatial parameters is associated with a first sampling point prior to a second sampling point, if the plurality of sampling points of the current metadata frame is not associated with transients of the multi-channel input signal; and the parameter processing unit is configured to discard the second set of spatial parameters from the current spatial metadata frame, if the plurality of sampling points of the current metadata frame is associated with transients of the multi-channel input signal.

50. The audio encoding system of claim 48, wherein the one or more control settings comprise a quantizer setting indicative of a first type of quantizer from a plurality of pre-determined types of quantizers; the parameter processing unit is configured to quantize the one or more sets of spatial parameters in accordance to the first type of quantizer; the plurality of pre-determined types of quantizers provides different quantizer resolutions, respectively; the parameter processing unit is configured to re-quantize one, some or all of the spatial parameters of the one or more sets of spatial parameters in accordance to a second type of quantizer having a lower resolution than the first type of quantizer, if it is determined that the number of bits of the current spatial metadata frame exceeds the maximum number of metadata bits.

51. The audio encoding system of claim 48, wherein the parameter processing unit is configured to determine a set of temporal difference parameters based on the difference of a current set of spatial parameters with respect to a directly preceding set of spatial parameters; encode the set of temporal difference parameters using entropy encoding; insert the encoded set of temporal difference parameters in the current spatial metadata frame; and reduce an entropy of the set of temporal difference parameters, if it is determined that the number of bits of the current spatial metadata frame exceeds the maximum number of metadata bits.

52. The audio encoding system of claim 51, wherein the parameter processing unit is configured to set one, some or all of the temporal difference parameters of the set of temporal difference parameters equal to a value having an increased probability of possible values of the temporal difference parameters, to reduce the entropy of the set of temporal difference parameters.

53. The audio encoding system of claim 48, wherein the one or more control settings comprise a frequency resolution setting; the frequency resolution setting is indicative of a number of different frequency bands; the parameter processing unit is configured to determine different spatial parameters, referred to as band parameters, for the different frequency bands; and a set of spatial parameters comprises corresponding band parameters for the different frequency bands.

54. The audio encoding system of claim 53, wherein the parameter processing unit is configured to determine a set of frequency difference parameters based on the difference of one or more band parameters in a first frequency band with respect to corresponding one or more band parameters in a second, adjacent, frequency band; encode the set of frequency difference parameters using entropy encoding; insert the encoded set of frequency difference parameters in the current spatial metadata frame; and reduce an entropy of the set of frequency difference parameters, if it is determined that the number of bits of the current spatial metadata frame exceeds the maximum number of metadata bits.

55. The audio encoding system of claim 54, wherein the parameter processing unit is configured to set one, some or all of the frequency difference parameters of the set of frequency difference parameters equal to a value having an increased probability of possible values of the frequency difference parameters, to reduce the entropy of the set of frequency difference parameters.

56. The audio encoding system of claim 53, wherein the parameter processing unit is configured to reduce the number of frequency bands, if it is determined that the number of bits of the current spatial metadata frame exceeds the maximum number of metadata bits; and re-determine the one or more sets of spatial parameters for the current spatial metadata frame using the reduced number of frequency bands.

57. The audio encoding system of claim 45, wherein the one or more external settings further comprise one or more of: a sampling rate of the multi-channel input signal, the number m of channels of the downmix signal, the number n of channels of the multi-channel input signal, and an update period indicative of a time period required by a corresponding decoding system to synchronize to the bitstream; and the one or more control settings further comprise one or more of: a temporal resolution setting indicative of a number of sets of spatial param
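The bit-budget checks in claims 47 to 50 can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented implementation: the bit cost per parameter (`BITS_PER_PARAM`), the two quantizer names, and the order in which the two fallbacks are tried are all hypothetical. Re-quantization is modeled only through its bit cost; the parameter values themselves are left untouched.

```python
from dataclasses import dataclass, replace
from typing import Tuple

# Assumed bit cost per spatial parameter for each quantizer type (hypothetical).
BITS_PER_PARAM = {"fine": 5, "coarse": 3}


@dataclass(frozen=True)
class ParameterSet:
    sampling_point: int        # time instant the set refers to
    values: Tuple[float, ...]  # one band parameter per frequency band


@dataclass(frozen=True)
class MetadataFrame:
    sets: Tuple[ParameterSet, ...]  # ordered by sampling point
    quantizer: str = "fine"
    transient: bool = False         # True if the sampling points follow transients

    def num_bits(self) -> int:
        n_params = sum(len(s.values) for s in self.sets)
        return n_params * BITS_PER_PARAM[self.quantizer]


def fit_to_budget(frame: MetadataFrame, max_bits: int) -> MetadataFrame:
    """Shrink a metadata frame until it fits max_bits, if these two steps allow it."""
    # Claims 48/49: with several sets, drop the earlier set for non-transient
    # frames and the later (second) set for transient frames.
    if frame.num_bits() > max_bits and len(frame.sets) > 1:
        drop = 1 if frame.transient else 0
        frame = replace(frame, sets=frame.sets[:drop] + frame.sets[drop + 1:])
    # Claim 50: fall back to a quantizer with lower resolution.
    if frame.num_bits() > max_bits and frame.quantizer == "fine":
        frame = replace(frame, quantizer="coarse")
    return frame


if __name__ == "__main__":
    a = ParameterSet(0, (0.1, 0.2, 0.3, 0.4))
    b = ParameterSet(1, (0.5, 0.6, 0.7, 0.8))
    fitted = fit_to_budget(MetadataFrame(sets=(a, b)), max_bits=15)
    print(len(fitted.sets), fitted.quantizer, fitted.num_bits())
```

The claims also describe entropy reduction of difference parameters (claims 51, 52, 54, 55) and a reduced number of frequency bands (claim 56) as further ways to meet the cap; those are not modeled in this sketch.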

Assignees

Dolby Int Ab

Inventors

Classifications

  • Application of parametric coding in stereophonic audio systems · CPC title

  • Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes · CPC title

  • G10L19/008 (Primary)

    Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title

  • in which the audio signals are in digital form, i.e. employing more than two discrete digital channels (data reduction aspects thereof based on psychoacoustics G10L19/02) · CPC title

  • Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016005407A1 cover?
The present document relates to audio coding systems. In particular, the present document relates to efficient methods and systems for parametric multi-channel audio coding. An audio encoding system (500) configured to generate a bitstream (564) indicative of a downmix signal and spatial metadata for generating a multi-channel upmix signal from the downmix signal is described. The system ( …
Who is the assignee on this patent?
Dolby Int Ab
What technology area does this patent fall under?
Primary CPC classification G10L19/008. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 7, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).