Reconstruction of Audio Scenes from a Downmix
US-2016111099-A1 · Apr 21, 2016 · US
US2016133267A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016133267-A1 |
| Application number | US-201615002148-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 20, 2016 |
| Priority date | Jul 22, 2013 |
| Publication date | May 12, 2016 |
| Grant date | — |
Official abstract text for this publication.
Audio encoder for encoding audio input data to obtain audio output data includes an input interface for receiving a plurality of audio channels, a plurality of audio objects and metadata related to one or more of the plurality of audio objects; a mixer for mixing the plurality of objects and the plurality of channels to obtain a plurality of pre-mixed channels, each pre-mixed channel including audio data of a channel and audio data of at least one object; a core encoder for core encoding core encoder input data; and a metadata compressor for compressing the metadata related to the one or more of the plurality of audio objects, wherein the audio encoder is configured to operate in at least one mode of the group of two modes.
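As an illustration of the two modes that claim 1 spells out, the routing of signals into the core encoder can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names `premix`, `select_core_input`, and `EncoderOutput` are hypothetical, the per-object gain table stands in for the metadata-driven pre-rendering, and the actual core encoding and metadata compression steps are left out.

```python
from dataclasses import dataclass
from typing import List, Optional

Signal = List[float]  # one channel or object as a list of samples


@dataclass
class EncoderOutput:
    core_input: List[Signal]    # signals handed to the core encoder
    metadata: Optional[object]  # compressed object metadata, or None


def premix(channels: List[Signal], objects: List[Signal],
           gains: List[List[float]]) -> List[Signal]:
    """Mix each object into the channels (hypothetical helper).

    gains[o][c] is the gain of object o into channel c; in the patent
    this would be derived from the object metadata and the replay setup.
    """
    mixed = [list(ch) for ch in channels]
    for obj, row in zip(objects, gains):
        for c, g in enumerate(row):
            if g != 0.0:
                for n, sample in enumerate(obj):
                    mixed[c][n] += g * sample
    return mixed


def select_core_input(mode: int, channels: List[Signal],
                      objects: List[Signal], gains: List[List[float]],
                      compressed_metadata: object) -> EncoderOutput:
    if mode == 1:
        # First mode: channels and objects go to the core encoder
        # separately, and the compressed metadata travels with them.
        return EncoderOutput(list(channels) + list(objects),
                             compressed_metadata)
    if mode == 2:
        # Second mode: objects are pre-mixed into the channels, so the
        # output carries no object metadata (see claim 5).
        return EncoderOutput(premix(channels, objects, gains), None)
    raise ValueError(f"unsupported mode: {mode}")
```

In this sketch the first mode keeps objects separable at the decoder, while the second mode trades that flexibility for a plain channel-based stream.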
Opening claim text (preview).
1. Audio encoder for encoding audio input data to acquire audio output data comprising: an input interface configured for receiving a plurality of audio channels, a plurality of audio objects and metadata related to one or more of the plurality of audio objects; a mixer configured for mixing the plurality of objects and the plurality of channels to acquire a plurality of pre-mixed channels, each pre-mixed channel comprising audio data of a channel and audio data of at least one object; a core encoder configured for core encoding core encoder input data; and a metadata compressor configured for compressing the metadata related to the one or more of the plurality of audio objects, wherein the audio encoder is configured to operate in both modes of a group of at least two modes comprising a first mode, in which the core encoder is configured to encode the plurality of audio channels and the plurality of audio objects received by the input interface as core encoder input data, and a second mode, in which the core encoder is configured for receiving, as the core encoder input data, the plurality of pre-mixed channels generated by the mixer and to encode the plurality of pre-mixed channels.

2. Audio encoder of claim 1, further comprising: a spatial audio object encoder for generating one or more transport channels and parametric data from spatial audio object encoder input data, wherein the audio encoder is configured to additionally operate in a third mode, in which the core encoder encodes the one or more transport channels derived from the spatial audio object encoder input data, the spatial audio object encoder input data comprising the plurality of audio objects or two or more of the plurality of audio channels.

3. Audio encoder of claim 1, further comprising: a spatial audio object encoder for generating one or more transport channels and parametric data from spatial audio object encoder input data, wherein the audio encoder is configured to additionally operate in an even further mode, in which the core encoder encodes transport channels derived by the spatial audio object encoder from the pre-mixed channels as the spatial audio object encoder input data.

4. Audio encoder of claim 1, further comprising a connector for connecting an output of the input interface to an input of the core encoder in the first mode and for connecting the output of the input interface to an input of the mixer and to connect an output of the mixer to the input of the core encoder in the second mode, and a mode controller for controlling the connector in accordance with a mode indication received from a user interface or being extracted from the audio input data.

5. Audio encoder of claim 1, further comprising: an output interface for providing an output signal as the audio output data, the output signal comprising, in the first mode, an output of the core encoder and compressed metadata, and comprising, in the second mode, an output of the core encoder without any metadata, and comprising, in the third mode, an output of the core encoder, SAOC side information and the compressed metadata and comprising, in the even further mode, an output of the core encoder and SAOC side information.

6. Audio encoder of claim 1, wherein the mixer is configured for pre-rendering the plurality of audio objects using the metadata and an indication of the position of each channel in a replay setup, to which the plurality of channels are associated with, wherein the mixer is configured to mix an audio object with at least two audio channels and with less than the total number of audio channels, when the audio object is to be placed between the at least two audio channels in the replay setup, as determined by the metadata.

7. Audio encoder of claim 1, further comprising a metadata decompressor for decompressing compressed metadata output by the metadata compressor, and wherein the mixer is configured to mix the plurality of objects in accordance with decompressed metadata, wherein a compression operation performed by the metadata compressor is a lossy compression operation comprising a quantization step.

8. Audio decoder for decoding encoded audio data, comprising: an input interface configured for receiving the encoded audio data, the encoded audio data comprising a plurality of encoded channels or a plurality of encoded objects or compressed metadata related to the plurality of objects; a core decoder configured for decoding the plurality of encoded channels and the plurality of encoded objects; a metadata decompressor configured for decompressing the compressed metadata, an object processor configured for processing the plurality of decoded objects using the decompressed metadata to acquire a number of output channels comprising audio data from the objects and the decoded channels; and a post processor configured for converting the number of output channels into an output format, wherein the audio decoder is configured to bypass the object processor and to feed a plurality of decoded channels into the postprocessor, when the encoded audio data does not comprise any audio objects and to feed the plurality of decoded objects and the plurality of decoded channels into the object processor, when the encoded audio data comprises encoded channels and encoded objects.

9. Audio decoder of claim 8, wherein the postprocessor is configured to convert the number of output channels to a binaural representation or to a reproduction format comprising a smaller number of channels than the number of output channels, wherein the audio decoder is configured to control the postprocessor in accordance with control input derived from user interface or extracted from the encoded audio signal.

10. Audio decoder of claim 8, in which the object processor comprises: an object renderer for rendering decoded objects using decompressed metadata; and a mixer for mixing rendered objects and decoded channels to acquire the number of output channels.

11. Audio decoder of claim 8, wherein the object processor comprises: a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects, wherein the spatial audio object coding decoder is configured to render the decoded audio objects in accordance with rendering information related to a placement of the audio objects and to control the object processor to mix the rendered audio objects and the decoded audio channels to acquire the number of output channels.

12. Audio decoder of claim 8, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects and encoded audio channels, wherein the spatial audio object coding decoder is configured to decode the encoded audio objects and the encoded audio channels using the one or more transport channels and the parametric side information and wherein the object processor is configured to render the plurality of audio objects using the decompressed metadata and to decode the channels and mix them with the rendered objects to acquire the number of output channels.

13. Audio decoder of claim 8, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects or encoded audio channels, wherein the spatial audio object coding decoder is configured to transcode the associated parametric information and the decompressed metadata
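
The decoder-side routing in claims 8 and 10 (bypass the object processor when no objects are present, otherwise render and mix them with the decoded channels) can be sketched roughly as follows. This is a simplified illustration, not the claimed implementation: `object_processor`, `downmix_to_mono`, and `decode` are hypothetical names, the gain table is an assumed stand-in for rendering from decompressed metadata, and core decoding is taken as already done.

```python
from typing import Callable, List

Signal = List[float]  # one channel or object as a list of samples


def object_processor(channels: List[Signal], objects: List[Signal],
                     gains: List[List[float]]) -> List[Signal]:
    """Render each object with per-channel gains and mix it into the
    decoded channels. gains[o][c] is an assumed stand-in for the
    rendering derived from the decompressed metadata."""
    out = [list(ch) for ch in channels]
    for obj, row in zip(objects, gains):
        for c, g in enumerate(row):
            for n, sample in enumerate(obj):
                out[c][n] += g * sample
    return out


def downmix_to_mono(channels: List[Signal]) -> List[Signal]:
    """Toy post processor: average all output channels into one."""
    count = len(channels)
    return [[sum(frame) / count for frame in zip(*channels)]]


def decode(decoded_channels: List[Signal], decoded_objects: List[Signal],
           gains: List[List[float]],
           post_processor: Callable[[List[Signal]], List[Signal]]
           ) -> List[Signal]:
    if not decoded_objects:
        # No objects in the stream: bypass the object processor and
        # feed the decoded channels straight to the post processor.
        output_channels = decoded_channels
    else:
        # Channels and objects present: both go through the object
        # processor, which renders and mixes them.
        output_channels = object_processor(decoded_channels,
                                           decoded_objects, gains)
    return post_processor(output_channels)
```

The post processor is passed in as a function here because claim 9 allows several output formats (binaural, or a layout with fewer channels); the mono downmix is only a placeholder for one of them.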
in which the audio signals are in digital form, i.e. employing more than two discrete digital channels (data reduction aspects thereof based on psychoacoustics G10L19/02) · CPC title
Aspects of down-mixing multi-channel audio to configurations with lower numbers of playback channels, e.g. 7.1 -> 5.1 (H04S2400/01 takes precedence) · CPC title
using sound class specific coding, hybrid encoders or object based coding · CPC title
Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title
Positioning of individual sound objects, e.g. moving airplane, within a sound field (H04S2420/13 takes precedence) · CPC title