Apparatus and method for low delay object metadata coding
US-9788136-B2 · Oct 10, 2017 · US
US11984131B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11984131-B2 |
| Application number | US-202117549413-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 13, 2021 |
| Priority date | Jul 22, 2013 |
| Publication date | May 14, 2024 |
| Grant date | May 14, 2024 |
Official abstract text for this publication.
Audio encoder for encoding audio input data to obtain audio output data includes an input interface for receiving a plurality of audio channels, a plurality of audio objects and metadata related to one or more of the plurality of audio objects; a mixer for mixing the plurality of objects and the plurality of channels to obtain a plurality of pre-mixed channels, each pre-mixed channel including audio data of a channel and audio data of at least one object; a core encoder for core encoding core encoder input data; and a metadata compressor for compressing the metadata related to the one or more of the plurality of audio objects, wherein the audio encoder is configured to operate in at least one mode of the group of two modes.
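The mixer step in the abstract can be illustrated with a minimal sketch. It assumes plain per-object gain weights, which the abstract does not specify; the function name `premix` and the `gains` layout are hypothetical and are not taken from the patent. The core encoder and metadata compressor are omitted.

```python
from typing import List

Signal = List[float]  # one mono signal as a list of samples


def premix(channels: List[Signal],
           objects: List[Signal],
           gains: List[List[float]]) -> List[Signal]:
    """Mix objects into channels to form pre-mixed channels.

    gains[c][o] is a hypothetical weight of object o in pre-mixed channel c.
    All signals are assumed to have the same length.
    """
    if len(gains) != len(channels):
        raise ValueError("need one gain row per channel")
    premixed = []
    for channel, row in zip(channels, gains):
        if len(row) != len(objects):
            raise ValueError("need one gain per object in every row")
        # The abstract requires each pre-mixed channel to carry audio data
        # of a channel and of at least one object.
        if not any(g != 0.0 for g in row):
            raise ValueError("each pre-mixed channel needs at least one object")
        mixed = list(channel)  # copy so the input channel is not modified
        for obj, g in zip(objects, row):
            for n, sample in enumerate(obj):
                mixed[n] += g * sample
        premixed.append(mixed)
    return premixed
```

For example, `premix([[1.0, 1.0]], [[2.0, 4.0]], [[0.5]])` adds half of the single object to the single channel.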
Opening claim text (preview).
The invention claimed is:

1. An audio decoder for decoding encoded audio data, comprising:
an input interface configured for receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of audio objects, or a plurality of encoded audio channels without any encoded audio objects, and a mode indication;
a core decoder configured for decoding the plurality of encoded audio channels received by the input interface and the plurality of encoded audio objects received by the input interface to acquire a plurality of decoded audio channels and a plurality of decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or decoding the plurality of encoded audio channels received by the input interface to acquire a plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects;
a metadata decompressor configured for decompressing the compressed metadata to acquire decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects;
an object processor configured for processing the plurality of decoded audio objects using the decompressed metadata and the plurality of decoded audio channels to acquire a number of output audio channels comprising audio data from the plurality of decoded audio objects and the plurality of decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects;
a mode controller connected to the input interface and configured for analyzing the encoded audio data to detect the mode indication indicating a first mode or a second mode, wherein, in the first mode, the encoded audio data comprise encoded audio channels and encoded audio objects, and wherein, in the second mode, the encoded audio data only comprise the plurality of encoded audio channels without any encoded audio objects; and
a post-processor configured for converting the number of output audio channels into an output format,
wherein the audio decoder, controlled by the mode controller, is configured to either bypass the object processor and to feed the plurality of decoded audio channels as the output audio channels into the post-processor, when the second mode has been detected by the mode controller, or to feed the plurality of decoded audio objects and the plurality of decoded audio channels into the object processor, when the first mode has been detected by the mode controller.

2. The audio decoder of claim 1, wherein the post-processor is configured to convert the number of output audio channels to a binaural representation as the output format or to a reproduction format as the output format, the reproduction format comprising a smaller number of audio channels than the number of output audio channels, and wherein the audio decoder is configured to control the post-processor in accordance with a control input derived from a user interface or extracted from the encoded audio data.

3. The audio decoder of claim 1, in which the object processor comprises: an object renderer for rendering the decoded audio objects to acquire rendered audio objects using the decompressed metadata; and a mixer for mixing the rendered audio objects and the decoded audio channels to acquire the number of output audio channels.

4. The audio decoder of claim 1, wherein the object processor comprises: a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects, wherein the spatial audio object coding decoder is configured to render the decoded audio objects in accordance with rendering information related to a placement of the decoded audio objects to acquire rendered audio objects and to control the object processor to mix the rendered audio objects and the decoded audio channels to acquire the number of output audio channels.

5. The audio decoder of claim 1, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing one or more encoded audio objects of the plurality of encoded audio objects and one or more encoded audio channels of the plurality of encoded audio channels, wherein the spatial audio object coding decoder is configured to decode the one or more encoded audio objects and the one or more encoded audio channels using the one or more transport channels and the parametric side information to obtain one or more decoded audio objects and one or more decoded audio channels, and wherein the object processor is configured to render the one or more decoded audio objects using the decompressed metadata to acquire one or more rendered audio objects and to decode the one or more decoded audio channels and to mix the one or more decoded audio channels with the one or more rendered audio objects to acquire the number of output audio channels.

6. The audio decoder of claim 1, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels and associated parametric side information representing encoded audio objects or encoded audio channels, wherein the spatial audio object coding decoder is configured to transcode the associated parametric information and the decompressed metadata into transcoded parametric side information usable for directly rendering the output format, and wherein the post-processor is configured for calculating audio channels of the output format using the decoded transport channels and the transcoded parametric side information, or wherein the spatial audio object coding decoder is configured to directly upmix and render channel signals for the output format using the decoded transport channels and the parametric side information.

7. The audio decoder of claim 1, wherein the object processor comprises a spatial audio object coding decoder for decoding one or more transport channels output by the core decoder and associated parametric data and decompressed metadata to acquire a plurality of rendered audio objects, wherein the object processor comprises an object renderer being configured to render the decoded audio objects output by the core decoder to acquire rendered decoded audio objects; wherein the object processor is furthermore configured to mix the rendered decoded audio objects and the plurality of rendered audio objects with the decoded audio channels, wherein the audio decoder further comprises an output interface for outputting an output of the mixer to loudspeakers, wherein the post-processor furthermore comprises: a binaural renderer for rendering the output audio channels into two binaural channels using head related transfer functions or binaural impulse responses, the two binaural channels representing the binaural representation, and a format converter for converting the output audio channels into the output format comprising a lower number of audio channels than the output audio channels of the mixer using information on a reproduction layout.

8. The audio decoder of claim 1, wherein the plurality of encoded audio channels or the plurality of encoded audio objects are enc
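
The mode-controlled routing recited in claim 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: core decoding and metadata decompression are skipped (signals are treated as already decoded), the decompressed metadata is reduced to one hypothetical gain per object per channel, and the post-processor is a placeholder that either passes channels through or averages them to mono. All names (`EncodedAudio`, `object_processor`, `post_processor`, `decode`) are invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import List

Signal = List[float]  # one mono signal as a list of samples

FIRST_MODE = 1   # channels plus objects
SECOND_MODE = 2  # channels only, no objects


@dataclass
class EncodedAudio:
    """Hypothetical container; the claims do not define a bitstream layout."""
    mode: int
    channels: List[Signal]
    objects: List[Signal] = field(default_factory=list)
    # Assumption: object_gains[o][c] is the gain of object o in channel c.
    object_gains: List[List[float]] = field(default_factory=list)


def object_processor(channels, objects, object_gains):
    """Render each object with its gains and mix it into the channels."""
    out = [list(ch) for ch in channels]  # copy, inputs stay untouched
    for obj, gains in zip(objects, object_gains):
        for c, g in enumerate(gains):
            for n, sample in enumerate(obj):
                out[c][n] += g * sample
    return out


def post_processor(channels, to_mono=False):
    """Placeholder format conversion: pass-through or a mono average."""
    if not to_mono:
        return channels
    count = len(channels)
    return [[sum(samples) / count for samples in zip(*channels)]]


def decode(encoded: EncodedAudio, to_mono=False):
    """Route signals according to the detected mode indication."""
    if encoded.mode == SECOND_MODE:
        # Second mode: bypass the object processor entirely.
        output_channels = encoded.channels
    elif encoded.mode == FIRST_MODE:
        output_channels = object_processor(
            encoded.channels, encoded.objects, encoded.object_gains)
    else:
        raise ValueError("unknown mode indication")
    return post_processor(output_channels, to_mono)
```

In this sketch the bypass is a plain branch on the mode value, so a channels-only stream never touches the object processing path.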
using sound class specific coding, hybrid encoders or object based coding · CPC title
Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title
Noise substitution, i.e. substituting non-tonal spectral components by noisy source (comfort noise for discontinuous speech transmission G10L19/012) · CPC title
Vocoders using multiple modes · CPC title
Mode decision, i.e. based on audio signal content versus external parameters · CPC title