Apparatus and method for low delay object metadata coding
US-9788136-B2 · Oct 10, 2017 · US
| Field | Value |
|---|---|
| Publication number | US-11227616-B2 |
| Application number | US-201916277851-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 15, 2019 |
| Priority date | Jul 22, 2013 |
| Publication date | Jan 18, 2022 |
| Grant date | Jan 18, 2022 |
Official abstract text for this publication.
Audio encoder for encoding audio input data to obtain audio output data includes an input interface for receiving a plurality of audio channels, a plurality of audio objects and metadata related to one or more of the plurality of audio objects; a mixer for mixing the plurality of objects and the plurality of channels to obtain a plurality of pre-mixed channels, each pre-mixed channel including audio data of a channel and audio data of at least one object; a core encoder for core encoding core encoder input data; and a metadata compressor for compressing the metadata related to the one or more of the plurality of audio objects, wherein the audio encoder is configured to operate in at least one mode of the group of two modes.
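The pre-mixing step named in the abstract can be illustrated with a minimal sketch. It assumes that signals are equal-length lists of samples and that a hypothetical `gains` table holds one weight per object and channel; the abstract does not say how such weights are derived, and the function name `premix` is illustrative only.

```python
from typing import List, Sequence

Signal = List[float]


def premix(channels: Sequence[Signal],
           objects: Sequence[Signal],
           gains: Sequence[Sequence[float]]) -> List[Signal]:
    """Add each object into every channel, weighted by gains[obj][ch].

    `gains` is a hypothetical per-object, per-channel weight table
    (an assumption for this sketch, not taken from the patent text).
    All signals are assumed to have the same number of samples.
    """
    if len(gains) != len(objects):
        raise ValueError("need one gain row per object")
    mixed = [list(ch) for ch in channels]  # copy so the inputs stay untouched
    for obj, row in zip(objects, gains):
        if len(row) != len(channels):
            raise ValueError("need one gain per channel in each row")
        for ch_idx, gain in enumerate(row):
            for n, sample in enumerate(obj):
                mixed[ch_idx][n] += gain * sample
    return mixed


# Two channels, one object placed fully in channel 0 and at half level in channel 1.
channels = [[1.0, 1.0], [0.0, 0.0]]
objects = [[0.5, -0.5]]
gains = [[1.0, 0.5]]
premixed = premix(channels, objects, gains)
```

Each resulting pre-mixed channel carries the audio data of one input channel plus a weighted contribution of the object, which is the property the abstract describes.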
Opening claim text (preview).
The invention claimed is:

1. An audio decoder for decoding encoded audio data, comprising:
an input interface configured for receiving the encoded audio data, the encoded audio data comprising either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of encoded audio objects, or a plurality of encoded audio channels without any encoded audio objects;
a mode controller configured for analyzing the encoded audio data to determine whether the encoded audio data comprise either a plurality of encoded audio channels and a plurality of encoded audio objects and compressed metadata related to the plurality of encoded audio objects, or a plurality of encoded audio channels without any encoded audio objects;
a core decoder configured for either decoding the plurality of encoded audio channels received by the input interface to obtain decoded audio channels and decoding the plurality of encoded audio objects received by the input interface to obtain decoded audio objects, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects, or decoding the plurality of encoded audio channels received by the input interface to obtain decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects;
a metadata decompressor configured for decompressing the compressed metadata to obtain decompressed metadata, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects;
an object processor configured for processing the decoded audio objects using the decompressed metadata and the decoded audio channels to acquire a number of output audio channels comprising audio data from the decoded audio objects and the decoded audio channels, when the encoded audio data comprises the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects;
a post processor configured for post processing the number of output audio channels to obtain an output format,
wherein the mode controller is configured for controlling the audio decoder to either bypass the object processor and to feed the decoded audio channels as the output audio channels into the post processor, when the encoded audio data comprises the plurality of encoded audio channels without any encoded audio objects, or to feed the decoded audio objects and the decoded audio channels into the object processor, when the encoded audio data comprise the plurality of encoded audio channels and the plurality of encoded audio objects and the compressed metadata related to the plurality of encoded audio objects.

2. The audio decoder of claim 1, wherein the post processor is configured for converting the number of output audio channels to a binaural representation as the output format or to a reproduction format as the output format, the reproduction format comprising a smaller number of reproduction audio channels than the number of output audio channels, and wherein the audio decoder is configured for controlling the post processor in accordance with a control input derived from an user interface or extracted from the encoded audio data received by the input interface.

3. The audio decoder of claim 1, in which the object processor comprises:
an object renderer configured for rendering the decoded audio objects using the decompressed metadata to obtain rendered audio objects; and
a mixer configured for mixing the rendered audio objects and the decoded audio channels to acquire the number of output audio channels.

4. The audio decoder of claim 1, wherein the plurality of encoded objects comprises one or more core encoded transport channels and associated parametric side information, wherein the core decoder is configured to decode the one or more core encoded transport channels to obtain the decoded audio objects comprising one or more core decoded transport channels and the associated parametric side information, wherein the object processor comprises a spatial audio object coding decoder configured for decoding the one or more core decoded transport channels and the associated parametric side information to obtain spatial audio object decoded audio objects, wherein the spatial audio object coding decoder is configured for rendering the spatial audio object decoded audio objects in accordance with rendering information related to a placement of the spatial audio object decoded audio objects to obtain rendered audio objects, and wherein the object processor is configured for mixing the rendered audio objects and the decoded audio channels to acquire the number of output audio channels.

5. The audio decoder of claim 1, wherein the plurality of encoded audio objects comprises one or more core encoded transport channels and associated parametric side information representing the plurality of encoded audio objects, wherein the core decoder is configured to decode the one or more core encoded transport channels to obtain the decoded audio objects comprising one or more core decoded transport channels and the associated parametric side information, wherein the spatial audio object coding decoder is configured for transcoding the associated parametric side information and the decompressed metadata into transcoded parametric side information usable for directly rendering the output format, and wherein the post processor is configured for calculating output format audio channels of the output format using the one or more core decoded transport channels and the transcoded parametric side information.

6. The audio decoder of claim 1, wherein the plurality of encoded audio objects comprises one or more core encoded transport channels and associated parametric data, wherein the core decoder is configured to decode the one or more core encoded transport channels to obtain one or more core decoded transport channels,
wherein the object processor comprises a spatial audio object coding decoder configured for decoding the core decoded one or more transport channels outputted by the core decoder and the associated parametric data and the decompressed metadata to acquire a plurality of spatial audio object rendered audio objects,
wherein the object processor comprises an object renderer configured for rendering the decoded audio objects outputted by the core decoder to obtain rendered decoded audio objects;
wherein the object processor comprises a mixer for mixing the rendered decoded audio objects, the spatial audio object rendered audio objects, and the decoded audio channels to obtain mixer output audio channels,
wherein the audio decoder further comprises an output interface configured for outputting the mixer output audio channels to loudspeakers,
wherein the post processor furthermore comprises:
a binaural renderer configured for rendering the mixer output audio channels into two binaural channels as the output format using head related transfer functions or binaural impulse responses, or
a format converter configured for converting the mixer output audio channels into an output channel representation, as the output format, the output channel representation comprising a lower number of audio channels than the mixer output audio channels using information on a reproduction layout.

7. The audio decoder of claim 6, wherein certain elements comprising the binaural renderer, the format converter, the mixer, the spatial audio object
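The mode-dependent routing recited in claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: core decoding and metadata decompression are stubbed out as identity steps, the "decompressed metadata" is assumed to be a hypothetical list of per-channel gains for each object, and all names (`EncodedAudio`, `has_objects`, `decode`) are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List

Signal = List[float]


@dataclass
class EncodedAudio:
    """Hypothetical container; real bitstreams are not modeled here."""
    channels: List[Signal]
    objects: List[Signal] = field(default_factory=list)
    # Assumed metadata layout: one list of per-channel gains per object.
    metadata: List[List[float]] = field(default_factory=list)


def has_objects(data: EncodedAudio) -> bool:
    """Mode controller check: does the input carry any audio objects?"""
    return bool(data.objects)


def object_processor(channels: List[Signal],
                     objects: List[Signal],
                     metadata: List[List[float]]) -> List[Signal]:
    """Render each object with its gains and mix it into the channels."""
    out = [list(ch) for ch in channels]
    for obj, obj_gains in zip(objects, metadata):
        for ch_idx, gain in enumerate(obj_gains):
            for n, sample in enumerate(obj):
                out[ch_idx][n] += gain * sample
    return out


def post_processor(channels: List[Signal]) -> List[Signal]:
    """Stand-in for binaural rendering or format conversion (pass-through)."""
    return channels


def decode(data: EncodedAudio) -> List[Signal]:
    decoded_channels = data.channels  # core decoding stubbed as identity
    if has_objects(data):
        # Object mode: channels and objects both go through the object processor.
        output_channels = object_processor(
            decoded_channels, data.objects, data.metadata)
    else:
        # Channel-only mode: the object processor is bypassed.
        output_channels = decoded_channels
    return post_processor(output_channels)


channel_only = EncodedAudio(channels=[[1.0, 2.0]])
with_objects = EncodedAudio(channels=[[1.0, 2.0]],
                            objects=[[0.5, 0.5]],
                            metadata=[[2.0]])
bypassed = decode(channel_only)
processed = decode(with_objects)
```

In the channel-only case the decoded channels reach the post processor unchanged, while in the object case the output channels contain audio data from both the channels and the objects.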
using sound class specific coding, hybrid encoders or object based coding · CPC title
in which the audio signals are in digital form, i.e. employing more than two discrete digital channels (data reduction aspects thereof based on psychoacoustics G10L19/02) · CPC title
Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title
Mode decision, i.e. based on audio signal content versus external parameters · CPC title
Noise substitution, i.e. substituting non-tonal spectral components by noisy source (comfort noise for discontinuous speech transmission G10L19/012) · CPC title