Audio encoder and an audio decoder

US11929082B2 · US · B2

Patent metadata
Publication number: US-11929082-B2
Application number: US-201917290739-A
Country: US
Kind code: B2
Filing date: Oct 30, 2019
Priority date: Nov 2, 2018
Publication date: Mar 12, 2024
Grant date: Mar 12, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure relates to the field of audio coding, and in particular to an audio decoder having at least two decoding modes, and associated decoding methods and decoding software for such an audio decoder. In one of the decoding modes, at least one dynamic audio object is mapped to a set of static audio objects, the set of static audio objects corresponding to a predefined speaker configuration. The present disclosure further relates to a corresponding audio encoder, and associated encoding methods and encoding software for such an audio encoder.
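
As a rough illustration of the two decoding modes the abstract describes, the sketch below models a decoder that either keeps dynamic objects as individual objects or mixes them into a fixed set of static objects. It is a minimal sketch under stated assumptions, not the patented method: the names `DecodingMode`, `DynamicObject`, `CHANNELS_5_0_2`, `map_to_static` and `decode`, the channel labels, and the per-object gains are all hypothetical, and the page does not say how the mapping gains are obtained.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class DecodingMode(Enum):
    FULL = 1            # stands in for the "first decoding mode"
    LOW_COMPLEXITY = 2  # stands in for the "second decoding mode"


# Assumed channel labels for a 5.0.2 layout (hypothetical naming).
CHANNELS_5_0_2 = ["L", "R", "C", "Ls", "Rs", "Ltm", "Rtm"]


@dataclass
class DynamicObject:
    samples: List[float]     # audio samples of the object
    gains: Dict[str, float]  # assumed gain toward each static channel


def map_to_static(objects: List[DynamicObject]) -> Dict[str, List[float]]:
    """Mix every dynamic object into one signal per static channel."""
    n = len(objects[0].samples)
    static = {ch: [0.0] * n for ch in CHANNELS_5_0_2}
    for obj in objects:
        for ch, gain in obj.gains.items():
            for i, sample in enumerate(obj.samples):
                static[ch][i] += gain * sample
    return static


def decode(objects: List[DynamicObject], mode: DecodingMode):
    """Return individual objects in FULL mode, static channels otherwise."""
    if mode is DecodingMode.FULL:
        # Full reconstruction of individual objects is not modelled here.
        return objects
    if not objects:
        # No dynamic objects in the stream: nothing to map.
        return {ch: [] for ch in CHANNELS_5_0_2}
    return map_to_static(objects)
```

In this sketch the second mode never yields individual objects with their own positions; every object ends up spread across the seven fixed channels.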

First claim

Opening claim text (preview).

What is claimed is:

1. An audio decoder comprising: one or more buffers for storing a received audio bitstream; and a controller coupled to the one or more buffers and configured: to operate in a decoding mode selected from a plurality of different decoding modes for decoding the received audio bitstream into one or more dynamic audio objects that are each to be rendered to a set of output audio channels, one or more static audio objects that are each to be rendered to the set of output audio channels, or a combination thereof, a dynamic audio object comprising a time-varying spatial position indicated by first metadata, and a static audio object comprising a static spatial position indicated by second metadata, the plurality of different decoding modes comprising a first decoding mode and a second decoding mode, wherein of the first and second decoding modes only the first decoding mode allows full decoding of one or more encoded dynamic audio objects in the bitstream into reconstructed individual dynamic audio objects that each comprise a respective time-varying spatial position; and in the second decoding mode: to access the received audio bitstream; to determine whether the received audio bitstream includes one or more dynamic audio objects; and responsive at least to determining that the received audio bitstream includes one or more dynamic audio objects, to map at least one of the one or more dynamic audio objects to a set of static audio objects without fully decoding the at least one of the one or more dynamic audio objects into respective reconstructed individual dynamic audio objects as is performed in the first decoding mode, the set of static audio objects each corresponding to a channel of a predefined immersive speaker configuration.

2. The audio decoder of claim 1, wherein when the selected decoding mode is the second decoding mode, the controller is further configured to render the set of static audio objects to the set of output audio channels.

3. The audio decoder of claim 2, wherein the audio bitstream comprises a first set of downmix coefficients, wherein the controller is configured to utilize the first set of downmix coefficients for rendering the set of static audio objects to the set of output audio channels.

4. The audio decoder of claim 3, wherein the controller is further configured to receive information pertaining to attenuation applied in at least one of the one or more dynamic audio objects on an encoder side, wherein the controller is configured to modify the first set of downmix coefficients accordingly when utilizing the first set of downmix coefficients for rendering the set of static audio objects to the set of output audio channels.

5. The audio decoder of claim 3, wherein the controller is further configured to receive information pertaining to a downmix operation performed on an encoder side, wherein the information defines an original channel configuration of first audio signal, wherein the downmix operation results in downmixing the first audio signal to the one or more dynamic audio objects, wherein the controller is configured to select a subset of the first set of downmix coefficients based on the information pertaining to the downmix information, wherein the utilizing of the first set of downmix coefficients for rendering the set of static audio objects to the set of output audio channels comprises utilizing the subset of the first set of downmix coefficients for rendering the set of static audio objects to the set of output audio channels.

6. The audio decoder of claim 2, wherein the controller is configured to perform the mapping of the at least one of the one or more dynamic audio objects and the rendering of the set of static audio objects in a combined calculation using a single matrix, or wherein the controller is configured to perform the mapping of the at least one of the one or more dynamic audio objects and the rendering of the set of static audio objects in individual calculations using respective matrices.

7. The audio decoder of claim 1, wherein the received audio bitstream comprises additional metadata identifying the at least one of the one or more dynamic audio objects.

8. The audio decoder of claim 7, wherein the additional metadata indicates that N of the one or more dynamic audio objects are to be mapped to the set of static audio objects, wherein, responsive to the additional metadata, the controller is configured to map, to the set of static audio objects, N of the one or more dynamic audio objects selected from a predefined location or predefined locations in the received audio bitstream.

9. The audio decoder of claim 8, wherein the one or more dynamic audio objects included in the received audio bitstream comprises more than N dynamic audio objects.

10. The audio decoder of claim 9, wherein the one or more dynamic audio objects included in the received audio bitstream comprises the N dynamic audio objects and K further dynamic audio objects, wherein the controller is configured to render the set of static audio objects and the K further dynamic audio objects to the set of output audio channels.

11. The audio decoder of claim 8, wherein, responsive to the additional metadata, the controller is configured to map, to the set of static audio objects, a first N of the one or more dynamic audio objects in the received audio bitstream, and/or wherein the set of static audio objects consists of M static audio objects, and M>N>0.

12. The audio decoder of claim 1, wherein the received audio bitstream further comprises one or more further static audio objects, and/or wherein the predefined immersive speaker configuration is a 5.0.2 speaker configuration.

13. The audio decoder of claim 2, wherein the set of output audio channels is one of: stereo output channels; 5.1 surround sound output channels, 5.1.2 immersive sound output channels; or 5.1.4 immersive sound output channels.

14. A method in a decoder comprising the steps of: receiving an audio bitstream and storing the received audio bitstream in one or more buffers, selecting a decoding mode from a plurality of different decoding modes for decoding the received audio bitstream into one or more dynamic audio objects that are each to be rendered to a set of output audio channels, one or more static audio objects that are each to be rendered to the set of output audio channels, or a combination thereof, a dynamic audio object comprising a time-varying spatial position indicated by first metadata, and a static audio object comprising a static spatial position indicated by second metadata, the plurality of different decoding modes comprising a first decoding mode and a second decoding mode, wherein of the first and second decoding modes only the first decoding mode allows full decoding of one or more encoded dynamic audio objects in the bitstream into reconstructed individual dynamic audio objects that each comprise a respective time-varying spatial position; operating a controller coupled to the one or more buffers in the selected decoding mode, when the selected decoding mode is the second decoding mode, the method further comprises the steps of: accessing, by the controller, the received audio bit stream; determining, by the controller, whether the received audio bitstream includes one or more dynamic audio objects; and responsive at least to determining that the received audio bitstream includes one or more dynamic audio objects, mapping, by the controller, at least one of the one or more dynamic audio objects to a set of static audio objects without fully decoding the at least one of the one or more dynamic audio ob…
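
Claim 6 offers two ways to go from dynamic objects to output channels: one combined matrix, or a mapping matrix followed by a rendering matrix. The sketch below shows why the two can give the same output, since applying two matrices in sequence equals applying their product once. It is only an illustration under assumptions: the matrix values, the sizes (N = 2 dynamic objects, M = 3 static objects, 2 output channels) and the names `MAPPING`, `DOWNMIX`, `COMBINED`, `matmul`, `render_separately` and `render_combined` are hypothetical and do not come from the patent.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Hypothetical mapping: N = 2 dynamic objects -> M = 3 static objects.
MAPPING = [[1.0, 0.0],
           [0.0, 1.0],
           [0.5, 0.5]]

# Hypothetical downmix coefficients: 3 static objects -> 2 output channels.
DOWNMIX = [[1.0, 0.0, 0.5],
           [0.0, 1.0, 0.5]]

# Single matrix for the combined calculation, computed once up front.
COMBINED = matmul(DOWNMIX, MAPPING)


def render_separately(x):
    """Individual calculations: map to static objects, then downmix."""
    static_objects = matmul(MAPPING, x)
    return matmul(DOWNMIX, static_objects)


def render_combined(x):
    """Combined calculation: apply the precomputed single matrix."""
    return matmul(COMBINED, x)


# Rows are dynamic objects, columns are samples.
signal = [[1.0, 2.0],
          [3.0, 4.0]]
print(render_separately(signal))  # [[2.0, 3.5], [4.0, 5.5]]
print(render_combined(signal))    # [[2.0, 3.5], [4.0, 5.5]]
```

With the combined form, the intermediate static-object signals are never materialized, which saves one matrix multiplication per block of samples; the separate form keeps them available, for example if the static objects need further processing before rendering.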

Assignees

  • Dolby Int Ab

Inventors

Classifications

  • G10L19/008 · Primary

    Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title

  • Vocoders using multiple modes · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11929082B2 cover?
The present disclosure relates to the field of audio coding, and in particular to an audio decoder having at least two decoding modes, and associated decoding methods and decoding software for such an audio decoder. In one of the decoding modes, at least one dynamic audio object is mapped to a set of static audio objects, the set of static audio objects corresponding to a predefined speaker configurat…
Who is the assignee on this patent?
Dolby Int Ab
What technology area does this patent fall under?
Primary CPC classification G10L19/008. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 12, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).