Audio encoder and decoder with program loudness and boundary metadata

US9916838B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9916838-B2
Application number  US-201414434528-A
Country             US
Kind code           B2
Filing date         Jan 15, 2014
Priority date       Jan 21, 2013
Publication date    Mar 13, 2018
Grant date          Mar 13, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Apparatus and methods for generating an encoded audio bitstream, including by including program loudness metadata and audio data in the bitstream, and optionally also program boundary metadata in at least one segment (e.g., frame) of the bitstream. Other aspects are apparatus and methods for decoding such a bitstream, e.g., including by performing adaptive loudness processing of the audio data of an audio program indicated by the bitstream, or authentication and/or validation of metadata and/or audio data of such an audio program. Another aspect is an audio processing unit (e.g., an encoder, decoder, or post-processor) configured (e.g., programmed) to perform any embodiment of the method or which includes a buffer memory which stores at least one frame of an audio bitstream generated in accordance with any embodiment of the method.
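To make the idea of loudness processing driven by program loudness metadata concrete, the following minimal sketch derives a static gain from a measured program loudness and applies it to decoded samples. It is an illustration only, not the method described in the patent: the function names, the use of a single static gain (rather than an adaptive process), the target level, and the assumption that loudness is expressed on a dB-like scale are all hypothetical choices made for this example.

```python
def loudness_gain_db(measured_loudness_db: float, target_loudness_db: float) -> float:
    """Gain in dB that moves the measured program loudness to the target.

    Assumption: both values are on the same dB-like loudness scale.
    """
    return target_loudness_db - measured_loudness_db


def apply_gain(samples: list[float], gain_db: float) -> list[float]:
    """Scale linear PCM samples by a gain given in dB (amplitude ratio)."""
    scale = 10.0 ** (gain_db / 20.0)
    return [s * scale for s in samples]


# Hypothetical values: a program measured at -18 on the loudness scale,
# played back on a device that targets -24.
gain = loudness_gain_db(-18.0, -24.0)
quieter = apply_gain([0.5, -0.25, 0.125], gain)
```

A real decoder or post-processor would read the measured loudness from the program loudness metadata instead of hard-coding it, and an adaptive process would typically vary the gain over time.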

First claim

Opening claim text (preview).

What is claimed is:

1. An audio processing apparatus comprising: a buffer memory for storing at least one frame of an encoded audio bitstream, wherein the encoded audio bitstream includes a sequence of metadata segments and audio data segments and is indicative of audio data, the audio data segments are time-division multiplexed with the metadata segments, the audio data segments contain the audio data, at least one of the metadata segments is indicative of a metadata container, and the metadata container does not contain any of the audio data, wherein the metadata container includes a header, one or more metadata payloads, and protection data; a parser coupled to or integrated with the audio decoder for parsing the encoded audio bitstream, wherein the header includes a syncword identifying the start of the metadata container and a format version parameter after the syncword, wherein the format version parameter indicates a format version of the metadata container, the one or more metadata payloads describe an audio program associated with the audio data, the protection data is located after the one or more metadata payloads, and the protection data is for verifying the integrity of the metadata container and the one or more payloads within the metadata container, wherein the one or more metadata payloads include a program loudness payload that contains data indicative of a measured loudness of an audio program associated with the audio data; an audio decoder coupled to the buffer memory for decoding the audio data; and a post-processing unit coupled to or integrated with the audio decoder and configured to render the decoded audio data for playback by one or more speakers based on the program loudness.

2. The audio processing apparatus of claim 1 wherein the metadata container is stored in an AC-3 or E-AC-3 reserved data space selected from the group consisting of a skip field, an auxdata field, an addbsi field, and a combination thereof.

3. The audio processing apparatus of claim 1 wherein the one or more metadata payloads include metadata indicative of at least one boundary between consecutive audio programs.

4. The audio processing apparatus of claim 1 wherein the one or more metadata payloads include a program loudness payload that contains data indicative of a measured loudness of an audio program.

5. The audio processing apparatus of claim 4 wherein the program loudness payload includes a field that indicates whether an audio channel contains spoken dialogue.

6. The audio processing apparatus of claim 4 wherein the program loudness payload includes a field that indicates a loudness measurement method that has been used to generate loudness data contained in the program loudness payload.

7. The audio processing apparatus of claim 4 wherein the program loudness payload includes a field that indicates whether a loudness of an audio program has been corrected using dialogue gating.

8. The audio processing apparatus of claim 4 wherein the program loudness payload includes a field that indicates whether a loudness of an audio program has been corrected using an infinite look-ahead or file-based loudness correction process.

9. The audio processing apparatus of claim 4 wherein the program loudness payload includes a field that indicates an integrated loudness of an audio program without any gain adjustments attributable to dynamic range compression.

10. The audio processing apparatus of claim 4 wherein the program loudness payload includes a field that indicates an integrated loudness of an audio program without any gain adjustments attributable to dialogue normalization.

11. The audio processing apparatus of claim 4, wherein said decoding generates decoded audio data in response to the audio data, and said audio processing apparatus also includes: a processing subsystem, coupled to the audio decoder, and configured to perform adaptive loudness processing on the decoded audio data using the program loudness payload.

12. The audio processing apparatus according to claim 1 wherein the encoded audio bitstream is an AC-3 bitstream or an E-AC-3 bitstream.

13. The audio processing apparatus according to claim 4, wherein the parser is configured to extract the program loudness payload from the encoded audio bitstream, and wherein the audio processing apparatus also includes: a processing subsystem, coupled to the parser, and configured to authenticate or validate the program loudness payload.

14. The audio processing apparatus according to claim 1 wherein the one or more metadata payloads each include a unique payload identifier, and the unique payload identifier is located at the beginning of each metadata payload.

15. The audio processing apparatus according to claim 1 wherein the syncword is a 16-bit syncword having a value of 0x5838.

16. A method for decoding an encoded audio bitstream, the method comprising: receiving an encoded audio bitstream, the encoded audio bitstream segmented into one or more frames, wherein the encoded audio bitstream includes a sequence of metadata segments and audio data segments and is indicative of audio data, the audio data segments are time-division multiplexed with the metadata segments, the audio data segments contain the audio data, at least one of the metadata segments is indicative of a metadata container, and the metadata container does not contain any of the audio data; extracting the audio data and the metadata container from the encoded audio bitstream, the metadata container including a header followed by one or more metadata payloads followed by protection data, wherein the header includes a syncword and a format version parameter after the syncword, wherein the format version parameter indicates a format version of the metadata container; and verifying the integrity of the metadata container and the one or more metadata payloads through the use of the protection data, wherein the one or more metadata payloads include a program loudness payload that contains data indicative of a measured loudness of an audio program associated with the audio data; decoding the audio data; and rendering the decoded audio data for playback by one or more speakers based on the program loudness.

17. The method of claim 16, wherein the encoded bitstream is an AC-3 bitstream or an E-AC-3 bitstream.

18. The method of claim 16 further comprising: performing adaptive loudness processing on the audio data extracted from the encoded audio bitstream using the program loudness payload.

19. The method of claim 16 wherein the container is located in and extracted from an AC-3 or E-AC-3 reserved data space selected from the group consisting of a skip field, an auxdata field, an addbsi field, and a combination thereof.

20. The method of claim 16 wherein the program loudness payload includes a field that indicates whether an audio channel contains spoken dialogue.
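The container layout recited in the claims (a header with a syncword and a format version, then one or more payloads that each begin with an identifier, then protection data) can be sketched as a toy parser. Only the 16-bit syncword value 0x5838 and the ordering of the parts come from the claim text. Everything else here is a hypothetical assumption made so the example runs: the one-byte version, the payload count byte, the one-byte identifier and length fields, and the use of CRC-32 as the protection data. The claims do not specify any of these, and the real bitstream syntax may differ.

```python
import struct
import zlib
from dataclasses import dataclass

SYNCWORD = 0x5838  # 16-bit syncword value recited in claim 15


@dataclass
class Payload:
    payload_id: int  # identifier placed at the start of each payload
    body: bytes


@dataclass
class Container:
    version: int
    payloads: list


def build_container(version: int, payloads: list) -> bytes:
    """Serialize using the HYPOTHETICAL layout described in the lead-in."""
    out = struct.pack(">HBB", SYNCWORD, version, len(payloads))
    for payload_id, body in payloads:
        out += struct.pack(">BB", payload_id, len(body)) + body
    # Assumed protection data: CRC-32 over everything that precedes it.
    return out + struct.pack(">I", zlib.crc32(out))


def parse_container(data: bytes) -> Container:
    """Parse and integrity-check a container; raise ValueError on any problem."""
    if len(data) < 8:
        raise ValueError("container too short")
    sync, version, count = struct.unpack_from(">HBB", data, 0)
    if sync != SYNCWORD:
        raise ValueError("syncword not found")
    pos = 4
    payloads = []
    for _ in range(count):
        if pos + 2 > len(data):
            raise ValueError("truncated payload header")
        payload_id, length = struct.unpack_from(">BB", data, pos)
        pos += 2
        body = data[pos:pos + length]
        if len(body) != length:
            raise ValueError("truncated payload body")
        payloads.append(Payload(payload_id, body))
        pos += length
    if pos + 4 > len(data):
        raise ValueError("missing protection data")
    (stored_crc,) = struct.unpack_from(">I", data, pos)
    if zlib.crc32(data[:pos]) != stored_crc:
        raise ValueError("protection data mismatch")
    return Container(version, payloads)


# Usage: round-trip one payload with a made-up identifier and body.
blob = build_container(1, [(1, b"\xe8")])
container = parse_container(blob)
```

Checking the protection data before acting on any payload mirrors the order in claim 16, where integrity is verified before the decoded audio is rendered using the program loudness.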

Assignees

Dolby Laboratories Licensing Corp

Inventors

Classifications

  • G10L19/167 (Primary)

    Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes · CPC title

  • Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title

  • Vocoder architecture · CPC title

  • G10L19/002 (Primary)

    Dynamic bit allocation (for perceptual audio coders G10L19/032) · CPC title

  • Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9916838B2 cover?
Apparatus and methods for generating an encoded audio bitstream, including by including program loudness metadata and audio data in the bitstream, and optionally also program boundary metadata in at least one segment (e.g., frame) of the bitstream. Other aspects are apparatus and methods for decoding such a bitstream, e.g., including by performing adaptive loudness processing of the audio data …
Who is the assignee on this patent?
Dolby Laboratories Licensing Corp
What technology area does this patent fall under?
Primary CPC classification G10L19/167. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 13, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).