Apparatus for post-processing an audio signal using a transient location detection
US-2020020349-A1 · Jan 16, 2020 · US
US12300263B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12300263-B2 |
| Application number | US-202418982152-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 16, 2024 |
| Priority date | Apr 25, 2018 |
| Publication date | May 13, 2025 |
| Grant date | May 13, 2025 |
Official abstract text for this publication.
A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
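The flag-driven choice of regeneration technique and the size of the stated delay can be illustrated with a minimal sketch. The names `PatchingMode`, `select_patching_mode`, `translation_value`, and `delay_ms` are hypothetical and do not come from the patent. The abstract does not say which numeric flag value selects which technique, so the mapping below is an assumption exposed as a parameter.

```python
from enum import Enum

# Post-processing delay per audio channel, as stated in the abstract.
HFR_DELAY_SAMPLES = 3010


class PatchingMode(Enum):
    SPECTRAL_TRANSLATION = "spectral translation"
    HARMONIC_TRANSPOSITION = "harmonic transposition"


def select_patching_mode(flag: int, translation_value: int = 1) -> PatchingMode:
    """Map the extracted flag to a regeneration technique.

    ASSUMPTION: `translation_value` is a hypothetical parameter; the abstract
    does not state which flag value selects which technique.
    """
    if flag == translation_value:
        return PatchingMode.SPECTRAL_TRANSLATION
    return PatchingMode.HARMONIC_TRANSPOSITION


def delay_ms(sample_rate_hz: int, delay_samples: int = HFR_DELAY_SAMPLES) -> float:
    """Convert the per-channel delay from samples to milliseconds."""
    return 1000.0 * delay_samples / sample_rate_hz
```

For example, at a 48 kHz sampling rate a delay of 3010 samples corresponds to roughly 62.7 ms.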
Opening claim text (preview).
The invention claimed is:

1. A method for performing high frequency reconstruction of an audio signal, the method comprising:
receiving an encoded audio bitstream, the encoded audio bitstream including audio data representing a lowband portion of the audio signal and high frequency reconstruction metadata, wherein the encoded audio bitstream further includes a fill element with an identifier indicating a start of the fill element and fill data after the identifier, wherein the fill data includes the backward-compatible extension container, and wherein the identifier is a three bit unsigned integer transmitted most significant bit first and having a value of 0x6, wherein the fill data includes an extension payload, the extension payload includes spectral band replication extension data, and the extension payload is identified with a four bit unsigned integer transmitted most significant bit first and having a value of ‘1101’ or ‘1110’;
decoding the audio data to generate a decoded lowband audio signal;
extracting from the encoded audio bitstream the high frequency reconstruction metadata, the high frequency reconstruction metadata including operating parameters for a high frequency reconstruction process, the operating parameters including a patching mode parameter located in a backward-compatible extension container of the encoded audio bitstream, wherein a first value of the patching mode parameter indicates spectral translation and a second value of the patching mode parameter indicates harmonic transposition by phase-vocoder frequency spreading;
filtering the decoded lowband audio signal to generate a filtered lowband audio signal;
regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata, wherein the regenerating includes spectral translation if the patching mode parameter is the first value and the regenerating includes harmonic transposition by phase-vocoder frequency spreading if the patching mode parameter is the second value, and wherein, when the patching mode parameter equals the first value, the backward-compatible extension container further includes a flag indicating whether additional processing is used to avoid discontinuities in a shape of a spectral envelope of the highband portion, and the regenerating includes performing the additional preprocessing in response to a first value of the flag; and
combining the filtered lowband audio signal with the regenerated highband portion to form a wideband audio signal, wherein the filtering, regenerating, and combining are performed as a post-processing operation with a delay of 3010 samples per audio channel, so that a composition time applies to a 3011-th audio sample within an audio composition unit.

2. The method of claim 1 wherein the harmonic transposition by phase-vocoder frequency spreading is performed with an estimated complexity at or below 4.5 million of operations per second and at or below 3 kWords of memory.

3. A non-transitory computer-readable medium having instructions which, when executed by a computing device or system, cause said computing device or system to execute the method of claim 1.

4. An audio processing unit for performing high frequency reconstruction of an audio signal, the audio processing unit comprising:
an input interface for receiving an encoded audio bitstream, the encoded audio bitstream including audio data representing a lowband portion of the audio signal and high frequency reconstruction metadata, wherein the encoded audio bitstream further includes a fill element with an identifier indicating a start of the fill element and fill data after the identifier, wherein the fill data includes the backward-compatible extension container, and wherein the identifier is a three bit unsigned integer transmitted most significant bit first and having a value of 0x6, wherein the fill data includes an extension payload, the extension payload includes spectral band replication extension data, and the extension payload is identified with a four bit unsigned integer transmitted most significant bit first and having a value of ‘1101’ or ‘1110’;
a core audio decoder for decoding the audio data to generate a decoded lowband audio signal;
a deformatter for extracting from the encoded audio bitstream the high frequency reconstruction metadata, the high frequency reconstruction metadata including operating parameters for a high frequency reconstruction process, the operating parameters including a patching mode parameter located in a backward-compatible extension container of the encoded audio bitstream, wherein a first value of the patching mode parameter indicates spectral translation and a second value of the patching mode parameter indicates harmonic transposition by phase-vocoder frequency spreading;
an analysis filterbank for filtering the decoded lowband audio signal to generate a filtered lowband audio signal;
a high frequency regenerator for reconstructing a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata, wherein the reconstructing includes a spectral translation if the patching mode parameter is the first value and the reconstructing includes harmonic transposition by phase-vocoder frequency spreading if the patching mode parameter is the second value, and wherein, when the patching mode parameter equals the first value, the backward-compatible extension container further includes a flag indicating whether additional processing is used to avoid discontinuities in a shape of a spectral envelope of the highband portion, and the regenerating includes performing the additional preprocessing in response to a first value of the flag; and
a synthesis filterbank for combining the filtered lowband audio signal with the regenerated highband portion to form a wideband audio signal, wherein the analysis filterbank, the high frequency regenerator, and the synthesis filterbank are performed in a post-processor with a delay of 3010 samples per audio channel, so that a composition time applies to a 3011-th audio sample within an audio composition unit.

5. The audio processing unit of claim 4 wherein the harmonic transposition by phase-vocoder frequency spreading is performed with an estimated complexity at or below 4.5 million of operations per second and at or below 3 kWords of memory.
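The identifier checks recited in claims 1 and 4 can be sketched as follows: each field is an unsigned integer read most significant bit first, with the fill element identifier equal to 0x6 (three bits) and the extension payload identifier equal to ‘1101’ or ‘1110’ (four bits). This is a minimal illustration, not a decoder. `BitReader` and the helper functions are hypothetical names. The claims do not describe which fields sit between the two identifiers, so the sketch reads each field in isolation rather than assuming a layout.

```python
class BitReader:
    """Reads unsigned integers from a byte buffer, most significant bit first."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0  # current position, in bits

    def read_uint(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            if self._pos >= 8 * len(self._data):
                raise EOFError("read past end of bitstream")
            byte = self._data[self._pos // 8]
            bit = (byte >> (7 - self._pos % 8)) & 1
            value = (value << 1) | bit  # earlier bits are more significant
            self._pos += 1
        return value


FILL_ELEMENT_ID = 0x6                    # three-bit identifier from the claims
SBR_EXTENSION_TYPES = {0b1101, 0b1110}   # four-bit identifiers from the claims
HFR_DELAY_SAMPLES = 3010                 # post-processing delay per channel


def is_fill_element(reader: BitReader) -> bool:
    """Check a three-bit identifier at the reader's current position."""
    return reader.read_uint(3) == FILL_ELEMENT_ID


def is_sbr_extension(reader: BitReader) -> bool:
    """Check a four-bit extension payload identifier at the current position."""
    return reader.read_uint(4) in SBR_EXTENSION_TYPES


def composition_sample_number(delay_samples: int = HFR_DELAY_SAMPLES) -> int:
    """1-based number of the sample the composition time applies to."""
    return delay_samples + 1
```

With a delay of 3010 samples, the first 3010 output samples precede the composition time, which is why the claims refer to the 3011-th sample.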
in which the audio signals are in digital form, i.e. employing more than two discrete digital channels (data reduction aspects thereof based on psychoacoustics G10L19/02) · CPC title
using spectral analysis, e.g. transform vocoders or subband vocoders · CPC title
Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title
Details of processing therefor · CPC title
Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding · CPC title