Method and apparatus for normalized audio playback of media with and without embedded loudness metadata on new media devices
US-2015332685-A1 · Nov 19, 2015 · US
US2016293174A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016293174-A1 |
| Application number | US-201615083717-A |
| Country | US |
| Kind code | A1 |
| Filing date | Mar 29, 2016 |
| Priority date | Apr 5, 2015 |
| Publication date | Oct 6, 2016 |
| Grant date | — |
Official abstract text for this publication.
A device includes a receiver configured to receive an audio frame of an audio stream. The device also includes a decoder configured to generate first decoded speech associated with the audio frame and to determine a count of audio frames classified as being associated with band limited content. The decoder is further configured to output second decoded speech based on the first decoded speech. The second decoded speech may be generated according to an output mode of the decoder. The output mode may be selected based at least in part on the count of audio frames.
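The abstract's central idea, selecting an output mode from a count of frames classified as band limited, can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the sliding window, the `window` and `switch_count` parameters, and the class name `OutputModeSelector` are all assumptions introduced here, since the abstract does not say how the count is maintained or what value triggers a mode change.

```python
from collections import deque

WIDEBAND = "wideband"
NARROWBAND = "narrowband"


class OutputModeSelector:
    """Hypothetical tracker: counts band-limited frames in a sliding window."""

    def __init__(self, window=20, switch_count=15):
        # Assumed parameters: the abstract does not specify a window length
        # or the count at which the decoder changes mode.
        self.history = deque(maxlen=window)
        self.switch_count = switch_count
        self.mode = WIDEBAND

    def update(self, is_band_limited):
        """Record one frame classification and return the selected mode."""
        self.history.append(bool(is_band_limited))
        count = sum(self.history)  # frames classified as band limited
        self.mode = NARROWBAND if count >= self.switch_count else WIDEBAND
        return self.mode


if __name__ == "__main__":
    selector = OutputModeSelector(window=4, switch_count=3)
    for flag in [True, True, True, False, False]:
        print(flag, selector.update(flag))
```

A bounded `deque` keeps only the most recent classifications, so old frames stop influencing the mode once they leave the window. A plain counter would also fit the abstract's wording.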
Opening claim text (preview).
What is claimed is:
1. A device comprising: a receiver configured to receive an audio frame of an audio stream; and a decoder configured to generate first decoded speech associated with the audio frame and to determine a count of audio frames classified as being associated with band limited content, wherein an output mode of the decoder is selected based at least in part on the count of audio frames, the decoder further configured to output second decoded speech based on the first decoded speech, the second decoded speech generated according to the output mode.
2. The device of claim 1, wherein the decoder is configured to classify the audio frame as a narrowband frame or a wideband frame, and wherein a classification of a narrowband frame corresponds to being associated with the band limited content.
3. The device of claim 1, wherein the second decoded speech corresponds to the first decoded speech when the output mode comprises a wideband mode.
4. The device of claim 1, wherein the second decoded speech includes a portion of the first decoded speech when the output mode comprises a narrowband mode.
5. The device of claim 1, wherein the decoder includes a detector configured to select the output mode based on a metric value, a number of consecutive audio frames that are classified as being associated with wideband content, or both.
6. The device of claim 1, wherein the decoder includes: a classifier configured to classify the audio frame as being associated with wideband content or the band limited content; and a tracker configured to maintain a record of one or more classifications generated by the classifier, wherein the tracker includes at least one of a buffer, a memory, or one or more counters.
7. The device of claim 1, wherein the receiver and the decoder are integrated into a mobile communication device or a base station.
8. The device of claim 1, further comprising: a demodulator coupled to the receiver, the demodulator configured to demodulate the audio stream; a processor coupled to the demodulator; and an encoder.
9. The device of claim 8, wherein the receiver, the demodulator, the processor, and the encoder are integrated into a mobile communication device.
10. The device of claim 8, wherein the receiver, the demodulator, the processor, and the encoder are integrated into a base station.
11. A method of operating a decoder, the method comprising: generating, at a decoder, first decoded speech associated with an audio frame of an audio stream; determining an output mode of the decoder based at least in part on a number of audio frames classified as being associated with band limited content; and outputting second decoded speech based on the first decoded speech, the second decoded speech generated according to the output mode.
12. The method of claim 11, wherein the first decoded speech includes a low band component and a high band component.
13. The method of claim 12, further comprising: determining a ratio value that is based on a first energy metric associated with the low band component and a second energy metric associated with the high band component; comparing the ratio value to a classification threshold; and classifying the audio frame as being associated with the band limited content in response to the ratio value being greater than the classification threshold.
14. The method of claim 13, further comprising, when the audio frame is associated with the band limited content, attenuating the high band component of the first decoded speech to generate the second decoded speech.
15. The method of claim 13, further comprising, when the audio frame is associated with the band limited content, setting an energy value of one or more bands associated with the high band component to zero to generate the second decoded speech.
16. The method of claim 11, further comprising determining a first energy metric associated with a first set of multiple frequency bands associated with a low band component of the first decoded speech.
17. The method of claim 16, wherein determining the first energy metric comprises determining an average energy value of a subset of bands of the first set of multiple frequency bands and setting the first energy metric equal to the average energy value.
18. The method of claim 16, further comprising determining a second energy metric associated with a second set of multiple frequency bands associated with a high band component of the first decoded speech.
19. The method of claim 18, further comprising: determining a particular frequency band of the second set of multiple frequency bands having a highest detected energy value of the second set of multiple frequency bands; and setting the second energy metric equal to the highest detected energy value.
20. The method of claim 18, wherein the first set and the second set are mutually exclusive, and wherein each band of the second set of multiple frequency bands has the same bandwidth.
21. The method of claim 20, wherein the first set and the second set are separated by a transition band of a frequency range associated with the audio frame.
22. The method of claim 11, wherein, when the output mode comprises a wideband mode, the second decoded speech is substantially the same as the first decoded speech.
23. The method of claim 11, further comprising, when the output mode comprises a narrowband mode, maintaining a low band component of the first decoded speech and attenuating a high band component of the first decoded speech to generate the second decoded speech.
24. The method of claim 11, further comprising, when the output mode comprises a narrowband mode, attenuating one or more energy values of frequency bands associated with a high band component of the first decoded speech to generate the second decoded speech.
25. The method of claim 11, further comprising determining whether the audio frame is an active frame, wherein determining the output mode of the decoder is performed in response to determining that the audio frame is the active frame.
26. The method of claim 11, further comprising: receiving a second audio frame of the audio stream at the decoder; determining whether the second audio frame is an inactive frame; and maintaining the output mode of the decoder in response to determining that the second audio frame is the inactive frame.
27. The method of claim 11, further comprising: receiving multiple audio frames of the audio stream at the decoder, the multiple audio frames including the audio frame and a second audio frame; determining, at the decoder, a metric value corresponding to a relative count of audio frames of the multiple audio frames that are associated with the band limited content in response to receiving the second audio frame; selecting a threshold based on a first mode of the output mode of the decoder, the first mode associated with the audio frame received prior to the second audio frame; and updating the output mode from the first mode to a second mode based on a comparison of the metric value to the threshold, the second mode associated with the second audio frame.
28. The method of claim 27, wherein the metric value is determined as a percentage of the multiple audio frames that are classified as being ass
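
The classification and mode-update steps recited in claims 13, 15, 17, 19, and 27 can be illustrated with a short sketch. It is only one possible reading under stated assumptions: per-band energies are taken as plain lists of floats, the ratio is computed as the low band metric divided by the high band metric, and every numeric default (`threshold`, `to_narrow`, `to_wide`, `eps`) and every function name is hypothetical and not taken from the publication.

```python
def low_band_metric(low_energies, subset=None):
    """Claim 17 style: average energy over a subset of the low bands.

    `subset` is an assumed list of band indices; None means use all bands.
    """
    if subset is None:
        chosen = list(low_energies)
    else:
        chosen = [low_energies[i] for i in subset]
    return sum(chosen) / len(chosen)


def high_band_metric(high_energies):
    """Claim 19 style: the highest detected energy among the high bands."""
    return max(high_energies)


def is_band_limited(low_energies, high_energies, threshold=512.0, eps=1e-12):
    """Claim 13 style: band limited when the ratio exceeds a threshold.

    Assumption: ratio = low metric / high metric; eps avoids division by zero.
    """
    ratio = low_band_metric(low_energies) / (high_band_metric(high_energies) + eps)
    return ratio > threshold


def zero_high_band(high_energies):
    """Claim 15 style: set the energy of the high bands to zero."""
    return [0.0 for _ in high_energies]


def update_mode(current_mode, fraction_band_limited, to_narrow=0.9, to_wide=0.1):
    """Claim 27 style: the threshold depends on the current output mode.

    Assumed values: leave wideband at >= 90% band-limited frames,
    leave narrowband at <= 10%.
    """
    if current_mode == "wideband":
        return "narrowband" if fraction_band_limited >= to_narrow else "wideband"
    return "wideband" if fraction_band_limited <= to_wide else "narrowband"


if __name__ == "__main__":
    low, high = [4.0, 4.0], [0.001, 0.002]
    if is_band_limited(low, high, threshold=100.0):
        high = zero_high_band(high)
    print(high, update_mode("wideband", 0.95))
```

Choosing the threshold from the current mode gives hysteresis: frames hovering near a single cutoff would otherwise make the decoder flip between modes from one frame to the next.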
Pre-filtering or post-filtering · CPC title
by changing the amplitude · CPC title
Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes · CPC title
using spectral analysis, e.g. transform vocoders or subband vocoders · CPC title
Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients · CPC title