US12020718B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12020718-B2 |
| Application number | US-201917251940-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 2, 2019 |
| Priority date | Jul 2, 2018 |
| Publication date | Jun 25, 2024 |
| Grant date | Jun 25, 2024 |
Official abstract text for this publication.
The present document describes a method (500) for generating a bitstream (101), wherein the bitstream (101) comprises a sequence of superframes (400) for a sequence of frames of an immersive audio signal (111). The method (500) comprises, repeatedly for the sequence of superframes (400), inserting (501) coded audio data (206) for one or more frames of one or more downmix channel signals (203) derived from the immersive audio signal (111), into data fields (411, 421, 412, 422) of a superframe (400); and inserting (502) metadata (202, 205) for reconstructing one or more frames of the immersive audio signal (111) from the coded audio data (206), into a metadata field (403) of the superframe (400).
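The two insertion steps (501, 502) of the abstract can be illustrated with a minimal sketch. The byte layout below is an assumption made purely for illustration and is not taken from the patent: a 2-byte metadata size, a 1-byte data-field count, the metadata field, then each data field prefixed by a 2-byte length. The names `Superframe`, `pack_superframe`, `unpack_superframe` and `generate_bitstream` are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Superframe:
    metadata: bytes           # content of the metadata field (403)
    data_fields: List[bytes]  # coded audio data, one entry per data field


def pack_superframe(sf: Superframe) -> bytes:
    """Serialize one superframe using the hypothetical layout described above."""
    out = struct.pack(">HB", len(sf.metadata), len(sf.data_fields))
    out += sf.metadata                       # step 502: insert metadata
    for field in sf.data_fields:             # step 501: insert coded audio data
        out += struct.pack(">H", len(field)) + field
    return out


def unpack_superframe(buf: bytes) -> Superframe:
    """Inverse of pack_superframe; reads exactly one superframe from buf."""
    meta_len, n_fields = struct.unpack_from(">HB", buf, 0)
    pos = 3
    metadata = buf[pos:pos + meta_len]
    pos += meta_len
    fields = []
    for _ in range(n_fields):
        (field_len,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        fields.append(buf[pos:pos + field_len])
        pos += field_len
    return Superframe(metadata, fields)


def generate_bitstream(superframes: Iterable[Superframe]) -> bytes:
    """Repeat the packing step for every superframe in the sequence."""
    return b"".join(pack_superframe(sf) for sf in superframes)
```

In this sketch every data field carries its own length prefix, so a reader can locate each coded frame independently. The actual field sizes and signalling in the patent are governed by the header and configuration information fields described in the claims, not by this layout.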
Opening claim text (preview).
It is claimed:

1. A method for generating a bitstream; wherein the bitstream comprises a sequence of superframes for a sequence of frames of an ambisonic immersive audio signal; wherein the method comprises, repeatedly for the sequence of superframes, inserting coded audio data for two or more frames of one or more downmix channel signals derived from the ambisonic immersive audio signal, into data fields of a superframe in the sequence of superframes in the bitstream, wherein each frame in the two or more frames in the superframe is coded with a respective data field designated to support frame level re-synchronization operations by a recipient device of the ambisonic immersive audio signal in case of bit errors in the ambisonic immersive audio signal; and inserting metadata for reconstructing one or more frames of the ambisonic immersive audio signal from the coded audio data into a metadata field of the superframe.

2. The method of claim 1, wherein
   a. the method comprises inserting a header field into the superframe; and
   b. the header field is indicative of a size of the metadata field of the superframe.

3. The method of claim 2, wherein
   a. the metadata field has a data size no greater than a maximum size;
   b. the header field is indicative of an adjustment value; and
   c. the size of the metadata field of the superframe corresponds to the maximum size minus the adjustment value.

4. The method of claim 2, wherein
   a. the header field comprises a size indicator for the size of the metadata field; and
   b. the size indicator exhibits a different resolution for different size ranges of the size of the metadata field.

5. The method of claim 4, wherein
   a. the metadata for reconstructing the one or more frames of the ambisonic immersive audio signal exhibits a statistical size distribution of the size of the metadata; and
   b. the resolution of the size indicator is dependent on the size distribution of the metadata.

6. The method of claim 1, wherein
   a. the method comprises inserting a header field into the superframe; and
   b. the header field is indicative of whether or not the superframe comprises a configuration information field, and/or
   c. the header field is indicative of the presence of a configuration information field.

7. The method of claim 1, wherein
   a. the method comprises inserting a configuration information field into the superframe; and
   b. the configuration information field is indicative of a number of downmix channel signals represented by the data fields of the superframe.

8. The method of claim 1, wherein
   a. the method comprises inserting a configuration information field into the superframe; and
   b. the configuration information field is indicative of a maximum size of the metadata field.

9. The method of claim 1, wherein
   a. the method comprises inserting a configuration information field into the superframe; and
   b. the configuration information field is indicative of an order of a soundfield representation signal comprised within the ambisonic immersive audio signal.

10. The method of claim 1, wherein
   a. the method comprises inserting a configuration information field into the superframe; and
   b. the configuration information field is indicative of a frame type and/or a coding mode used for coding each one of the one or more downmix channel signals.

11. The method of claim 1, wherein
   a. the method comprises inserting a header field into the superframe; and
   b. the header field is indicative of whether or not the superframe comprises an extension field for additional information regarding the ambisonic immersive audio signal.

12. The method of claim 1, wherein
   a. the coded audio data of a frame of a downmix channel signal is generated using a multi-mode and/or multi-rate speech or audio codec; and/or
   b. the metadata is generated using a multi-mode and/or multi-rate immersive metadata coding scheme.

13. The method of claim 1, wherein the coded audio data of a frame of a downmix channel signal is encoded using an Enhanced Voice Services encoder.

14. The method of claim 1, wherein the superframe constitutes at least a part of a data element transmitted using a transmission protocol, notably DASH, RTSP or RTP, or stored in a file according to a storage format, notably ISOBMFF.

15. The method of claim 1, wherein
   a. a header field is indicative that no configuration information field is present; and
   b. the method comprises conveying configuration information in a previous superframe of the sequence of superframes or using an out-of-band signaling scheme.

16. The method of claim 1, wherein the one or more downmix channels signals include a first downmix channel signal and a second downmix channel signal; wherein the first downmix channel signal and the second downmix channel signal are derived from the ambisonic immersive audio signal; wherein the first downmix channel signal is encoded using a first encoder, and wherein the second downmix channel signal is encoded using a second encoder; wherein the method comprises providing configuration information regarding the first encoder and the second encoder within the superframe, within a previous superframe of the sequence of superframes or using an out-of-band signaling scheme.

17. The method of claim 1, wherein the method comprises
   a. extracting one or more audio objects from the immersive audio, referred to as IA, signal; wherein an audio object comprises an object signal and object metadata indicating a position of the audio object;
   b. determining a residual signal based on the IA signal and based on the one or more audio objects;
   c. providing a downmix signal based on the IA signal, notably such that a number of downmix channel signals of the downmix signal is smaller than a number of channel signals of the IA signal;
   d. determining joint coding metadata for enabling upmixing of the downmix signal to one or more reconstructed audio object signals corresponding to the one or more audio objects and/or to a reconstructed residual signal corresponding to the residual signal;
   e. performing waveform coding of the downmix signal to provide coded audio data for a sequence of frames of the one or more downmix channel signals; and
   f. performing entropy coding of the joint coding metadata and of the object metadata of the one or more audio objects to provide the metadata to be inserted into the metadata fields of the sequence of superframes.

18. A method for deriving data regarding an ambisonic immersive audio signal from a bitstream; wherein the bitstream comprises a sequence of superframes for a sequence of frames of the ambisonic immersive audio signal; wherein the method comprises, repeatedly for the sequence of superframes,
   a. extracting coded audio data for two or more frames of one or more downmix channel signals derived from the ambisonic immersive audio signal, from data fields of a superframe in the sequence of superframes in the bitstream, wherein each frame in the two or more frames in the superframe is coded with a respective data field designated to support frame level re-synchronization operations by a recipient device of the ambisonic immersive audio signal in case of bit errors in the ambisonic immersive audio signal; and
   b. extracting metadata for reconstructing one or more frames of the immersive ambisonic audio signal from the coded audio data from a metadata field of the superframe.

19. The method of claim 18, further comprising
   a. deriving one or more reconstructed audio objects from the coded audio
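The size signalling of claims 3 and 4 can be sketched as follows. Claim 3 states that the metadata field size equals a maximum size minus an adjustment value, and claim 4 states that the size indicator has a different resolution in different size ranges. The adjustment table `ADJUSTMENTS`, the byte units, and the function names `encode_size` and `decode_size` are assumptions chosen for illustration; the patent excerpt does not give concrete values.

```python
import bisect

# Hypothetical adjustment table (in bytes), sorted ascending.
# Small adjustments are finely spaced and large ones coarsely spaced,
# so the indicator resolution differs across size ranges (claim 4).
ADJUSTMENTS = [0, 1, 2, 3, 4, 8, 16, 32]


def encode_size(metadata_len: int, max_size: int) -> tuple:
    """Return (indicator index, signalled field size) for a metadata payload.

    The largest adjustment that still leaves room for the payload is chosen;
    the payload would then be padded up to the signalled field size.
    """
    if not 0 <= metadata_len <= max_size:
        raise ValueError("metadata does not fit in the maximum field size")
    slack = max_size - metadata_len
    index = bisect.bisect_right(ADJUSTMENTS, slack) - 1
    return index, max_size - ADJUSTMENTS[index]


def decode_size(index: int, max_size: int) -> int:
    """Recover the metadata field size: maximum size minus adjustment (claim 3)."""
    return max_size - ADJUSTMENTS[index]
```

With a maximum size of 40 bytes and a 20-byte payload, this sketch selects the adjustment 16 and signals a 24-byte field, leaving 4 bytes of padding. A table tuned to the observed size distribution of the metadata, as claim 5 describes, would aim to keep such padding small for the most common sizes.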
Application of parametric coding in stereophonic audio systems · CPC title
Application of ambisonics in stereophonic audio systems · CPC title
Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes · CPC title
Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title
Dials; Mounting of dials · CPC title