Method and apparatus for generating from a multi-channel 2D audio input signal a 3D sound representation signal

US10341802B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10341802-B2
Application number: US-201615768695-A
Country: US
Kind code: B2
Filing date: Nov 11, 2016
Priority date: Nov 13, 2015
Publication date: Jul 2, 2019
Grant date: Jul 2, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Currently there is no simple and satisfying way to create 3D audio from existing 2D content. The conversion from 2D to 3D sound should spatially redistribute the sound from existing channels. From a multi-channel 2D audio input signal (x(k)(t)) a 3D sound representation is generated which includes an HOA representation Formula (I) and channel object signals Formula (II) scaled from channels of the 2D audio input signal. Additional signals Formula (III) placed in the 3D space are generated by scaling (21, 222; 41, 422; Formula (IV)) channels from the 2D audio input signal and by decorrelating (24, 25; 44, 45, 451; Formula (V)) a scaled version of a mix of channels from the 2D audio input signal, whereby spatial positions for the additional signals are predetermined. The additional signals Formula (III) are converted (27; 47) to a HOA representation Formula (I).
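
The signal flow summarized in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the HOA representation is limited to first order (ACN channel order W, Y, Z, X with SN3D normalization), the decorrelator is replaced by a plain sample delay as a crude stand-in, the mix is a sum of all input channels, and every name (`convert_2d_to_3d`, `encode_first_order`, `delay_decorrelate`, the gain and position parameters) is hypothetical.

```python
import math


def encode_first_order(signal, azimuth, elevation):
    """Encode a mono signal at a fixed direction into first-order ambisonics.

    Assumed convention: ACN order (W, Y, Z, X), SN3D normalization,
    angles in radians. The patent does not fix these details here.
    """
    gains = (
        1.0,
        math.sin(azimuth) * math.cos(elevation),
        math.sin(elevation),
        math.cos(azimuth) * math.cos(elevation),
    )
    return [[g * s for s in signal] for g in gains]


def delay_decorrelate(signal, delay):
    """Stand-in decorrelator (assumption): delay the signal by `delay` samples."""
    d = max(0, min(delay, len(signal)))
    return [0.0] * d + signal[: len(signal) - d]


def convert_2d_to_3d(channels, object_gains, extra_gains, extra_positions,
                     mix_gain, decorr_specs):
    """Return (hoa, channel_objects) for a dict of equally long 2D channels.

    object_gains:    channel name -> gain, for channels kept as channel objects
    extra_gains:     channel name -> gain, for non-selected channels
    extra_positions: channel name -> (azimuth, elevation), predetermined
    mix_gain:        gain applied to the channel mix before decorrelation
    decorr_specs:    list of (azimuth, elevation, delay) for decorrelated copies
    """
    n = len(next(iter(channels.values())))

    # Channel objects: one selected input channel each, scaled by a gain.
    objects = {name: [g * s for s in channels[name]]
               for name, g in object_gains.items()}

    # Additional signals, each paired with its predetermined position.
    additional = []
    for name, g in extra_gains.items():
        az, el = extra_positions[name]
        additional.append(([g * s for s in channels[name]], az, el))

    # Scale the mix first, then decorrelate it once per target position.
    mix = [mix_gain * sum(ch[t] for ch in channels.values()) for t in range(n)]
    for az, el, delay in decorr_specs:
        additional.append((delay_decorrelate(mix, delay), az, el))

    # Convert all additional signals to HOA and sum them.
    hoa = [[0.0] * n for _ in range(4)]
    for sig, az, el in additional:
        for acc, enc in zip(hoa, encode_first_order(sig, az, el)):
            for t in range(n):
                acc[t] += enc[t]
    return hoa, objects
```

In this sketch the center channel could be kept as a channel object while left and right are scaled and encoded at plus and minus 30 degrees azimuth, with a delayed copy of the mix placed overhead. A real system would use a higher HOA order and proper decorrelation filters.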

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for generating from a multi-channel 2D audio input signal a 3D sound representation which includes a Higher Order Ambisonics (HOA) representation and channel object signals, wherein said 3D sound representation is suited for a presentation with loudspeakers after rendering said HOA representation and combination with said channel object signals, said method including: generating each of said channel object signals by selecting and scaling one channel signal of said multi-channel 2D audio input signal; generating additional signals in a 3D space by scaling non-selected channels from said multi-channel 2D audio input signal or by decorrelating a scaled version of a mix of channels from said multi-channel 2D audio input signal, wherein spatial positions for the additional signals are predetermined; converting the additional signals to said HOA representation using the spatial positions corresponding to the additional signals.

2. The method according to claim 1, wherein said spatial positions can vary over time and a number corresponding to the spatial positions can vary over time.

3. The method according to claim 1, wherein said scaling is carried out by applying time-varying gain factors.

4. The method according to claim 1, wherein said scaling is adjusted such that said 3D sound representation can be rendered with a loudness of said multi-channel 2D audio input signal.

5. The method according to claim 3, wherein said gain factors are applied before said decorrelating.

6. The method according to claim 1, wherein the multi-channel 2D audio input signal is replaced by multiple multi-channel 2D audio input signals, each representing one complementary component of a mixed multi-channel 2D audio input signal, and wherein each multi-channel 2D audio input signal is converted to an individual 3D sound representation signal using individual conversion parameters, and wherein the 3D sound representations are superposed to a final mixed 3D sound representation.

7. The method according to claim 1, wherein multiple decorrelated signals are generated from one channel signal, or a mix of channel signals, of the multi-channel 2D audio input signal based on frequency domain processing, for example by fast convolution using at least one of an FFT and a filter bank, and wherein a frequency analysis of a common input signal is carried out only once and said frequency domain processing and frequency synthesis is applied for each output channel separately.

8. The method of claim 1, wherein the additional signals are generated by scaling non-selected channels from said multi-channel 2D audio input signal or by de-correlating the scaled version of the mix of channels from said multi-channel 2D audio input signal.

9. An apparatus for generating from a multi-channel 2D audio input signal a 3D sound representation which includes a Higher Order Ambisonics (HOA) representation and channel object signals, wherein said 3D sound representation is suited for a presentation with loudspeakers after rendering said HOA representation and combination with said channel object signals, said apparatus comprising: a processor configured to generate each of said channel object signals by selecting and scaling one channel signal of said multi-channel 2D audio input signal; wherein the processor is further configured to generate additional signals for placing them in a 3D space by scaling non-selected channels from said multi-channel 2D audio input signal or by decorrelating a scaled version of a mix of channels from said multi-channel 2D audio input signal, wherein spatial positions for said additional signals are predetermined; wherein the processor is further configured to convert said additional signals to said HOA representation using corresponding spatial positions.

10. The apparatus of claim 9, the processor is further configured to generate the additional signals by scaling non-selected channels from said multi-channel 2D audio input signal or by de-correlating the scaled version of the mix of channels from said multi-channel 2D audio input signal.

11. The apparatus of claim 9, wherein the processor is further configured to generate additional signals for placing them in the 3D space by scaling remaining non-selected channels from said multi-channel 2D audio input signal or by de-correlating the scaled version of the mix of channels from said multi-channel 2D audio input signal, wherein spatial positions for said additional signals are predetermined.

12. The apparatus according to claim 10, wherein said spatial positions can vary over time and a number corresponding to the spatial positions can vary over time.

13. The apparatus according to claim 10, wherein said scaling is carried out by applying time-varying gain factors.

14. The apparatus according to claim 9, wherein the scaling is adjusted such that said 3D sound representation can be rendered with a loudness of said multi-channel 2D audio input signal.

15. The apparatus according to claim 9, wherein said gain factors are applied before said decorrelating.

16. The apparatus according to claim 9, wherein the multi-channel 2D audio input signal is replaced by multiple multi-channel 2D audio input signals, each representing one complementary component of a mixed multi-channel 2D audio input signal, and wherein each multi-channel 2D audio input signal is converted to an individual 3D sound representation signal using individual conversion parameters, and wherein the 3D sound representations are superposed to a final mixed 3D sound representation.

17. The apparatus according to claim 9, wherein multiple decorrelated signals are generated from one channel signal, or a mix of channel signals, of the multi-channel 2D audio input signal based on frequency domain processing, for example by fast convolution using at least an FFT and a filter bank, and a frequency analysis of a common input signal is carried out only once and said frequency domain processing and frequency synthesis is applied for each output channel separately.

18. A non-transitory computer-readable storage medium storing instructions which, when executed by a processor, perform the method according to claim 1.
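
Claims 7 and 17 describe analyzing a common input once in the frequency domain and then running the per-output processing and synthesis separately. The sketch below illustrates that idea under several assumptions: a naive DFT over a single block stands in for the FFT or filter bank, each output uses a hypothetical unit-magnitude random-phase filter as its decorrelator, and the multiplication is a circular convolution rather than the overlap-add or overlap-save scheme a real fast convolution would need. All function names are illustrative.

```python
import cmath
import math
import random


def dft(x):
    """Naive forward DFT (stand-in for an FFT or filter-bank analysis)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def allpass_response(n, seed):
    """Unit-magnitude response with random phase (hypothetical decorrelator).

    Bins k and n - k are complex conjugates so the filtered signal stays real.
    """
    rng = random.Random(seed)
    h = [1.0 + 0j] * n
    for k in range(1, (n - 1) // 2 + 1):
        h[k] = cmath.exp(1j * rng.uniform(-math.pi, math.pi))
        h[n - k] = h[k].conjugate()
    return h


def decorrelate_many(signal, seeds):
    """Produce one decorrelated output per seed from a common input."""
    spectrum = dft(signal)  # frequency analysis: carried out only once
    outputs = []
    for seed in seeds:
        h = allpass_response(len(signal), seed)
        # Frequency-domain processing and synthesis: once per output channel.
        outputs.append(idft_real([s * hk for s, hk in zip(spectrum, h)]))
    return outputs
```

Because every filter has unit magnitude, each output keeps the energy of the input block while its phase differs from the others. Sharing the analysis step saves one forward transform per additional output.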

Assignees

Dolby Laboratories Licensing Corp

Inventors

Classifications

  • Application of ambisonics in stereophonic audio systems · CPC title

  • Positioning of individual sound objects, e.g. moving airplane, within a sound field (H04S2420/13 takes precedence) · CPC title

  • H04S7/303 · Primary

    Tracking of listener position or orientation · CPC title

  • in which the audio signals are in digital form, i.e. employing more than two discrete digital channels (data reduction aspects thereof based on psychoacoustics G10L19/02) · CPC title

  • Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10341802B2 cover?
Currently there is no simple and satisfying way to create 3D audio from existing 2D content. The conversion from 2D to 3D sound should spatially redistribute the sound from existing channels. From a multi-channel 2D audio input signal (x(k)(t)) a 3D sound representation is generated which includes an HOA representation Formula (I) and channel object signals Formula (II) scaled from channels of …
Who is the assignee on this patent?
Dolby Laboratories Licensing Corp
What technology area does this patent fall under?
Primary CPC classification H04S7/303. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jul 2, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).