Systems and methods for spatially enhanced audio communications
US-2024334148-A1 · Oct 3, 2024 · US
US9681250B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9681250-B2 |
| Application number | US-201414120522-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 27, 2014 |
| Priority date | May 24, 2013 |
| Publication date | Jun 13, 2017 |
| Grant date | Jun 13, 2017 |
Official abstract text for this publication.
A system for generating and outputting three-dimensional audio data using head-related transfer functions (HRTFs) includes a processor configured to perform operations comprising: using a collection of previously measured HRTFs for audio signals corresponding to multiple directions for at least one subject; performing non-parametric Gaussian process hyper-parameter training on the collection of previously measured HRTFs to generate one or more predicted HRTFs that are different from the previously measured HRTFs; and generating and outputting three-dimensional audio data based on at least the one or more predicted HRTFs.
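The core idea in the abstract, training Gaussian process hyper-parameters on measured HRTFs and then predicting HRTFs at unmeasured directions, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the synthetic one-dimensional "magnitude versus azimuth" data, the squared-exponential kernel, the fixed noise variance, and the grid search over the log marginal likelihood (a real system would more likely use gradient-based optimization over a richer covariance on full HRTF spectra).

```python
import numpy as np

def rbf_kernel(a, b, length_scale, signal_var):
    """Squared-exponential covariance between two sets of azimuths (radians)."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def log_marginal_likelihood(x, y, length_scale, signal_var, noise_var):
    """Log evidence of the training data under a zero-mean GP prior."""
    K = rbf_kernel(x, x, length_scale, signal_var) + noise_var * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * len(x) * np.log(2.0 * np.pi))

def train_hyperparameters(x, y, noise_var=1e-2):
    """Hypothetical trainer: grid search for the (length_scale, signal_var)
    pair that maximizes the log marginal likelihood."""
    candidates = ((log_marginal_likelihood(x, y, ls, sv, noise_var), ls, sv)
                  for ls in np.logspace(-1, 1, 25)
                  for sv in np.logspace(-1, 1, 9))
    _, best_ls, best_sv = max(candidates)
    return best_ls, best_sv

def predict(x_train, y_train, x_test, length_scale, signal_var, noise_var=1e-2):
    """Posterior mean and variance of the latent function at x_test."""
    K = rbf_kernel(x_train, x_train, length_scale, signal_var)
    K += noise_var * np.eye(len(x_train))
    Ks = rbf_kernel(x_test, x_train, length_scale, signal_var)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, Ks.T)
    mean = Ks @ alpha
    var = np.maximum(signal_var - np.sum(v ** 2, axis=0), 0.0)
    return mean, var

def true_response(az):
    """Synthetic stand-in for an HRTF magnitude at one frequency (not real data)."""
    return np.sin(az) + 0.5 * np.cos(2.0 * az)

rng = np.random.default_rng(0)
az_train = np.linspace(-np.pi, np.pi, 24, endpoint=False)   # "measured" directions
mag_train = true_response(az_train) + 0.05 * rng.standard_normal(az_train.size)

az_test = np.array([0.1, 1.0, -2.0])                        # unmeasured directions
ls, sv = train_hyperparameters(az_train, mag_train)
mean, var = predict(az_train, mag_train, az_test, ls, sv)
print("trained length scale:", ls, "signal variance:", sv)
print("predicted:", mean, "truth:", true_response(az_test))
```

The posterior variance returned by `predict` is what a confidence interval on a predicted HRTF would be built from; the model is non-parametric in the sense that predictions are expressed through the training data and the covariance function rather than a fixed set of basis coefficients.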
Opening claim text (preview).
The invention claimed is:
1. A system for generating and outputting three-dimensional audio data using head-related transfer functions (HRTFs), the system comprising: a tangible, non-transitory memory communicating with a processor, the tangible, non-transitory memory having instructions stored thereon that, in response to execution by the processor, cause the processor to perform operations comprising: using a collection of previously measured HRTFs for audio signals corresponding to multiple directions for at least one subject; performing non-parametric Gaussian process hyper-parameter training on the collection of previously measured HRTFs to generate one or more predicted HRTFs that are different from the previously measured HRTFs; and generating and outputting three-dimensional audio data based on at least the one or more predicted HRTFs.
2. The system according to claim 1, wherein the operation of performing Gaussian process hyper-parameter training on the collection of HRTFs further comprises causing the processor to perform operations that include: applying sparse Gaussian process regression to perform the Gaussian process hyper-parameter training on the collection of HRTFs.
3. The system of claim 2, wherein the one or more predicted HRTFs are HRTFs for test directions not part of an original set of said multiple directions, and the method further comprises causing the processor to calculate a confidence interval for the one or more predicted HRTFs.
4. The system of claim 3, further comprising causing the processor to perform an operation that includes: extracting extrema data from the one or more predicted HRTFs.
5. The system according to claim 1, further comprising causing the processor to perform an operation that includes: accessing the collection of HRTFs to provide a data base of HRTF for autoencoder (AE) neural network (NN) learning; and learning an AE NN based on the collection of HRTFs accessed; and generating low-dimensional bottleneck AE features.
6. The system of claim 5, further comprising causing the processor to perform an operation that includes: generating target directions; computing sound-source localization errors reflecting an argument; and accounting for the sound-source localization errors in a global minimization of the argument of the sound-source localization errors (SSLE).
7. The system of claim 6, further comprising causing the processor to perform an operation that includes: decoding the argument of the sound-source localization errors to the one or more predicted HRTFs.
8. The system of claim 7, further comprising causing the processor to perform an operation that includes: performing a listening test utilizing the one or more predicted HRTFs; reporting a localized direction as feedback input; recomputing the SSLE; and re-performing the global minimization of the argument of the SSLE.
9. The system of claim 8, further comprising causing the processor to perform an operation that includes: generating a Gaussian process listener inference based upon the steps of decoding of the argument of the SSLE to the one or more predicted HRTFs, performing the listening test utilizing the one or more predicted HRTFs, and reporting the localized direction as feedback input.
10. The system of claim 1, wherein the method further comprises causing the processor to perform operations that include: receiving HRTF measurements from different sources, and creating the one or more predicted HRTFs based on said HRTF measurement from different sources.
11. The system of claim 10, further comprising causing the processor to perform an operation that includes: accessing a database HRTFs for the same individual in multiple directions; and accessing a database of HRTF test directions.
12. The system of claim 11, further comprising causing the processor to perform an operation that includes: based on the accessing steps, implementing Gaussian process inference.
13. The system of claim 12, further comprising causing the processor to perform an operation that includes: calculating confidence intervals for the one or more predicted HRTFs.
14. A method for generating and outputting three-dimensional audio data using head-related transfer functions (HRTF), the method comprising: collecting audio signals in a transform domain for at least one subject; applying head related transfer functions in multiple directions to the collected audio signals; performing non-parametric Gaussian hyper-parameter training on the collection of HRTFs to generate one or more predicted HRTFs; and generating and outputting three dimensional audio data based at least on the one or more predicted HRTFs.
15. The method according to claim 14, further comprising causing the processor to perform an operation that includes: identifying an individual associated with the one or more predicted HRTFs.
16. The method according to claim 15, wherein the step of performing Gaussian hyper-parameter training on the collection of HRTFs further comprises applying sparse Gaussian process regression to perform the Gaussian hyper-parameter training on the collection of HRTFs.
17. The method according to claim 16, further comprising: applying HRTF test directions; and inferring Gaussian progression virtual listener measurements.
18. The method according to claim 17, further comprising: calculating a confidence interval for the one or more predicted HRTFs.
19. The method according to claim 18, further comprising: extracting extrema data from the predicted HRTFs.
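Several dependent claims recite calculating a confidence interval for predicted HRTFs and extracting extrema data from them. As a rough sketch of what those two steps could look like numerically, the snippet below builds a Gaussian interval from a predictive mean and variance and locates interior local maxima and minima (spectral peaks and notches) in a sampled response. The function names, the 1.96 multiplier for an approximately 95% interval, and the toy arrays are illustrative assumptions, not details from the patent.

```python
import numpy as np

def confidence_interval(mean, variance, z=1.96):
    """Gaussian interval mean +/- z * standard deviation (z=1.96 is about 95%)."""
    sd = np.sqrt(np.maximum(variance, 0.0))
    return mean - z * sd, mean + z * sd

def local_extrema(values):
    """Indices of interior local maxima and minima of a sampled curve.

    A sample is an extremum when the sign of the first difference changes
    across it. Flat plateaus are not merged, so each plateau edge may be
    reported separately.
    """
    slope_sign = np.sign(np.diff(values))
    turn = np.diff(slope_sign)
    maxima = np.where(turn < 0)[0] + 1
    minima = np.where(turn > 0)[0] + 1
    return maxima, minima

# Toy predictive mean and variance at two test directions (made-up numbers).
pred_mean = np.array([1.0, 2.0])
pred_var = np.array([0.25, 0.0])
lower, upper = confidence_interval(pred_mean, pred_var)
print(lower, upper)

# Toy magnitude response over six frequency bins (made-up numbers).
response = np.array([0.0, 2.0, 1.0, 3.0, -1.0, 0.0])
peaks, notches = local_extrema(response)
print(peaks, notches)   # peaks at bins 1 and 3, notches at bins 2 and 4
```

In HRTF work the notch and peak positions are often the features of interest, since they vary with direction and listener; this sketch only finds them and makes no claim about how the patent uses them.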
Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation · CPC title
Tracking of listener position or orientation · CPC title
Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD] · CPC title
Aspects of sound capture and related signal processing for recording or reproduction · CPC title
For headphones · CPC title