Statistical modelling, interpolation, measurement and anthropometry based prediction of head-related transfer functions

US9681250B2 · US · B2

Patent metadata
Field · Value
Publication number · US-9681250-B2
Application number · US-201414120522-A
Country · US
Kind code · B2
Filing date · May 27, 2014
Priority date · May 24, 2013
Publication date · Jun 13, 2017
Grant date · Jun 13, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system for generating and outputting three-dimensional audio data using head-related transfer functions (HRTFs) includes a processor configured to perform operations comprising: using a collection of previously measured HRTFs for audio signals corresponding to multiple directions for at least one subject; performing non-parametric Gaussian process hyper-parameter training on the collection of previously measured HRTFs to generate one or more predicted HRTFs that are different from the previously measured HRTFs; and generating and outputting three-dimensional audio data based on at least the one or more predicted HRTFs.
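
The abstract's central step, training Gaussian process hyper-parameters on measured HRTFs and then predicting HRTFs for unmeasured directions, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the data are synthetic (a made-up magnitude in dB at a single frequency as a function of azimuth), the kernel is a squared-exponential, and the hyper-parameters (length scale `ell`, signal variance `sf2`, noise variance `sn2`) are chosen by a plain grid search over the log marginal likelihood rather than by a gradient-based optimizer. It uses exact Gaussian process regression, not the sparse variant named in claim 2.

```python
import math

def kernel(a, b, ell, sf2):
    """Squared-exponential covariance between two azimuths in degrees."""
    return sf2 * math.exp(-0.5 * ((a - b) / ell) ** 2)

def gram(xs, ell, sf2, sn2):
    """Training covariance matrix with noise variance sn2 on the diagonal."""
    return [[kernel(a, b, ell, sf2) + (sn2 if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def log_marginal_likelihood(xs, ys, ell, sf2, sn2):
    """Log evidence of the data under a zero-mean GP with these hyper-parameters."""
    L = cholesky(gram(xs, ell, sf2, sn2))
    alpha = chol_solve(L, ys)
    data_fit = -0.5 * sum(y * a for y, a in zip(ys, alpha))
    complexity = -sum(math.log(L[i][i]) for i in range(len(xs)))
    return data_fit + complexity - 0.5 * len(xs) * math.log(2.0 * math.pi)

def train_hyperparameters(xs, ys, grid):
    """Pick the (ell, sf2, sn2) triple in the grid with the highest log evidence."""
    return max(grid, key=lambda h: log_marginal_likelihood(xs, ys, *h))

def predict(xs, ys, x_star, ell, sf2, sn2):
    """Posterior mean and (noise-free) variance at an unmeasured azimuth x_star."""
    L = cholesky(gram(xs, ell, sf2, sn2))
    k_star = [kernel(x, x_star, ell, sf2) for x in xs]
    mean = sum(k * a for k, a in zip(k_star, chol_solve(L, ys)))
    var = sf2 - sum(k * v for k, v in zip(k_star, chol_solve(L, k_star)))
    return mean, max(var, 0.0)

# Synthetic stand-in for measured HRTF magnitudes (dB); not real measurements.
azimuths = [0.0, 30.0, 60.0, 90.0, 120.0, 150.0, 180.0]
magnitudes_db = [6.0 * math.sin(math.radians(a)) for a in azimuths]

# Hypothetical candidate hyper-parameters for the grid search.
grid = [(ell, sf2, sn2)
        for ell in (15.0, 30.0, 60.0, 120.0)
        for sf2 in (1.0, 10.0, 100.0)
        for sn2 in (0.01, 0.1)]

best = train_hyperparameters(azimuths, magnitudes_db, grid)
mean_45, var_45 = predict(azimuths, magnitudes_db, 45.0, *best)
print("trained (ell, sf2, sn2):", best)
print("predicted magnitude at 45 degrees: %.2f dB (variance %.3f)" % (mean_45, var_45))
```

The predicted value at 45 degrees is for a direction absent from the training set, which corresponds to the abstract's "predicted HRTFs that are different from the previously measured HRTFs". A practical system would model full spectra over azimuth and elevation and would likely rely on a dedicated Gaussian process library.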

First claim

Opening claim text (preview).

The invention claimed is:
1. A system for generating and outputting three-dimensional audio data using head-related transfer functions (HRTFs), the system comprising: a tangible, non-transitory memory communicating with a processor, the tangible, non-transitory memory having instructions stored thereon that, in response to execution by the processor, cause the processor to perform operations comprising: using a collection of previously measured HRTFs for audio signals corresponding to multiple directions for at least one subject; performing non-parametric Gaussian process hyper-parameter training on the collection of previously measured HRTFs to generate one or more predicted HRTFs that are different from the previously measured HRTFs; and generating and outputting three-dimensional audio data based on at least the one or more predicted HRTFs.
2. The system according to claim 1, wherein the operation of performing Gaussian process hyper-parameter training on the collection of HRTFs further comprises causing the processor to perform operations that include: applying sparse Gaussian process regression to perform the Gaussian process hyper-parameter training on the collection of HRTFs.
3. The system of claim 2, wherein the one or more predicted HRTFs are HRTFs for test directions not part of an original set of said multiple directions, and the method further comprises causing the processor to calculate a confidence interval for the one or more predicted HRTFs.
4. The system of claim 3, further comprising causing the processor to perform an operation that includes: extracting extrema data from the one or more predicted HRTFs.
5. The system according to claim 1, further comprising causing the processor to perform an operation that includes: accessing the collection of HRTFs to provide a data base of HRTF for autoencoder (AE) neural network (NN) learning; and learning an AE NN based on the collection of HRTFs accessed; and generating low-dimensional bottleneck AE features.
6. The system of claim 5, further comprising causing the processor to perform an operation that includes: generating target directions; computing sound-source localization errors reflecting an argument; and accounting for the sound-source localization errors in a global minimization of the argument of the sound-source localization errors (SSLE).
7. The system of claim 6, further comprising causing the processor to perform an operation that includes: decoding the argument of the sound-source localization errors to the one or more predicted HRTFs.
8. The system of claim 7, further comprising causing the processor to perform an operation that includes: performing a listening test utilizing the one or more predicted HRTFs; reporting a localized direction as feedback input; recomputing the SSLE; and re-performing the global minimization of the argument of the SSLE.
9. The system of claim 8, further comprising causing the processor to perform an operation that includes: generating a Gaussian process listener inference based upon the steps of decoding of the argument of the SSLE to the one or more predicted HRTFs, performing the listening test utilizing the one or more predicted HRTFs, and reporting the localized direction as feedback input.
10. The system of claim 1, wherein the method further comprises causing the processor to perform operations that include: receiving HRTF measurements from different sources, and creating the one or more predicted HRTFs based on said HRTF measurement from different sources.
11. The system of claim 10, further comprising causing the processor to perform an operation that includes: accessing a database HRTFs for the same individual in multiple directions; and accessing a database of HRTF test directions.
12. The system of claim 11, further comprising causing the processor to perform an operation that includes: based on the accessing steps, implementing Gaussian process inference.
13. The system of claim 12, further comprising causing the processor to perform an operation that includes: calculating confidence intervals for the one or more predicted HRTFs.
14. A method for generating and outputting three-dimensional audio data using head-related transfer functions (HRTF), the method comprising: collecting audio signals in a transform domain for at least one subject; applying head related transfer functions in multiple directions to the collected audio signals; performing non-parametric Gaussian hyper-parameter training on the collection of HRTFs to generate one or more predicted HRTFs; and generating and outputting three dimensional audio data based at least on the one or more predicted HRTFs.
15. The method according to claim 14, further comprising causing the processor to perform an operation that includes: identifying an individual associated with the one or more predicted HRTFs.
16. The method according to claim 15, wherein the step of performing Gaussian hyper-parameter training on the collection of HRTFs further comprises applying sparse Gaussian process regression to perform the Gaussian hyper-parameter training on the collection of HRTFs.
17. The method according to claim 16, further comprising: applying HRTF test directions; and inferring Gaussian progression virtual listener measurements.
18. The method according to claim 17, further comprising: calculating a confidence interval for the one or more predicted HRTFs.
19. The method according to claim 18, further comprising: extracting extrema data from the predicted HRTFs.
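
Two of the dependent claims name post-processing steps on the predicted HRTFs: calculating a confidence interval (claims 3, 13 and 18) and extracting extrema data (claims 4 and 19). The sketch below shows one plausible reading of each. It is an illustration under stated assumptions, not the patent's procedure: the interval assumes a Gaussian predictive distribution and uses the usual 1.96 multiplier for roughly 95% coverage, the extrema are taken to be strict interior local maxima (peaks) and minima (notches) of a magnitude response, and the frequency and magnitude values are made up.

```python
import math

def confidence_interval(mean, variance, z=1.96):
    """Symmetric interval mean +/- z * standard deviation (about 95% for z = 1.96)."""
    sd = math.sqrt(max(variance, 0.0))
    return mean - z * sd, mean + z * sd

def local_extrema(values):
    """Return (peaks, notches): indices of strict interior local maxima and minima."""
    peaks, notches = [], []
    for i in range(1, len(values) - 1):
        if values[i] > values[i - 1] and values[i] > values[i + 1]:
            peaks.append(i)
        elif values[i] < values[i - 1] and values[i] < values[i + 1]:
            notches.append(i)
    return peaks, notches

# Hypothetical predicted magnitude response (dB) and predictive variances
# at a few frequencies (kHz); these numbers are invented for the example.
freqs_khz = [2, 4, 6, 8, 10, 12, 14]
predicted_db = [1.0, 4.5, 2.0, -9.0, -3.0, 3.5, 0.5]
predicted_var = [0.2, 0.3, 0.4, 1.5, 0.9, 0.5, 0.6]

peaks, notches = local_extrema(predicted_db)
for label, indices in (("peak", peaks), ("notch", notches)):
    for i in indices:
        low, high = confidence_interval(predicted_db[i], predicted_var[i])
        print("%s at %d kHz: %.1f dB, interval [%.2f, %.2f]"
              % (label, freqs_khz[i], predicted_db[i], low, high))
```

Pairing each extremum with its interval gives a rough sense of how much to trust a predicted peak or notch at a direction that was never measured.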

Assignees

Inventors

Classifications

  • Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation · CPC title

  • H04S7/303 · Primary

    Tracking of listener position or orientation · CPC title

  • Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD] · CPC title

  • Aspects of sound capture and related signal processing for recording or reproduction · CPC title

  • H04S7/304 · Primary

    For headphones · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9681250B2 cover?
A system for generating and outputting three-dimensional audio data using head-related transfer functions (HRTFs) includes a processor configured to perform operations comprising: using a collection of previously measured HRTFs for audio signals corresponding to multiple directions for at least one subject; performing non-parametric Gaussian process hyper-parameter training on the collection of…
Who is the assignee on this patent?
Univ Maryland
What technology area does this patent fall under?
Primary CPC classification H04S7/303. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jun 13, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).