Personalized HRTFs via optical capture

US11778403B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11778403-B2
Application number  US-201917263125-A
Country             US
Kind code           B2
Filing date         Jul 25, 2019
Priority date       Jul 25, 2018
Publication date    Oct 3, 2023
Grant date          Oct 3, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus and method of generating personalized HRTFs. The system is prepared by calculating a model for HRTFs described as the relationship between a finite example set of input data, namely anthropometric measures and demographic information for a set of individuals, and a corresponding set of output data, namely HRTFs numerically simulated using a high-resolution database of 3D scans of the same set of individuals. At the time of use, the system queries the user for their demographic information, and then from a series of images of the user, the system detects and measures various anthropometric characteristics. The system then applies the prepared model to the anthropometric and demographic data as part of generating a personalized HRTF. In this manner, the personalized HRTF can be generated with more convenience than by performing a high-resolution scan or an acoustic measurement of the user, and with less computational complexity than by numerically simulating their HRTF.
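
The "prepared model" in the abstract can be pictured as a regression from anthropometric features to HRTF values, and claim 4 names linear regression with Lasso regularization as one training option. The following is a minimal sketch under stated assumptions, not the patented implementation: the toy data, the function names, the coordinate-descent solver, and the choice to fit one output value at a time are all illustrative and do not come from the patent.

```python
# Hypothetical sketch: Lasso-regularized linear regression fitted by
# coordinate descent, mapping standardized anthropometric features to one
# HRTF value (for example, the magnitude in one frequency bin).

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso shrinkage step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_fit(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * sum((y - Xw)^2) + alpha * sum(|w|).

    X is a list of feature rows and y a list of targets; both are assumed
    to be centered, so no intercept term is fitted.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(w[k] * X[i][k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


def predict(w, x):
    """Apply the fitted linear model to one feature vector."""
    return sum(wk * xk for wk, xk in zip(w, x))


# Synthetic, standardized toy features for four "training subjects".
# The third feature has no influence on the target by construction.
X_TOY = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
Y_TOY = [2.0 * row[0] - 1.0 * row[1] for row in X_TOY]

weights = lasso_fit(X_TOY, Y_TOY, alpha=0.1)
print(weights)
print(predict(weights, [0.5, -0.5, 0.0]))
```

In this toy setup the penalty shrinks the two useful weights slightly and sets the weight of the uninformative feature to zero, which is the feature-selection behavior that makes Lasso a natural fit when many candidate measurements are available. A full system would presumably fit many outputs (frequency bins, directions, both ears) rather than a single value.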

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: generating an HRTF calculation system taking anthropometric measurements and, optionally, demographic data of a user as input and returning a personalized HRTF for the user as output, obtaining 3D scans of a plurality of training subjects, generating HRTFs for the plurality of training subjects by performing acoustic scattering calculations on anatomical characteristics captured in the 3D scans, collecting anthropometric measurements and, optionally, demographic data of the plurality of training subjects, and training the HRTF calculation system to transform the anthropometric measurements and, optionally, demographic data of the plurality of training subjects to the HRTFs.

2. The method of claim 1, wherein performing acoustic scattering calculations on anatomical characteristics captured in a 3D scan comprises performing a numerical simulation of the sound field around a mesh corresponding to the 3D scan.

3. The method of claim 1, wherein the anthropometric measurements of the plurality of training subjects are determined from the 3D scans.

4. The method of claim 1, wherein the training is performed using linear regression with Lasso regularization.

5. The method of claim 1, further comprising: collecting a plurality of images of a user; using the plurality of images of the user to determine anthropometric measurements of the user, and inputting the anthropometric measurements of the user to the HRTF calculation system to obtain a personalized HRTF for the user.

6. The method of claim 5, wherein the anthropometric measurements of the user are determined using a convolutional neural network.

7. The method of claim 1, further comprising: generating an audio output by applying the personalized HRTF to an audio signal.

8. The method of claim 1, further comprising: storing, by a server device, the personalized HRTF; and transmitting, by the server device, the personalized HRTF to a user device, wherein the user device generates an audio output by applying the personalized HRTF to an audio signal.

9. The method of claim 1, wherein an audio signal comprises a plurality of audio objects that includes position information, the method further comprising: generating a binaural audio output by applying the personalized HRTF to the plurality of audio objects.

10. The method of claim 1, the method further comprising: executing, by a server device that is configured to generate the personalized HRTF for the user using the HRTF calculation system, a photogrammetry component, a contextual transformation component, a landmark detection component, and an anthropometry component, wherein the photogrammetry component receives a plurality of structural imagery of the user, and generates a plurality of camera transforms and a structural image set using a structure-from-motion technique, wherein the contextual transformation component receives the plurality of camera transforms and the structural image set, and generates a transformed plurality of camera transforms by translating and rotating the plurality of camera transforms using the structural image set, wherein the landmark detection component receives the structural image set and the transformed plurality of camera transforms, and generates a 3D landmark set that corresponds to anthropometric characteristics of the user identified using the structural image set and the transformed plurality of camera transforms, wherein the anthropometry component receives the 3D landmark set, and generates anthropometric data from the 3D landmark set, wherein the anthropometric data corresponds to a set of distances and angles measured between individual landmarks of the 3D landmark set, and wherein the server device generates the personalized HRTF for the user by inputting the anthropometric data into the HRTF calculation system.

11. The method of claim 1, the method further comprising: executing, by the server device that is configured to generate the personalized HRTF for the user using the HRTF calculation system, a landmark detection component, a 3D projection component, and an angle and distance measurement component, wherein the landmark detection component receives a cropped image set of anthropometric landmarks of the user, and generates a set of 2D coordinates of the set of anthropometric landmarks of the user from the cropped image set, wherein the 3D projection component receives the set of 2D coordinates and a plurality of camera transforms, and generates a set of 3D coordinates that correspond to the set of 2D coordinates of each of the anthropometric landmarks in 3D space using the camera transforms, wherein the angle and distance measurement component receives the set of 3D coordinates, and generates anthropometric data from the set of 3D coordinates, wherein the anthropometric data correspond to angles and distances of the anthropometric landmarks in the set of 3D coordinates, wherein the server device generates the personalized HRTF for the user by inputting the anthropometric data into the HRTF calculation system.

12. A non-transitory computer readable medium storing a computer program that, when executed by a processor, controls an apparatus to execute processing including the method of claim 1.

13. An apparatus for generating head-related transfer functions (HRTFs), the apparatus comprising: at least one processor; and at least one memory, wherein the at least one processor is configured to control the apparatus to: generate an HRTF calculation system taking anthropometric measurements and, optionally, demographic data of a user as input and returning a personalized HRTF for the user as output, obtain 3D scans of a plurality of training subjects, generate HRTFs for the plurality of training subjects by performing acoustic scattering calculations on anatomical characteristics captured in the 3D scans, collect anthropometric measurements and, optionally, demographic data of the plurality of training subjects, and train the HRTF calculation system to transform the anthropometric measurements and, optionally, demographic data of the plurality of training subjects to the HRTFs.

14. The apparatus of claim 13, wherein the at least one processor is further configured to control the apparatus to perform acoustic scattering calculations on anatomical characteristics captured in a 3D scan by performing a numerical simulation of the sound field around a mesh corresponding to the 3D scan.

15. The apparatus of claim 13, wherein the at least one processor is further configured to control the apparatus to determine the anthropometric measurements of the plurality of training subjects from the 3D scans.

16. The apparatus of claim 13, the apparatus further comprising: a user input device that is configured to collect a plurality of images of a user, wherein the at least one processor is further configured to use a plurality of images of a user captured by the user input device to determine anthropometric measurements of the user, and to input the anthropometric measurements of the user to the HRTF calculation system to obtain a personalized HRTF for the user.

17. The apparatus of claim 13, further comprising: a user output device that is configured to generate an audio output by applying the personalized HRTF to an audio signal.

18. The apparatus of claim 13, further comprising: a server device that is configured to generate the HRTF calculation system, to generate the personalized HRTF, to store the personalized HRTF, and to
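
Claims 10 and 11 describe anthropometric data as distances and angles measured between 3D landmarks. A minimal sketch of that measurement step is shown here, assuming the landmarks are already available as 3D coordinates. The landmark names, the coordinates, the millimeter units, and the function names are hypothetical and are not taken from the patent.

```python
import math

# Hypothetical sketch: turn a set of named 3D landmarks into a feature
# dictionary of pairwise distances and three-point angles.

def angle_deg(a, vertex, b):
    """Angle in degrees at `vertex` between the rays toward `a` and `b`."""
    u = [ai - vi for ai, vi in zip(a, vertex)]
    v = [bi - vi for bi, vi in zip(b, vertex)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def anthropometric_features(landmarks, distance_pairs, angle_triples):
    """Measure the requested distances and angles between landmarks."""
    features = {}
    for a, b in distance_pairs:
        features[f"dist:{a}-{b}"] = math.dist(landmarks[a], landmarks[b])
    for a, vertex, b in angle_triples:
        features[f"angle:{a}-{vertex}-{b}"] = angle_deg(
            landmarks[a], landmarks[vertex], landmarks[b]
        )
    return features


# Made-up landmark coordinates in millimeters, for illustration only.
toy_landmarks = {
    "ear_top": (0.0, 0.0, 60.0),
    "ear_bottom": (0.0, 0.0, 0.0),
    "tragus": (0.0, 30.0, 0.0),
}

features = anthropometric_features(
    toy_landmarks,
    distance_pairs=[("ear_top", "ear_bottom"), ("ear_bottom", "tragus")],
    angle_triples=[("ear_top", "ear_bottom", "tragus")],
)
print(features)
```

The resulting dictionary is the kind of fixed-length feature set that could be passed, together with any demographic data, to the trained HRTF calculation system. Which landmarks and which pairs or triples are actually measured is a design decision the claims leave open.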

Assignees

Dolby Laboratories Licensing Corp

Inventors

Classifications

  • H04S7/301 (Primary)

    Automatic calibration of stereophonic sound system, e.g. with test microphone · CPC title

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Region-based segmentation · CPC title

  • Determining position or orientation of objects or cameras (camera calibration G06T7/80) · CPC title

  • Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11778403B2 cover?
An apparatus and method of generating personalized HRTFs. The system is prepared by calculating a model for HRTFs described as the relationship between a finite example set of input data, namely anthropometric measures and demographic information for a set of individuals, and a corresponding set of output data, namely HRTFs numerically simulated using a high-resolution database of 3D scans of t…
Who is the assignee on this patent?
Dolby Laboratories Licensing Corp
What technology area does this patent fall under?
Primary CPC classification H04S7/301. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Oct 3, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).