Sound source estimation using neural networks
US-2017353789-A1 · Dec 7, 2017 · US
US10237649B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10237649-B2 |
| Application number | US-201715844847-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 18, 2017 |
| Priority date | Jul 12, 2016 |
| Publication date | Mar 19, 2019 |
| Grant date | Mar 19, 2019 |
Official abstract text for this publication.
Methods and apparatus relating to microphone devices and signal processing techniques are provided. In an example, a microphone device can detect sound, as well as enhance an ability to perceive at least a general direction from which the sound arrives at the microphone device. In an example, a case of the microphone device has an external surface which at least partially defines funnel-shaped surfaces. Each funnel-shaped surface is configured to direct the sound to a respective microphone diaphragm to produce an auralized multi-microphone output. The funnel-shaped surfaces are configured to cause direction-dependent variations in spectral notches and frequency response of the sound as received by the microphone diaphragms. A neural network can device-shape the auralized multi-microphone output to create a binaural output. The binaural output can be auralized with respect to a human listener.
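The abstract's "direction-dependent variations in spectral notches" can be illustrated with a toy model. The sketch below is not taken from the patent. It assumes (hypothetically) that a funnel-shaped surface adds a single reflected path to the direct sound, so that the diaphragm sees the response H(f) = 1 + a·exp(-j2πfτ), where the extra delay τ depends on the arrival direction. Such a response has notches at odd multiples of 1/(2τ). The function names, the reflection gain, and the delay values are illustrative assumptions.

```python
import numpy as np

def notch_frequencies(delay_s, f_max_hz):
    """Notch centres of H(f) = 1 + a*exp(-j*2*pi*f*delay_s) up to f_max_hz.

    Assumed toy model: direct sound plus one reflection delayed by delay_s.
    The two paths cancel most strongly where f = (2k + 1) / (2 * delay_s).
    """
    k = np.arange(0, int(f_max_hz * delay_s + 0.5) + 1)
    freqs = (2 * k + 1) / (2.0 * delay_s)
    return freqs[freqs <= f_max_hz]

def magnitude_response(freqs_hz, delay_s, reflection_gain=0.8):
    """Magnitude of the assumed two-path response at the given frequencies."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    return np.abs(1.0 + reflection_gain * np.exp(-2j * np.pi * freqs_hz * delay_s))

if __name__ == "__main__":
    # Hypothetical delays for two arrival directions: a longer extra path
    # moves the notches to lower frequencies.
    for label, delay in (("direction A", 50e-6), ("direction B", 100e-6)):
        print(label, notch_frequencies(delay, 20_000.0))
```

In this model, a change in arrival direction changes τ and therefore moves the notches, which is the kind of cue that a downstream network could learn to map to a listener-referenced output.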
Opening claim text (preview).
The invention claimed is:

1. A method, comprising: receiving neural network training data which is auralized with respect to a specific device; receiving an auralized multi-microphone recording, wherein the auralized multi-microphone recording is auralized with respect to the specific device which is not a simulated human head; receiving a binaural recording, wherein the binaural recording is captured using a simulated human head; applying the neural network training data to a neural network; creating a binaural output by device-shaping the received auralized multi-microphone recording with the neural network, wherein the binaural output is auralized with respect to a human listener; comparing the binaural output with the binaural recording to identify differences; and generating the neural network training data using the identified differences.

2. The method of claim 1, wherein the neural network weighs and combines components of the auralized multi-microphone recording to create the binaural output.

3. The method of claim 1, wherein the neural network training data includes at least one selected from the group consisting of: data describing effects on recorded sound by rooms of varying sizes, data describing reverberation times, and data generated by an auralization simulator.

4. The method of claim 1, wherein the receiving the neural network training data includes receiving the neural network training data from a cloud-computing storage device, receiving the auralized multi-microphone recording from the cloud-computing storage device, or both.

5. The method of claim 1, wherein the receiving of the auralized multi-microphone recording comprises receiving the auralized multi-microphone recording from a live stream.

6. The method of claim 1, wherein the receiving of the auralized multi-microphone recording comprises receiving the auralized multi-microphone recording from a storage device.

7. The method of claim 1, further comprising sending the binaural output to a binaural sound-reproducing device.

8. The method of claim 7, wherein the binaural sound-reproducing device comprises a pair of headphones.

9. A non-transitory computer-readable medium, comprising: instructions stored by the non-transitory computer-readable medium, wherein the instructions are configured to cause a processor to: initiate receiving neural network training data which is auralized with respect to a specific device; initiate receiving an auralized multi-microphone recording, wherein the auralized multi-microphone recording is auralized with respect to the specific device which is not a simulated external human head; initiate receiving a binaural recording, wherein the binaural recording is captured using a simulated human head; initiate applying the neural network training data to a neural network; initiate creating a binaural output by device-shaping the received auralized multi-microphone recording with the neural network, wherein the binaural output is auralized with respect to a human listener; initiate comparing the binaural output with the binaural recording to identify differences; and initiate generating the neural network training data using the identified differences.

10. The non-transitory computer-readable medium of claim 9, wherein the instructions configured to cause the processor to initiate creating the binaural output comprise instructions configured to cause the processor to weigh and combine components of the auralized multi-microphone input to create the binaural output.

11. The non-transitory computer-readable medium of claim 9, wherein the instructions are further configured to cause the processor to: send the binaural output to a binaural sound-reproducing device.

12. The non-transitory computer-readable medium of claim 11, wherein the binaural sound-reproducing device comprises a pair of headphones.

13. The non-transitory computer-readable medium of claim 9, wherein the instructions configured to cause the processor to receive the neural network training data comprise instructions configured to cause the processor to receive the neural network training data from a storage device.

14. The non-transitory computer-readable medium of claim 13, wherein the storage device comprises a cloud-computing storage device.

15. The non-transitory computer-readable medium of claim 9, wherein the instructions configured to cause the processor to receive the auralized multi-microphone recording comprise instructions configured to cause the processor to receive the auralized multi-microphone recording from a live stream.

16. The non-transitory computer-readable medium of claim 9, wherein the instructions configured to cause the processor to receive the auralized multi-microphone recording comprise instructions configured to cause the processor to receive the auralized multi-microphone recording from a storage device.

17. The non-transitory computer-readable medium of claim 16, wherein the storage device comprises a cloud-computing storage device.

18. The method of claim 1, wherein the identified differences include differences in one or more notch frequencies occurring at a substantially similar direction.

19. The non-transitory computer-readable medium of claim 9, wherein the identified differences include differences in one or more notch frequencies occurring at a substantially similar direction.

20. The method of claim 1, wherein the neural network adjusts a neural network coefficient in the neural network training data to reduce the identified differences.

21. The non-transitory computer-readable medium of claim 9, wherein the neural network adjusts a neural network coefficient in the neural network training data to reduce the identified differences.
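The loop recited in claims 1, 2, and 20 (device-shape the multi-microphone recording, compare the result with a binaural reference, and adjust coefficients to reduce the differences) can be sketched roughly as follows. The claims do not specify a network architecture, loss, or optimizer, so everything here is an assumption made for illustration: a single linear layer stands in for the neural network, the comparison is a squared error, the update is plain gradient descent, and the function names and array shapes are hypothetical.

```python
import numpy as np

def device_shape(multi_mic, weights):
    """Weigh and combine microphone channels into two ear signals (cf. claim 2).

    multi_mic: (n_samples, n_mics) recording from the specific device.
    weights:   (n_mics, 2) coefficients of the stand-in linear "network".
    """
    return multi_mic @ weights

def train_step(multi_mic, binaural_ref, weights, lr=0.1):
    """One compare-and-adjust step; returns updated weights and the prior MSE."""
    output = device_shape(multi_mic, weights)
    diff = output - binaural_ref               # the "identified differences"
    # Gradient of half the per-sample squared error, averaged over samples.
    grad = multi_mic.T @ diff / len(multi_mic)
    return weights - lr * grad, float(np.mean(diff ** 2))

def train(multi_mic, binaural_ref, n_steps=500, lr=0.1, seed=0):
    """Repeat the step from a small random start; returns weights and loss history."""
    rng = np.random.default_rng(seed)
    shape = (multi_mic.shape[1], binaural_ref.shape[1])
    weights = rng.normal(scale=0.1, size=shape)
    losses = []
    for _ in range(n_steps):
        weights, loss = train_step(multi_mic, binaural_ref, weights, lr)
        losses.append(loss)
    return weights, losses

if __name__ == "__main__":
    # Synthetic stand-ins for the two recordings; no real audio is used.
    rng = np.random.default_rng(1)
    mics = rng.normal(size=(2000, 4))          # hypothetical 4-microphone device
    true_w = rng.normal(size=(4, 2))
    reference = mics @ true_w                  # stands in for the dummy-head recording
    _, history = train(mics, reference)
    print("first loss:", history[0], "last loss:", history[-1])
```

A real system would presumably operate on frequency-domain or filtered features rather than raw sample-wise mixing, since spectral notches cannot be reshaped by a memoryless linear combiner. The sketch only shows the control flow of the claimed comparison and coefficient adjustment.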
2D or 3D arrays of transducers · CPC title
for microphones (H04R1/24, H04R1/26 take precedence) · CPC title
microphones · CPC title
Transducers incorporated or for use in hand-held devices, e.g. mobile phones, PDA's, camera's · CPC title
for microphones (H04R1/34 and H04R1/40 take precedence) · CPC title