Spatial audio enhancement apparatus

US9769588B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9769588-B2
Application number  US-201314441322-A
Country             US
Kind code           B2
Filing date         Nov 18, 2013
Priority date       Nov 20, 2012
Publication date    Sep 19, 2017
Grant date          Sep 19, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus comprising: a depth map estimator configured to determine, associated with at least one visual image, a depth map comprising at least one distance value in a direction relative to an apparatus; a direction of arrival estimator configured to determine, using at least two microphones, at least one audio source signal with a direction; and an audio signal processor configured to process the at least one audio source signal based on the at least one distance value in the direction of the at least one audio source signal.
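The last step of the abstract, processing a source signal "based on the at least one distance value in the direction of the at least one audio source signal", can be illustrated with a minimal sketch. It is not the patented implementation. The one-row depth map, the 60-degree field of view, the function names and the simple keep-or-attenuate gain rule are all assumptions chosen for illustration.

```python
def distance_for_direction(depth_map, azimuth_deg, fov_deg=60.0):
    """Look up the distance value in the depth-map column facing azimuth_deg.

    depth_map: distances in metres, one per image column, left to right
               (a hypothetical one-row depth map).
    azimuth_deg: 0 is straight ahead, negative is left, positive is right.
    """
    half = fov_deg / 2.0
    if not -half <= azimuth_deg <= half:
        raise ValueError("direction is outside the field of view")
    # Map [-half, +half] onto column indices [0, len(depth_map) - 1].
    fraction = (azimuth_deg + half) / fov_deg
    column = min(int(fraction * len(depth_map)), len(depth_map) - 1)
    return depth_map[column]


def process_source(samples, distance_m, focus_m, tolerance_m=0.5, cut=0.25):
    """Keep a source near the focus distance, attenuate it otherwise.

    The fixed gain rule is an assumption; the claims also list filtering,
    pitch shifting, reverberation and other distance-based processing.
    """
    gain = 1.0 if abs(distance_m - focus_m) <= tolerance_m else cut
    return [gain * s for s in samples]


depth_map = [4.0, 4.0, 1.5, 1.5, 6.0, 6.0]      # hypothetical distances in metres
near = distance_for_direction(depth_map, -5.0)  # falls in column 2
far = distance_for_direction(depth_map, 25.0)   # falls in column 5

focused = process_source([1.0, -0.5], near, focus_m=1.5)    # left unchanged
attenuated = process_source([1.0, -0.5], far, focus_m=1.5)  # scaled by 0.25
```

In this sketch the direction of each source is taken as given; estimating it from two microphone signals is a separate step.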

First claim

Opening claim text (preview).

We claim:

1. An apparatus comprising: a depth map estimator circuit configured to determine a depth map of a field of view comprising at least one visual image, the depth map comprising at least one distance value in a direction relative to the apparatus; a direction of arrival circuit configured to determine, using at least two microphones, a direction of arrival of at least one audio source signal within the field of view, wherein the at least one audio source signal is associated with a feature within the at least one visual image, the feature being located at the at least one distance value; and an audio signal circuit configured to process the at least one audio source signal based on the at least one distance value and the direction of arrival of the at least one audio source signal; wherein the processing comprises focusing the feature at the at least one distance value.

2. The apparatus as claimed in claim 1, wherein the depth map estimator circuit configured to determine at least one of: a depth map from at least two images; a depth map from a depth sensor and at least one image; and a depth map from a lightfield camera.

3. The apparatus as claimed in claim 1, wherein the direction of arrival circuit comprises: an input configured to receive at least two audio signals from the at least two microphones; an audio source determiner configured to determine based on the at least two audio signals at least one audio source; an audio source direction determiner configured to determine the direction of arrival of the at least one audio source; and a source separator configured to generate based on the at least one audio source and the at least one audio source direction the at least one audio source signal.

4. The apparatus as claimed in claim 1, wherein the audio signal circuit comprises at least one of the below, configured to process the at least one audio source signal: a filter configured to filter the at least one audio source signal based on the distance value; an amplifier configured to amplify the at least one audio source signal based on the distance value; an attenuator configured to attenuate the at least one audio source signal based on the distance value; a parametric filter configured to parametrically filter the at least one audio source signal based on the distance value; a non-parametric filter configured to non-parametrically filter the at least one audio source signal based on the distance value; a pitch shifter configured to pitch shift the at least one audio source signal based on the distance value; a time varying processor configured to time varying process the at least one audio source signal based on the distance value; a non-linear processor configured to non-linear process the at least one audio source signal based on the distance value; and reverberation processor configured to reverberation process the at least one audio source signal based on the distance value.

5. The apparatus as claimed in claim 1, wherein the depth map estimator circuit configured to determine a plurality of distance values in directions relative to the apparatus; the direction of arrival determiner configured to determine directions of arrival of a plurality of audio source signals within the field of view; and the audio signal processor configured to process each of the plurality of audio source signals based on the at least one distance value and the direction of arrival.

6. The apparatus as claimed in claim 1, further comprises an audio synthesiser configured to synthesise a multichannel audio signal from the at least one audio source signal based on the at least one distance value.

7. The apparatus as claimed in claim 6, further comprises a combiner configured to combine the multichannel audio signal synthesised from each of the processed plurality of audio source signals.

8. The apparatus as claimed in claim 1, wherein the at least one audio source signal is associated with a feature within the at least one visual image, the feature is located at the at least one distance value, and wherein the apparatus further comprises a visual image processor configured to process the feature.

9. The apparatus as claimed in claim 8, wherein the processing comprises one of: focusing the feature at the at the at least one distance value, and defocusing for other distance values; or defocusing the feature at the at least one distance value.

10. The apparatus as claimed in claim 1, wherein the apparatus further comprises: a display configured to display the at least one visual image; and wherein the audio signal processor configured to: receive a selection input from the at least one visual image on the display; and process the at least one audio source signal based on the received selection input.

11. A method comprising: determining, with a depth map estimator circuit, a depth map of a field of view comprising at least one visual image, the depth map comprising at least one distance value in a direction relative to the apparatus; determining, with a direction of arrival circuit, using at least two microphones, a direction of arrival of at least one audio source signal within the field of view, wherein the at least one audio source signal is associated with a feature within the at least one visual image, the feature being located at the at least one distance value; and processing, with an audio signal circuit, the at least one audio source signal based on the at least one distance value and the direction of arrival of the at least one audio source signal; wherein the processing comprises focusing the feature at the at least one distance value.

12. The method as claimed in claim 11, wherein determining a depth map comprises at least one of: determining a depth map from at least two images offset relative to each other; determining a depth map from a depth sensor and at least one image; and determining a depth map from a lightfield camera.

13. The method as claimed in claim 11, wherein determining at least one audio source signal with a direction comprises: receiving at least two audio signals from at least two microphones; determining based on the at least two audio signals at least one audio source, and a direction of arrival of the at least one audio source; and generating based on the at least one audio source and the at least one audio source direction the at least one audio source signal with a direction.

14. The method as claimed in claim 11, wherein processing the at least one audio source signal comprises at least one of the below, configured to process the at least one audio source signal: filtering the at least one audio source signal based on the at least one distance value; amplifying the at least one audio source signal based on the at least one distance value; attenuating the at least one audio source signal based on the at least one distance value; parametrically filtering the at least one audio source signal based on the at least one distance value; non-parametrically filtering the at least one audio source signal based on the at least one distance value; pitch shifting the at least one audio source signal based on the at least one distance value; time varying processing the at least one audio source signal based on the at least one distance value; non-linear processing of the at least one audio source signal based on the at least one distance value; and reverberation processing the at least one audio source signal based on the at least one distance value.

15. The method as claimed in claim 11 , wherein determining, associated
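The claims require a direction of arrival determined "using at least two microphones" but this preview does not say how. One common textbook approach, shown here only as a hedged sketch and not as the method of the patent, estimates the time difference of arrival between two channels by cross-correlation and converts it to an angle under a far-field model. The function names, the 343 m/s speed of sound, the microphone spacing and the sign convention (positive angles point toward the left microphone) are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature


def best_lag(left, right, max_lag):
    """Return the lag in samples that maximises the cross-correlation.

    A positive lag means the right channel is a delayed copy of the left.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, value in enumerate(left):
            m = n + lag
            if 0 <= m < len(right):
                score += value * right[m]
        if score > best_score:
            best, best_score = lag, score
    return best


def direction_of_arrival(left, right, sample_rate, mic_spacing_m):
    """Estimate the arrival angle in degrees from two microphone signals."""
    # No physical delay can exceed the travel time between the microphones.
    max_lag = int(math.ceil(mic_spacing_m / SPEED_OF_SOUND * sample_rate))
    delay_s = best_lag(left, right, max_lag) / sample_rate
    # Far-field model: delay = spacing * sin(angle) / speed of sound.
    sine = max(-1.0, min(1.0, delay_s * SPEED_OF_SOUND / mic_spacing_m))
    return math.degrees(math.asin(sine))


# Hypothetical test signal: one impulse that reaches the right microphone
# 14 samples after the left one.
left = [0.0] * 64
right = [0.0] * 64
left[10] = 1.0
right[24] = 1.0
angle = direction_of_arrival(left, right, sample_rate=48000, mic_spacing_m=0.2)
```

With these assumed values the 14-sample delay corresponds to an angle of roughly 30 degrees. A practical system would add windowing, frequency-domain weighting and per-source separation, none of which is attempted here.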

Assignees

Nokia Technologies Oy

Inventors

Classifications

  • from light fields, e.g. from plenoptic cameras

  • Microphone arrays; Beamforming

  • for combining the signals of two or more microphones (specially adapted for hearing aids H04R25/407)

  • Positioning of individual sound objects, e.g. moving airplane, within a sound field (H04S2420/13 takes precedence)

  • Aspects of volume control, not necessarily automatic, in stereophonic sound systems

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9769588B2 cover?
An apparatus comprising: a depth map estimator configured to determine, associated with at least one visual image, a depth map comprising at least one distance value in a direction relative to an apparatus; a direction of arrival estimator configured to determine, using at least two microphones, at least one audio source signal with a direction; and an audio signal processor configured to proce…
Who is the assignee on this patent?
Nokia Technologies Oy
What technology area does this patent fall under?
Primary CPC classification H04S7/302. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Sep 19, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).