End-to-end deep neural network for auditory attention decoding

US11630513B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11630513-B2
Application number  US-201916720810-A
Country             US
Kind code           B2
Filing date         Dec 19, 2019
Priority date       Dec 20, 2018
Publication date    Apr 18, 2023
Grant date          Apr 18, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one aspect of the present disclosure, a method includes: receiving neural data responsive to a listener's auditory attention; receiving an acoustic signal responsive to a plurality of acoustic sources; for each of the plurality of acoustic sources: generating, from the received acoustic signal, audio data comprising one or more features of the acoustic source, forming combined data representative of the neural data and the audio data, and providing the combined data to a classification network configured to calculate a similarity score between the neural data and the acoustic source using one or more similarity metrics; and using the similarity scores calculated for each of the acoustic sources to identify, from the plurality of acoustic sources, an acoustic source associated with the listener's auditory attention.
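The per-source loop in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names (`combine`, `correlation_scorer`, `identify_attended_source`) are hypothetical, each source's audio feature is assumed to be a precomputed envelope with the same number of samples as the neural data, and a Pearson correlation stands in for the classification network purely so the example runs without a trained model.

```python
import math
from typing import Callable, List, Sequence, Tuple


def combine(neural: Sequence[Sequence[float]], audio: Sequence[float]) -> List[List[float]]:
    """Form combined data: the neural channels stacked with one audio-feature row."""
    if any(len(ch) != len(audio) for ch in neural):
        raise ValueError("neural channels and audio feature must share a length")
    return [list(ch) for ch in neural] + [list(audio)]


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 0.0 if va == 0 or vb == 0 else cov / math.sqrt(va * vb)


def correlation_scorer(combined: List[List[float]]) -> float:
    """Stand-in scorer (an assumption, NOT the patent's network):
    correlate the channel-averaged neural data with the audio row."""
    *neural, audio = combined
    mean_neural = [sum(ch[t] for ch in neural) / len(neural) for t in range(len(audio))]
    return pearson(mean_neural, audio)


def identify_attended_source(
    neural: Sequence[Sequence[float]],
    source_features: Sequence[Sequence[float]],
    scorer: Callable[[List[List[float]]], float] = correlation_scorer,
) -> Tuple[int, List[float]]:
    """Score every candidate source and return (index of best source, all scores)."""
    scores = [scorer(combine(neural, feat)) for feat in source_features]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy data: two neural channels that roughly follow the second envelope.
envelope_a = [0, 1, 0, 1, 0, 1, 0, 1]
envelope_b = [0, 0, 1, 1, 0, 0, 1, 1]
neural = [
    [0.1, 0.2, 0.9, 1.0, 0.1, 0.2, 0.9, 1.1],
    [0.0, 0.1, 1.0, 0.9, 0.0, 0.1, 1.0, 0.9],
]
best, scores = identify_attended_source(neural, [envelope_a, envelope_b])
```

Passing the scorer as a parameter keeps the selection step (take the highest score) separate from how the score is produced, so a trained network could replace the correlation stand-in without changing the loop.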

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: receiving neural data responsive to a listener's auditory attention; receiving an acoustic signal responsive to a plurality of acoustic sources; for each of the plurality of acoustic sources: generating, from the received acoustic signal, audio data comprising one or more features of the acoustic source, forming combined data representative of the neural data and the audio data, and providing the combined data to a convolutional deep neural network (DNN) configured to calculate a similarity score between the neural data and the acoustic source using one or more similarity metrics; and using the similarity scores calculated for each of the acoustic sources to identify, from the plurality of acoustic sources, an acoustic source associated with the listener's auditory attention.

2. The method of claim 1, comprising separating the acoustic signal into a plurality of candidate signals, wherein generating audio data for an acoustic source comprises generating audio data using one of the plurality of candidate signals associated with the acoustic source.

3. The method of claim 1, comprising: receiving one or more neural signals responsive to brain activity of the listener; and processing the one or more neural signals to generate the neural data.

4. The method of claim 3, wherein a device worn by the listener receives the one or more neural signals and a companion device calculates the similarity scores.

5. The method of claim 1, wherein the neural data comprises multi-channel electroencephalogram (EEG) data.

6. The method of claim 1, wherein forming the combined data comprises generating a matrix comprising the neural data and the audio data.

7. The method of claim 1, wherein the convolutional DNN comprises at least two convolutional layers and at least three fully connected layers.

8. The method of claim 1, wherein using the similarity scores calculated for each of the acoustic sources to identify the acoustic source associated with the listener's auditory attention comprises identifying an acoustic source having the highest similarity score.

9. An apparatus comprising: a neural sensor interface configured to receive one or more neural signals responsive to a listener's auditory attention; an audio input configured to receive an acoustic signal responsive to a plurality of acoustic sources; and a processor configured to: process the one or more neural signals to generate multi-channel neural data; for each of the plurality of acoustic sources: generate, from the received acoustic signal, audio data comprising one or more features of the acoustic source, form combined data representative of the neural data and the audio data, and provide the combined data to a convolutional deep neural network (DNN) configured to calculate a similarity score between the neural data and the acoustic source using one or more similarity metrics; and identify, from the plurality of acoustic sources, an acoustic source associated with the listener's auditory attention based on the calculated similarity scores.

10. The apparatus of claim 9, wherein the processor is configured to separate the acoustic signal into a plurality of candidate signals, and to generate audio data for an acoustic source using one of the plurality of candidate signals associated with the acoustic source.

11. The apparatus of claim 10, comprising an audio output, the processor configured to provide, to the audio output, a candidate signal from the plurality of candidate signals corresponding to the acoustic source associated with the listener's auditory attention.

12. The apparatus of claim 9, wherein the neural sensor interface is configured to receive multi-channel electroencephalogram (EEG) measurements.

13. The apparatus of claim 9, wherein the processor is configured to form the combined data as a matrix comprising the neural data and the audio data.

14. The apparatus of claim 9, wherein the convolutional DNN comprises at least two convolutional layers and at least three fully connected layers.

15. The apparatus of claim 9, comprising a non-volatile memory configured to store the DNN as a trained model.

16. The apparatus of claim 9, wherein the processor is configured to identify the acoustic source associated with the listener's auditory attention by identifying an acoustic source having the highest similarity score.

17. A non-transitory computer-readable medium storing program instructions that are executable to: receive neural data responsive to a listener's auditory attention; receive an acoustic signal responsive to a plurality of acoustic sources; for each of the plurality of acoustic sources: generate, from the received acoustic signal, audio data comprising one or more features of the acoustic source, form combined data representative of the neural data and the audio data, and provide the combined data to a convolutional deep neural network (DNN) configured to calculate a similarity score between the neural data and the acoustic source using one or more similarity metrics; and use the similarity scores calculated for each of the acoustic sources to identify, from the plurality of acoustic sources, an acoustic source associated with the listener's auditory attention.
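Claims 6 and 7 describe the network input as a matrix of neural and audio data and the network as having at least two convolutional layers and at least three fully connected layers. The sketch below shows one way such a forward pass could be laid out, using only the Python standard library. Everything specific here is an assumption for illustration: the layer widths, kernel size, ReLU and sigmoid activations, and the helper names (`conv1d`, `dense`, `make_network`, `similarity_score`) are hypothetical, and the weights are random and untrained, so the score carries no meaning beyond showing the data flow.

```python
import math
import random


def conv1d(x, kernels, biases):
    """Valid 1-D convolution over time followed by ReLU.
    x: [in_ch][T]; kernels: [out_ch][in_ch][K]; returns [out_ch][T - K + 1]."""
    k = len(kernels[0][0])
    t_out = len(x[0]) - k + 1
    out = []
    for w, b in zip(kernels, biases):
        row = []
        for t in range(t_out):
            s = b + sum(w[c][j] * x[c][t + j] for c in range(len(x)) for j in range(k))
            row.append(max(0.0, s))
        out.append(row)
    return out


def dense(v, weights, biases, relu=True):
    """Fully connected layer. weights: [out][in]."""
    out = [b + sum(wi * vi for wi, vi in zip(w, v)) for w, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out


def rand(shape, rng):
    """Nested lists of small random weights with the given shape."""
    if len(shape) == 1:
        return [rng.uniform(-0.1, 0.1) for _ in range(shape[0])]
    return [rand(shape[1:], rng) for _ in range(shape[0])]


def make_network(in_channels, n_samples, rng, k=3, c1=4, c2=4):
    """Two conv layers and three fully connected layers (sizes are illustrative)."""
    flat = c2 * (n_samples - 2 * (k - 1))  # each valid conv shortens time by k - 1
    return {
        "conv": [
            (rand([c1, in_channels, k], rng), [0.0] * c1),
            (rand([c2, c1, k], rng), [0.0] * c2),
        ],
        "fc": [
            (rand([8, flat], rng), [0.0] * 8),
            (rand([4, 8], rng), [0.0] * 4),
            (rand([1, 4], rng), [0.0]),
        ],
    }


def similarity_score(net, eeg, envelope):
    """Stack EEG channels with one audio-feature row, run the network,
    and squash the single output to (0, 1) with a sigmoid."""
    x = [list(ch) for ch in eeg] + [list(envelope)]  # combined matrix
    for w, b in net["conv"]:
        x = conv1d(x, w, b)
    v = [val for row in x for val in row]  # flatten before the dense layers
    last = len(net["fc"]) - 1
    for i, (w, b) in enumerate(net["fc"]):
        v = dense(v, w, b, relu=(i < last))
    return 1.0 / (1.0 + math.exp(-v[0]))


# Toy input: 2 EEG channels and one envelope, 16 samples each.
rng = random.Random(0)
eeg = [[math.sin(0.3 * t + c) for t in range(16)] for c in range(2)]
envelope = [abs(math.sin(0.3 * t)) for t in range(16)]
net = make_network(in_channels=3, n_samples=16, rng=rng)
score = similarity_score(net, eeg, envelope)
```

In a complete system the same network would be evaluated once per candidate source, and the source with the highest score would be selected, as in claims 8 and 16.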

Assignees

Massachusetts Inst Technology, Univ Columbia

Inventors

Classifications

  • Supervised learning

  • Convolutional networks [CNN, ConvNet]

  • Matching criteria, e.g. proximity measures

  • Evaluating hearing capacity

  • Gesture based interaction, e.g. based on a set of recognized hand gestures (interaction based on gestures traced on a digitiser G06F3/04883)

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11630513B2 cover?
In one aspect of the present disclosure, a method includes: receiving neural data responsive to a listener's auditory attention; receiving an acoustic signal responsive to a plurality of acoustic sources; for each of the plurality of acoustic sources: generating, from the received acoustic signal, audio data comprising one or more features of the acoustic source, forming combined data representat…
Who is the assignee on this patent?
Massachusetts Inst Technology, Univ Columbia
What technology area does this patent fall under?
Primary CPC classification G06F3/015. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 18, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).