What technology area does this patent fall under?

Primary CPC classification G06N3/044. Mapped technology areas include Physics.

When was this patent published?

Publication date Aug 1, 2017 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Non-negative matrix factorization regularized by recurrent neural networks for audio processing

US9721202B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9721202-B2
Application number	US-201414186832-A
Country	US
Kind code	B2
Filing date	Feb 21, 2014
Priority date	Feb 21, 2014
Publication date	Aug 1, 2017
Grant date	Aug 1, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Sound processing techniques using recurrent neural networks are described. In one or more implementations, temporal dependencies are captured in sound data that are modeled through use of a recurrent neural network (RNN). The captured temporal dependencies are employed as part of feature extraction performed using nonnegative matrix factorization (NMF). One or more sound processing techniques are performed on the sound data based at least in part on the feature extraction.
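As an illustration of the NMF feature extraction the abstract refers to, the following is a minimal sketch, not the patented method. It assumes the sound data is already available as a nonnegative magnitude spectrogram `X` (frequency by time) and uses the standard multiplicative updates for a squared-error cost. The function name `nmf`, the rank, and the toy data are hypothetical choices made for this example.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate nonnegative X (freq x time) as W @ H.

    W holds spectral basis vectors (one per column); H is the
    activity matrix (one activation vector per time frame).
    Uses multiplicative updates for the squared-error cost.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = X.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative because it only
        # multiplies by ratios of nonnegative quantities.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy stand-in for a magnitude spectrogram: 16 frequency bins, 40 frames,
# built from 3 nonnegative components (hypothetical data).
rng = np.random.default_rng(1)
X = rng.random((16, 3)) @ rng.random((3, 40))

W, H = nmf(X, rank=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this sketch the columns of `H` play the role of the extracted features. Plain NMF treats every frame independently, which is the gap the RNN-based temporal modeling in the abstract is meant to address.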

First claim

Opening claim text (preview).

What is claimed is:
1. A method implemented by at least one computing device, the method comprising: capturing, by the at least one computing device, temporal dependencies in sound data modeled through use of a recurrent neural network (RNN); extracting, by the at least one computing device, features from the sound data based on the captured temporal dependencies as a negative log-likelihood term of a nonnegative matrix factorization (NMF) cost using nonnegative matrix factorization (NMF); and performing, by the at least one computing device, one or more sound processing techniques on the sound data based at least in part on the extracted features.
2. A method as described in claim 1, wherein the recurrent neural network models the temporal dependencies in a temporal sequence of frames in the sound data.
3. A method as described in claim 2, wherein the frames are configured as part of a magnitude spectrogram.
4. A method as described in claim 2, wherein the frames are configured as vectors in an activity matrix.
5. A method as described in claim 2, wherein the recurrent neural network captures long-term temporal dependencies and event co-occurrence in the sound data.
6. A method as described in claim 5, wherein the long-term temporal dependencies describe a plurality of frames in the temporal sequence that includes a frame, a preceding frame, and at least one other frame.
7. A method as described in claim 1, wherein the RNN is employed as part of one or more time-dependent restricted Boltzmann machines (RBM) to describe multimodal conditional densities in the sound data.
8. A method as described in claim 1, wherein the recurrent neural network (RNN) is configured to capture the temporal dependencies by discovering an approximate factorization of an input matrix that describes an observed magnitude spectrogram of the sound data having time and frequency dimensions.
9. A method as described in claim 1, wherein the NMF is configured to utilize a Cosine distance as a cost distance in nonnegative matrix factorization to generate a likelihood that a respective sound source generated a respective portion of the sound data.
10. A method as described in claim 1, wherein the temporal information obtained using the recurrent neural network (RNN) is used as part of nonnegative matrix factorization (NMF) to predict plausibility of decomposition of sound data as part of the feature extraction.
11. A method as described in claim 10, wherein the predicted plausibility of decomposition of the sound data is predicted as a density of activity matrices used as part of nonnegative matrix factorization (NMF).
12. A system comprising: at least one computing device having a processor and memory configured to perform operations comprising: capturing temporal dependencies in sound data modeled through use of a recurrent neural network (RNN); extracting features from the sound data based on the captured temporal dependencies as a negative log-likelihood term of a nonnegative matrix factorization (NMF) cost using nonnegative matrix factorization (NMF); and performing one or more sound processing techniques on the sound data based at least in part on the extracted features.
13. A system as described in claim 12, wherein the recurrent neural network captures long-term temporal dependencies and event co-occurrence in the sound data.
14. A system as described in claim 13, wherein the long-term temporal dependencies describe a plurality of frames in the temporal sequence that includes the frame, the preceding frame, and the at least one other frame.
15. A system as described in claim 12, wherein the recurrent neural network (RNN) is configured to capture the temporal dependencies by discovering an approximate factorization of an input matrix that describes an observed magnitude spectrogram of the sound data having time and frequency dimensions.
16. A system as described in claim 12, wherein the NMF is configured to utilize a Cosine distance as a cost distance in nonnegative matrix factorization to generate a likelihood that a respective sound source generated a respective portion of the sound data.
17. A system comprising: means for capturing temporal dependencies in sound data modeled through use of a recurrent neural network (RNN); means for extracting features from the sound data based on the captured temporal dependencies as a negative log-likelihood term of a nonnegative matrix factorization (NMF) cost using nonnegative matrix factorization (NMF); and means for performing one or more sound processing techniques on the sound data based at least in part on the extracted features.
18. A system as described in claim 17, wherein the recurrent neural network models the temporal dependencies in a temporal sequence of frames in the sound data.
19. A system as described in claim 18, wherein the frames are configured as part of a magnitude spectrogram.
20. A system as described in claim 18, wherein the frames are configured as vectors in an activity matrix.
21. A system as described in claim 18, wherein the recurrent neural network captures long-term temporal dependencies and event co-occurrence in the sound data.
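
To make the phrase "negative log-likelihood term of a nonnegative matrix factorization (NMF) cost" in the independent claims more concrete, here is a rough sketch of one way such a combined cost could be evaluated. It is an assumption-laden illustration, not the claimed implementation: the RNN is a tiny untrained Elman-style network with random weights, each activation frame is scored with a unit-variance Gaussian (a simplification swapped in for the RBM-based densities of claim 7), and the reconstruction term uses squared error. All names (`rnn_nll`, `regularized_cost`, `lam`) are hypothetical.

```python
import numpy as np

def rnn_nll(H, Wx, Wh, Wo):
    """Negative log-likelihood of activation frames under a toy RNN.

    Frame h_t is scored by a unit-variance Gaussian whose mean is
    predicted from the recurrent state, which summarizes all earlier
    frames. Additive constants of the Gaussian are dropped.
    """
    state = np.zeros(Wh.shape[0])
    prev = np.zeros(H.shape[0])
    nll = 0.0
    for t in range(H.shape[1]):
        state = np.tanh(Wx @ prev + Wh @ state)  # update recurrent state
        mean = Wo @ state                        # predicted frame
        nll += 0.5 * np.sum((H[:, t] - mean) ** 2)
        prev = H[:, t]
    return nll

def regularized_cost(X, W, H, rnn_params, lam=0.1):
    """Squared-error NMF cost plus lam times the RNN term on H."""
    recon = 0.5 * np.sum((X - W @ H) ** 2)
    return recon + lam * rnn_nll(H, *rnn_params)

# Hypothetical sizes: 16 frequency bins, 40 frames, 3 components, 8 hidden units.
rng = np.random.default_rng(0)
X = rng.random((16, 40))
W = rng.random((16, 3))
H = rng.random((3, 40))
rnn_params = (
    0.1 * rng.standard_normal((8, 3)),  # Wx: input-to-hidden
    0.1 * rng.standard_normal((8, 8)),  # Wh: hidden-to-hidden
    0.1 * rng.standard_normal((3, 8)),  # Wo: hidden-to-output
)

recon_only = regularized_cost(X, W, H, rnn_params, lam=0.0)
combined = regularized_cost(X, W, H, rnn_params, lam=0.1)
print(recon_only, combined)
```

Setting `lam=0.0` recovers the plain reconstruction cost, so the weight controls how strongly temporally implausible activity matrices are penalized. A real system would train the RNN on example activations and minimize the combined cost over `W` and `H` rather than only evaluating it.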

Assignees

Adobe Systems Inc

Inventors

Classifications

G10L21/0272
Voice signal separating · CPC title
G10L21/0308
characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques · CPC title
G06N3/044 · Primary
Recurrent networks, e.g. Hopfield networks · CPC title
G10L21/028
using properties of sound source · CPC title
G06N3/09
Supervised learning · CPC title

Patent family

Related publications grouped by family.

View patent family 53882262

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9721202B2 cover?: Sound processing techniques using recurrent neural networks are described. In one or more implementations, temporal dependencies are captured in sound data that are modeled through use of a recurrent neural network (RNN). The captured temporal dependencies are employed as part of feature extraction performed using nonnegative matrix factorization (NMF). One or more sound processing techniques a…
Who is the assignee on this patent?: Adobe Systems Inc