Granular neural network architecture search over low-level primitives
US-2024428071-A1 · Dec 26, 2024 · US
US9767410B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9767410-B1 |
| Application number | US-201514739335-A |
| Country | US |
| Kind code | B1 |
| Filing date | Jun 15, 2015 |
| Priority date | Oct 3, 2014 |
| Publication date | Sep 19, 2017 |
| Grant date | Sep 19, 2017 |
Official abstract text for this publication.
This specification describes, among other things, a computer-implemented method. The method can include training a baseline neural network using a first set of training data. For each node in a subset of interconnected nodes in the baseline neural network, a rank-k approximation of a filter for the node can be computed. A subset of nodes in a rank-constrained neural network can then be initialized with the rank-k approximations of the filters from the baseline neural network. The subset of nodes in the rank-constrained neural network can correspond to the subset of nodes in the baseline neural network. After initializing, the rank-constrained neural network can be trained using a second set of training data while maintaining a rank-k filter topology for the subset of nodes in the rank-constrained neural network.
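The steps in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the helper names `rank_k_factors` and `train_factors`, the squared-error loss, the learning rate, and the random toy data are all hypothetical. It uses a truncated singular value decomposition to compute the rank-k approximation (the claims mention SVD as one option) and keeps the filter at rank k during further training by updating only the two factors.

```python
import numpy as np

def rank_k_factors(weights, m, n, k=1):
    """Wrap a length m*n weight vector into an m x n matrix and return
    factors (A, B) such that A @ B is its truncated-SVD rank-k approximation."""
    W = np.asarray(weights, dtype=float).reshape(m, n)
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(s[:k])            # split each singular value across both factors
    A = U[:, :k] * root              # shape (m, k)
    B = root[:, None] * Vt[:k]       # shape (k, n)
    return A, B

def train_factors(A, B, inputs, targets, lr=0.01, steps=200):
    """Gradient descent on the factors only (assumed squared-error loss).
    The filter is always formed as A @ B, so its rank never exceeds k."""
    A, B = A.copy(), B.copy()
    for _ in range(steps):
        W = A @ B
        # Node response: sum of elementwise products of filter and input.
        preds = np.einsum("ij,tij->t", W, inputs)
        err = preds - targets
        grad_W = np.einsum("t,tij->ij", err, inputs) / len(targets)
        grad_A = grad_W @ B.T        # chain rule through W = A @ B
        grad_B = A.T @ grad_W
        A -= lr * grad_A
        B -= lr * grad_B
    return A, B

# Toy demonstration with made-up sizes and random data (all hypothetical).
rng = np.random.default_rng(0)
m, n, k = 4, 5, 1
baseline = rng.normal(size=m * n)    # stand-in for a trained baseline filter
A0, B0 = rank_k_factors(baseline, m, n, k)
X = rng.normal(size=(64, m, n))      # stand-in for the second training set
y = np.einsum("ij,tij->t", baseline.reshape(m, n), X)
A1, B1 = train_factors(A0, B0, X, y)
print("rank after training:", np.linalg.matrix_rank(A1 @ B1))
```

Storing the factors rather than the full matrix is one simple way to maintain the rank-k topology: no update to `A` or `B` can raise the rank of `A @ B` above k.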
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method, comprising: obtaining a first set of training data for training a baseline neural network, wherein the first set of training data comprises respective representations of audio signals for a first plurality of speech samples; training the baseline neural network using the first set of training data, the baseline neural network comprising a plurality of interconnected nodes, wherein each of the nodes comprises two or more weight values that characterize a filter for the node; for each node in a subset of the interconnected nodes in the baseline neural network, computing a rank-k approximation that characterizes the filter for the node using a reduced number of weight values; obtaining a second set of training data for training a rank-constrained neural network, wherein the second set of training data comprises respective representations of audio signals for a second plurality of speech samples; initializing a subset of nodes in the rank-constrained neural network with the rank-k approximations from the baseline neural network, the subset of nodes in the rank-constrained neural network corresponding to the subset of the interconnected nodes in the baseline neural network; after initializing the subset of nodes in the rank-constrained neural network with the rank-k approximations, training the rank-constrained neural network using the second set of training data while maintaining a rank-k filter topology for the subset of nodes in the rank-constrained neural network; after training the rank-constrained neural network using the second set of training data while maintaining the rank-k filter topology for the subset of nodes in the rank-constrained neural network, providing to the rank-constrained neural network a run-time input that represents a first audio signal for a first speech sample; calculating, using the rank-constrained neural network, respective probabilities of one or more output conditions that correspond to the run-time input; and providing a recognition output for the run-time input that is determined based at least in part on the respective probabilities of the one or more output conditions that correspond to the run-time input.

2. The computer-implemented method of claim 1, wherein: the subset of the interconnected nodes in the baseline neural network comprises nodes in a first hidden layer of the baseline neural network, to the exclusion of other nodes outside of the first hidden layer of the baseline neural network; and initializing the subset of nodes in the rank-constrained neural network comprises initializing nodes in a first hidden layer of the rank-constrained neural network with the rank-k approximations from the baseline neural network, to the exclusion of other nodes outside the first hidden layer of the rank-constrained neural network.

3. The computer-implemented method of claim 1, wherein for at least one node in the subset of the interconnected nodes in the baseline neural network, computing the rank-k approximation comprises performing a singular value decomposition on a weight vector of the rank-k approximation.

4. The computer-implemented method of claim 3, wherein performing the singular value decomposition on the weight vector of the rank-k approximation comprises wrapping the weight vector into a matrix.

5. The computer-implemented method of claim 1, wherein the second set of training data includes all or some of the training data from the first set of training data.

6. The computer-implemented method of claim 1, wherein providing the run-time input to the rank-constrained neural network, calculating the respective probabilities of the one or more output conditions that correspond to the run-time input, and providing the recognition output for the run-time input comprises performing an automatic speech recognition task with the trained rank-constrained neural network by predicting a likelihood that the first speech sample includes one or more words, or portions of a word, in a language.

7. The computer-implemented method of claim 1, wherein providing the run-time input to the rank-constrained neural network, calculating the respective probabilities of the one or more output conditions that correspond to the run-time input, and providing the recognition output for the run-time input comprises predicting a likelihood that the first speech sample includes one or more pre-defined keywords, or portions of a keyword, in the language.

8. The computer-implemented method of claim 1, wherein at least one of the first set of training data or the second set of training data comprises samples that each indicate, for each of n time frames, a level of an audio signal for each of m frequency channels during the time frame.

9. The computer-implemented method of claim 1, wherein the first audio signal for the first speech sample is capable of being visually represented by a two-dimensional spectrogram.

10. The computer-implemented method of claim 1, wherein the respective representations of audio signals for the first plurality of speech samples and the second plurality of speech samples each includes a plurality of values arranged in an m×n array.

11. The computer-implemented method of claim 10, wherein: each node in the subset of the interconnected nodes in the baseline neural network is comprised only of a number of weight values that equals the product of m and n, or that equals the product of m and n plus one; and each node in the subset of nodes in the trained rank-constrained neural network is comprised only of a number of weight values that equals the sum of m and n, or that equals the sum of m and n plus one.

12. The computer-implemented method of claim 1, wherein computing the rank-k approximation comprises computing a rank-1 approximation.

13. The computer-implemented method of claim 1, further comprising performing image processing tasks with the trained rank-constrained neural network.

14. One or more non-transitory computer-readable storage media having instructions stored thereon that, when executed by one or more processors, cause performance of operations comprising: obtaining a first set of training data for training a baseline neural network, wherein the first set of training data comprises respective representations of audio signals for a first plurality of speech samples; training the baseline neural network using the first set of training data, the baseline neural network comprising a plurality of interconnected nodes, wherein each of the nodes comprises two or more weight values that characterize a filter for the node; for each node in a subset of the interconnected nodes in the baseline neural network, computing a rank-k approximation that characterizes the filter for the node using a reduced number of weight values; obtaining a second set of training data for training a rank-constrained neural network, wherein the second set of training data comprises respective representations of audio signals for a second plurality of speech samples; initializing a subset of nodes in the rank-constrained neural network with the rank-k approximations from the baseline neural network, the subset of nodes in the rank-constrained neural network corresponding to the subset of the interconnected nodes in the baseline neural network; after initializing the subset of nodes in the rank-constrained neural network with the rank-k approximations, training the rank-constrained neural network using the second set of training data while maintaining a rank-k filter topology for the subset of nodes in the rank-constrained neural network; after training the
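
The weight counts recited in claims 10 to 12 can be checked with a short sketch. It assumes a rank-1 filter stored as two vectors `u` (length m) and `v` (length n), and it reads the optional "plus one" as a bias term; that reading, the function names, and the sizes `m = 40` and `n = 32` are illustrative assumptions rather than values from the patent. The sketch also shows that the factored node gives the same response as a full m×n filter equal to the outer product of `u` and `v`.

```python
import numpy as np

def full_response(W, b, X):
    """Baseline-style node: one weight per input cell (m*n weights) plus a bias."""
    return float(np.sum(W * X) + b)

def rank1_response(u, v, b, X):
    """Rank-1 node: m + n weights plus a bias; the implied filter is outer(u, v)."""
    return float(u @ X @ v + b)

# Hypothetical sizes: m frequency channels by n time frames.
m, n = 40, 32
rng = np.random.default_rng(1)
u, v, b = rng.normal(size=m), rng.normal(size=n), 0.5
X = rng.normal(size=(m, n))          # stand-in for one m x n input array

# Both forms compute the same value when the full filter is exactly rank 1.
same = np.isclose(full_response(np.outer(u, v), b, X), rank1_response(u, v, b, X))
print("responses match:", bool(same))

# Weight counts with the assumed bias term: m*n + 1 versus m + n + 1.
print(m * n + 1, m + n + 1)          # prints: 1281 73
```

The factored form never materializes the m×n filter, which is where the reduction from a product of m and n to a sum of m and n comes from.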
Quantised networks; Sparse networks; Compressed networks · CPC title
Supervised learning · CPC title
Feedforward networks · CPC title
Learning methods · CPC title
using artificial neural networks · CPC title