Reduction of parameters in fully connected layers of neural networks

US10509996B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10509996-B2
Application number  US-201615258691-A
Country             US
Kind code           B2
Filing date         Sep 7, 2016
Priority date       May 17, 2016
Publication date    Dec 17, 2019
Grant date          Dec 17, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure is drawn to the reduction of parameters in fully connected layers of neural networks. For a layer whose output is defined by y = Wx, where y is the output vector, x is the input vector, and W is a matrix of connection parameters, vectors u_ij and v_ij are defined and submatrices W_i,j are computed as the outer product of u_ij and v_ij, so that W_i,j = v_ij ⊗ u_ij, and W is obtained by appending submatrices W_i,j.
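The construction in the abstract can be illustrated with a minimal sketch in plain Python. It is not the patent's implementation: the function names, the nested-list layout (U[i][j] holds the t entries of u_ij, V[i][j] holds the s entries of v_ij), and the choice that entry (a, b) of each s-by-t block equals v_ij[a] * u_ij[b] are assumptions made for illustration.

```python
def outer(v, u):
    """Outer product of v (length s) and u (length t): an s-by-t block."""
    return [[v_a * u_b for u_b in u] for v_a in v]


def assemble_w(U, V):
    """Append the blocks W_ij = outer(v_ij, u_ij) into a dense m-by-n matrix.

    U[i][j] is a list of t numbers and V[i][j] is a list of s numbers
    (a hypothetical storage layout chosen for this sketch).
    """
    rows = []
    for U_row, V_row in zip(U, V):
        blocks = [outer(v, u) for u, v in zip(U_row, V_row)]
        s = len(blocks[0])
        for a in range(s):
            row = []
            for block in blocks:
                row.extend(block[a])  # place blocks side by side
            rows.append(row)
    return rows


def parameter_counts(m, n, s, t):
    """Return (dense, factored) counts of learnable parameters."""
    if m % s or n % t:
        raise ValueError("s must divide m and t must divide n")
    dense = m * n
    # One (u_ij, v_ij) pair, t + s numbers, per block; (m/s) * (n/t) blocks.
    factored = (m // s) * (n // t) * (s + t)
    return dense, factored
```

For example, with hypothetical sizes m = n = 1024 and s = t = 32, `parameter_counts(1024, 1024, 32, 32)` gives 1,048,576 parameters for the dense matrix against 65,536 for the stored vectors.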

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for reducing a number of learnable parameters in a fully connected layer of a neural network, the fully connected layer comprising n inputs and m outputs, the method comprising: defining an n-dimensional input vector x representative of n inputs of the layer of the neural network and defining an m-dimensional output vector y representative of the m outputs of the layer; selecting a divisor s of m and a divisor t of n; partitioning the output vector y into equally sized subvectors y_i of length s and partitioning the input vector x into equally sized subvectors x_j of length t; learning a vector u_ij comprising t learnable parameters and a vector v_ij comprising s learnable parameters for i = (1, ..., m/s) and j = (1, ..., n/t) during a training phase of the neural network; computing submatrices W_ij as an outer product of the vector u_ij and the vector v_ij so that W_ij = u_ij^T ⊗ v_ij; and computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij.

2. The method of claim 1, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} (W_ij x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

3. The method of claim 1, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: appending submatrices W_ij for i = (1, ..., m/s) and j = (1, ..., n/t) to obtain matrix W; and computing y = W·x.

4. The method of claim 1, further comprising storing the vectors v_ij and u_ij.

5. The method of claim 1, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} v_ij (u_ij^T x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

6. A system comprising: a processing unit; and a non-transitory memory communicatively coupled to the processing unit and comprising computer-readable program instructions executable by the processing unit for reducing a number of learnable parameters in a fully connected layer of a neural network comprising n inputs and m outputs by: defining an n-dimensional input vector x representative of the n inputs of the layer and defining an m-dimensional output vector y representative of the m outputs of the layer; selecting a divisor s of m and a divisor t of n; partitioning the output vector y into equally sized subvectors y_i of length s and partitioning the input vector x into equally sized subvectors x_j of length t; learning a vector u_ij comprising t learnable parameters and a vector v_ij comprising s learnable parameters for i = (1, ..., m/s) and j = (1, ..., n/t) during a training phase of the neural network; computing submatrices W_ij, as an outer product of the learned vector u_ij and the learned vector v_ij so that W_ij = u_ij^T ⊗ v_ij; and computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij.

7. The system of claim 6, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} (W_ij x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

8. The system of claim 6, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: appending submatrices W_ij for i = (1, ..., m/s) and j = (1, ..., n/t) to obtain matrix W; and computing y = W·x.

9. The system of claim 6, the non-transitory memory further comprising computer-readable program instructions executable by the processing unit for storing the vectors v_ij and u_ij.

10. The system of claim 6, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} v_ij (u_ij^T x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

11. A method for implementing a neural network layer comprising n inputs and m outputs, the method comprising: receiving an n-dimensional input vector x representative of the n inputs of the layer of a neural network; computing an m-dimensional output vector y representative of the m outputs of the layer of the neural network by: retrieving from memory a vector v_ij comprising s learned parameters and a vector u_ij comprising t learned parameters, wherein the vectors u_ij and v_ij are learned during a training phase of the neural network; partitioning the output vector y into equally sized subvectors y_i of length s and partitioning the input vector x into equally sized subvectors x_j of length t; computing submatrices W_ij as an outer product of the vector u_ij and the vector v_ij so that W_ij = u_ij^T ⊗ v_ij; and computing the output vector y from the input vector x and the submatrices W_ij; and outputting the output vector y as the m outputs of the neural network layer.

12. The method of claim 11, wherein computing the output vector y from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} (W_ij x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

13. The method of claim 11, wherein computing the output vector y from the input vector x and the submatrices W_ij comprises: appending submatrices W_ij to obtain matrix W; and computing y = W·x.

14. The method of claim 11, wherein computing the output vector y from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} v_ij (u_ij^T x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

15. A system for implementing a neural network layer comprising n inputs and m outputs, the system comprising: a processing unit; and a non-transitory memory communicatively coupled to the processing unit and comprising computer-readable program instructions executable by the processing unit for: receiving an n-dimensional input vector x representative of the n inputs of the layer of a neural network; computing an m-dimensional output vector y representative of the m outputs of the layer of the neural network by: retrieving from memory a vector v_ij comprising s learned parameters and a vector u_ij comprising t learned parameters, wherein the learned parameters included in the vectors v_ij and u_ij are learned during a training phase of the neural network; partitioning the output vector y into equally sized subvectors y_i of length s and partitioning the input vector x into equally sized subvectors x_j using divisor t; computing submatrices W_ij as an outer product of the vector u_ij and the vector v_ij so that W_ij = u_ij^T ⊗ v_ij; and computing the output vector y from the input vector x and the submatrices W_ij and outputting the output vector y at the m outputs of the layer of the neural network.

16. The system of claim 15, wherein computing the output vector y fr
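The formula recited in claims 5, 10 and 14, y_i = Σ_j v_ij (u_ij^T x_j), lets the layer output be computed without ever forming the matrix W. The following is a minimal sketch in plain Python under assumed conventions: the name `factored_forward` and the nested-list layout (U[i][j] for u_ij, V[i][j] for v_ij) are hypothetical and not taken from the patent.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def factored_forward(x, U, V):
    """Compute y by appending y_i = sum_j v_ij * (u_ij . x_j).

    x       : list of n numbers
    U[i][j] : list of t numbers (vector u_ij)
    V[i][j] : list of s numbers (vector v_ij)
    """
    t = len(U[0][0])
    s = len(V[0][0])
    n_blocks = len(U[0])
    if len(x) != n_blocks * t:
        raise ValueError("len(x) must equal (n/t) * t")

    # Partition x into equally sized subvectors x_j of length t.
    x_parts = [x[j * t:(j + 1) * t] for j in range(n_blocks)]

    y = []
    for U_row, V_row in zip(U, V):
        y_i = [0.0] * s
        for u, v, x_j in zip(U_row, V_row, x_parts):
            scale = dot(u, x_j)          # scalar u_ij^T x_j
            for a in range(s):
                y_i[a] += v[a] * scale   # accumulate v_ij * scalar
        y.extend(y_i)                    # append subvector y_i
    return y
```

Each block costs t multiplications for the inner product and s for the scaling, so the whole layer takes (m/s)(n/t)(s + t) multiplications instead of the m·n needed for a dense product y = W·x.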

Assignees

Inventors

Classifications

  • G06N3/04 (Primary)

    Architecture, e.g. interconnection topology · CPC title

  • Learning methods · CPC title

  • G06N3/0495 (Primary)

    Quantised networks; Sparse networks; Compressed networks · CPC title

  • Feedforward networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10509996B2 cover?
The present disclosure is drawn to the reduction of parameters in fully connected layers of neural networks. For a layer whose output is defined by y = Wx, where y is the output vector, x is the input vector, and W is a matrix of connection parameters, vectors u_ij and v_ij are defined and submatrices W_i,j are computed as the outer product of u_ij and v_ij, so that W_i,j = v_ij ⊗ u_ij, and W is obtained by …
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/04. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 17, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).