What technology area does this patent fall under?

Primary CPC classification G06N3/04. Mapped technology areas include Physics.

When was this patent published?

Publication date: Dec 17, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Reduction of parameters in fully connected layers of neural networks

US10509996B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10509996-B2
Application number	US-201615258691-A
Country	US
Kind code	B2
Filing date	Sep 7, 2016
Priority date	May 17, 2016
Publication date	Dec 17, 2019
Grant date	Dec 17, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure is drawn to the reduction of parameters in fully connected layers of neural networks. For a layer whose output is defined by y = Wx, where y is the output vector, x is the input vector, and W is a matrix of connection parameters, vectors u_ij and v_ij are defined and submatrices W_{i,j} are computed as the outer product of u_ij and v_ij, so that W_{i,j} = v_ij ⊗ u_ij, and W is obtained by appending submatrices W_{i,j}.
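The construction in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the nested-list layout of U and V, and the sizes m=4, n=6, s=2, t=3 are all assumptions chosen for illustration. Each block is read as the s-by-t matrix whose (r, c) entry is v_ij[r] * u_ij[c]. Under that reading a dense layer stores m*n parameters, while the factored form stores (m/s)*(n/t)*(s+t).

```python
import random

def outer(v, u):
    """Outer product of v (length s) and u (length t): an s-by-t block."""
    return [[vr * uc for uc in u] for vr in v]

def assemble_weight_matrix(U, V):
    """Append the blocks outer(v_ij, u_ij) into one m-by-n matrix W.

    U[i][j] is a length-t vector and V[i][j] a length-s vector
    (a hypothetical storage layout, not specified by the source).
    """
    W = []
    for i in range(len(V)):
        blocks = [outer(V[i][j], U[i][j]) for j in range(len(V[i]))]
        s = len(V[i][0])
        for r in range(s):
            row = []
            for block in blocks:
                row.extend(block[r])  # place blocks side by side
            W.append(row)
    return W

def matvec(W, x):
    """Plain dense product y = W x."""
    return [sum(w * xk for w, xk in zip(row, x)) for row in W]

# Illustrative sizes only: s divides m and t divides n.
m, n, s, t = 4, 6, 2, 3
rng = random.Random(0)
U = [[[rng.uniform(-1, 1) for _ in range(t)] for _ in range(n // t)]
     for _ in range(m // s)]
V = [[[rng.uniform(-1, 1) for _ in range(s)] for _ in range(n // t)]
     for _ in range(m // s)]

W = assemble_weight_matrix(U, V)
x = [rng.uniform(-1, 1) for _ in range(n)]
y = matvec(W, x)

dense_params = m * n                             # every entry of W learned
factored_params = (m // s) * (n // t) * (s + t)  # only u_ij and v_ij learned
print(len(W), len(W[0]), dense_params, factored_params)
```

With these toy sizes the saving is small (24 versus 20 parameters). The gap widens as s and t grow, because each block costs s+t parameters instead of s*t.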

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for reducing a number of learnable parameters in a fully connected layer of a neural network, the fully connected layer comprising n inputs and m outputs, the method comprising: defining an n-dimensional input vector x representative of n inputs of the layer of the neural network and defining an m-dimensional output vector y representative of the m outputs of the layer; selecting a divisor s of m and a divisor t of n; partitioning the output vector y into equally sized subvectors y_i of length s and partitioning the input vector x into equally sized subvectors x_j of length t; learning a vector u_ij comprising t learnable parameters and a vector v_ij comprising s learnable parameters for i = (1, ..., m/s) and j = (1, ..., n/t) during a training phase of the neural network; computing submatrices W_ij as an outer product of the vector u_ij and the vector v_ij so that W_ij = u_ij^T ⊗ v_ij; and computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij.

2. The method of claim 1, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} (W_ij x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

3. The method of claim 1, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: appending submatrices W_ij for i = (1, ..., m/s) and j = (1, ..., n/t) to obtain matrix W; and computing y = W·x.

4. The method of claim 1, further comprising storing the vectors v_ij and u_ij.

5. The method of claim 1, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} v_ij (u_ij^T x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

6. A system comprising: a processing unit; and a non-transitory memory communicatively coupled to the processing unit and comprising computer-readable program instructions executable by the processing unit for reducing a number of learnable parameters in a fully connected layer of a neural network comprising n inputs and m outputs by: defining an n-dimensional input vector x representative of the n inputs of the layer and defining an m-dimensional output vector y representative of the m outputs of the layer; selecting a divisor s of m and a divisor t of n; partitioning the output vector y into equally sized subvectors y_i of length s and partitioning the input vector x into equally sized subvectors x_j of length t; learning a vector u_ij comprising t learnable parameters and a vector v_ij comprising s learnable parameters for i = (1, ..., m/s) and j = (1, ..., n/t) during a training phase of the neural network; computing submatrices W_ij as an outer product of the learned vector u_ij and the learned vector v_ij so that W_ij = u_ij^T ⊗ v_ij; and computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij.

7. The system of claim 6, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} (W_ij x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

8. The system of claim 6, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: appending submatrices W_ij for i = (1, ..., m/s) and j = (1, ..., n/t) to obtain matrix W; and computing y = W·x.

9. The system of claim 6, the non-transitory memory further comprising computer-readable program instructions executable by the processing unit for storing the vectors v_ij and u_ij.

10. The system of claim 6, wherein computing the output vector y representative of the m outputs of the layer from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} v_ij (u_ij^T x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

11. A method for implementing a neural network layer comprising n inputs and m outputs, the method comprising: receiving an n-dimensional input vector x representative of the n inputs of the layer of a neural network; computing an m-dimensional output vector y representative of the m outputs of the layer of the neural network by: retrieving from memory a vector v_ij comprising s learned parameters and a vector u_ij comprising t learned parameters, wherein the vectors u_ij and v_ij are learned during a training phase of the neural network; partitioning the output vector y into equally sized subvectors y_i of length s and partitioning the input vector x into equally sized subvectors x_j of length t; computing submatrices W_ij as an outer product of the vector u_ij and the vector v_ij so that W_ij = u_ij^T ⊗ v_ij; and computing the output vector y from the input vector x and the submatrices W_ij; and outputting the output vector y as the m outputs of the neural network layer.

12. The method of claim 11, wherein computing the output vector y from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} (W_ij x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

13. The method of claim 11, wherein computing the output vector y from the input vector x and the submatrices W_ij comprises: appending submatrices W_ij to obtain matrix W; and computing y = W·x.

14. The method of claim 11, wherein computing the output vector y from the input vector x and the submatrices W_ij comprises: computing y_i = Σ_{j=1}^{n/t} v_ij (u_ij^T x_j) for i = (1, ..., m/s); and appending all subvectors y_i to obtain the output vector y as y = [y_1, y_2, y_3, ..., y_{m/s}].

15. A system for implementing a neural network layer comprising n inputs and m outputs, the system comprising: a processing unit; and a non-transitory memory communicatively coupled to the processing unit and comprising computer-readable program instructions executable by the processing unit for: receiving an n-dimensional input vector x representative of the n inputs of the layer of a neural network; computing an m-dimensional output vector y representative of the m outputs of the layer of the neural network by: retrieving from memory a vector v_ij comprising s learned parameters and a vector u_ij comprising t learned parameters, wherein the learned parameters included in the vectors v_ij and u_ij are learned during a training phase of the neural network; partitioning the output vector y into equally sized subvectors y_i of length s and partitioning the input vector x into equally sized subvectors x_j using divisor t; computing submatrices W_ij as an outer product of the vector u_ij and the vector v_ij so that W_ij = u_ij^T ⊗ v_ij; and computing the output vector y from the input vector x and the submatrices W_ij; and outputting the output vector y at the m outputs of the layer of the neural network.

16. The system of claim 15, wherein computing the output vector y fr…
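The two computation routes described in the dependent claims can be compared with a short sketch. One route forms each W_ij and sums W_ij x_j. The other computes v_ij (u_ij^T x_j) directly. The sketch is an illustration under stated assumptions, not the patented implementation: the function names, the nested-list storage of U and V, and the sizes are hypothetical, and W_ij is read as the s-by-t block with entries v_ij[r] * u_ij[c], which is the reading consistent with the v_ij (u_ij^T x_j) form.

```python
import random

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def factored_forward(U, V, x, s, t):
    """y_i = sum_j v_ij * (u_ij . x_j), without forming any block W_ij."""
    y = []
    for i in range(len(V)):
        y_i = [0.0] * s
        for j in range(len(V[i])):
            x_j = x[j * t:(j + 1) * t]   # j-th input subvector, length t
            scale = dot(U[i][j], x_j)    # the scalar u_ij^T x_j
            for r in range(s):
                y_i[r] += V[i][j][r] * scale
        y.extend(y_i)                    # append subvectors y_1, y_2, ...
    return y

def blockwise_forward(U, V, x, s, t):
    """y_i = sum_j W_ij x_j, with each s-by-t block W_ij formed explicitly."""
    y = []
    for i in range(len(V)):
        y_i = [0.0] * s
        for j in range(len(V[i])):
            x_j = x[j * t:(j + 1) * t]
            W_ij = [[V[i][j][r] * U[i][j][c] for c in range(t)]
                    for r in range(s)]
            for r in range(s):
                y_i[r] += dot(W_ij[r], x_j)
        y.extend(y_i)
    return y

# Illustrative sizes only: s divides m and t divides n.
m, n, s, t = 6, 8, 3, 4
rng = random.Random(1)
U = [[[rng.uniform(-1, 1) for _ in range(t)] for _ in range(n // t)]
     for _ in range(m // s)]
V = [[[rng.uniform(-1, 1) for _ in range(s)] for _ in range(n // t)]
     for _ in range(m // s)]
x = [rng.uniform(-1, 1) for _ in range(n)]

y_fact = factored_forward(U, V, x, s, t)
y_block = blockwise_forward(U, V, x, s, t)
max_diff = max(abs(a - b) for a, b in zip(y_fact, y_block))
print(len(y_fact), max_diff < 1e-12)
```

In this sketch the factored route needs about s+t multiplications per block (t for the inner product, s for the scaling) instead of s*t, and it never stores W_ij. The two results agree up to floating-point rounding.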

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

G06N3/04 · Primary
Architecture, e.g. interconnection topology · CPC title
G06N3/08
Learning methods · CPC title
G06N3/0495 · Primary
Quantised networks; Sparse networks; Compressed networks · CPC title
G06N3/0499
Feedforward networks · CPC title

Patent family

Related publications grouped by family.

View patent family 60325710

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10509996B2 cover?: The present disclosure is drawn to the reduction of parameters in fully connected layers of neural networks. For a layer whose output is defined by y = Wx, where y is the output vector, x is the input vector, and W is a matrix of connection parameters, vectors u_ij and v_ij are defined and submatrices W_{i,j} are computed as the outer product of u_ij and v_ij, so that W_{i,j} = v_ij ⊗ u_ij, and W is obtained by …
Who is the assignee on this patent?: Huawei Tech Co Ltd