Systems and methods for modifying neural networks for binary processing applications
US-2021073650-A1 · Mar 11, 2021 · US
US12423567B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12423567-B2 |
| Application number | US-202117485073-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 24, 2021 |
| Priority date | Sep 24, 2021 |
| Publication date | Sep 23, 2025 |
| Grant date | Sep 23, 2025 |
Official abstract text for this publication.
A system comprises an analog resistive processing unit (RPU) system, and one or more processors. The analog RPU system comprises an array of RPU cells. The one or more processors are configured to: configure the analog RPU system to implement a convolutional neural network comprising a convolutional layer comprising at least one kernel matrix; program the at least one array of RPU cells to store a transformed kernel matrix which is generated by applying a first transformation process to the kernel matrix using a first predefined transformation matrix; and utilize the analog RPU system to perform an analog convolution operation by performing analog matrix-vector multiplication operations using the transformed kernel matrix and input vectors of a transformed data matrix, to thereby generate a transformed convolution output matrix, wherein the transformed data matrix is generated by applying a second transformation process to a data matrix using a second predefined transformation matrix.
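The flow in the abstract (transform the kernel, transform the data, multiply in the transformed domain) can be sketched digitally. The following is a minimal NumPy sketch, assuming the widely published Winograd F(2x2, 3x3) matrices; claim 4 names a Winograd filtering function, but the specific matrices and the names `G`, `BT`, `AT`, `winograd_tile`, and `direct_tile` are illustrative assumptions, not taken from this page. The elementwise product stands in for the analog stage and does not model RPU hardware or how the multiplications are mapped onto matrix-vector operations on the array.

```python
import numpy as np

# Assumed Winograd F(2x2, 3x3) matrices (standard published values,
# not taken from the patent page).
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])          # kernel ("first") transformation
BT = np.array([[1.0, 0.0, -1.0, 0.0],
               [0.0, 1.0, 1.0, 0.0],
               [0.0, -1.0, 1.0, 0.0],
               [0.0, 1.0, 0.0, -1.0]])   # data ("second") transformation
AT = np.array([[1.0, 1.0, 1.0, 0.0],
               [0.0, 1.0, -1.0, -1.0]])  # inverse (output) transformation


def winograd_tile(d, g):
    """Convolve one 4x4 data tile d with a 3x3 kernel g, giving a 2x2 output."""
    U = G @ g @ G.T        # transformed kernel matrix (4x4)
    V = BT @ d @ BT.T      # transformed data matrix (4x4)
    M = U * V              # product in the transformed domain (digital stand-in)
    return AT @ M @ AT.T   # back to the spatial domain (2x2)


def direct_tile(d, g):
    """Reference: sliding-window convolution (CNN convention, no kernel flip)."""
    out = np.empty((2, 2))
    for i in range(2):
        for j in range(2):
            out[i, j] = np.sum(d[i:i + 3, j:j + 3] * g)
    return out
```

For any 4x4 tile and 3x3 kernel, `winograd_tile` and `direct_tile` agree to floating-point tolerance, which is a quick way to check that the three transformation matrices are mutually consistent.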
Opening claim text (preview).
What is claimed is:
1. A system, comprising: an analog resistive processing unit system comprising at least one array of resistive processing unit cells; and one or more processors configured to: configure the analog resistive processing unit system to implement a convolutional neural network comprising a convolutional layer, wherein the convolutional layer comprises at least one kernel matrix; program the at least one array of resistive processing unit cells to store a transformed kernel matrix, wherein the transformed kernel matrix is generated by applying a first transformation process to the at least one kernel matrix using a first predefined transformation matrix; and utilize the analog resistive processing unit system to perform an analog convolution operation by performing analog matrix-vector multiplication operations using the transformed kernel matrix and input vectors of a transformed data matrix, to thereby generate a transformed convolution output matrix, wherein the transformed data matrix is generated by applying a second transformation process to a data matrix using a second predefined transformation matrix; wherein the one or more processors are configured to generate the transformed data matrix by utilizing the analog resistive processing unit system to perform the second transformation process in an analog domain.
2. The system of claim 1, wherein the analog convolution operation is performed as part of a model training process to train the convolutional neural network implemented on the analog resistive processing unit system.
3. The system of claim 1, wherein the analog convolution operation is performed as part of an inference process that is performed using the convolutional neural network implemented on the analog resistive processing unit system.
4. The system of claim 1, wherein the first and second transformation processes and the analog convolution operation are implemented according to a Winograd filtering function.
5. The system of claim 1, wherein the one or more processors are configured to compute the transformed kernel matrix in a digital domain by a process which comprises multiplying the first predefined transformation matrix and the at least one kernel matrix to generate an intermediate matrix, and multiplying the intermediate matrix and a transpose of first predefined transformation matrix to thereby generate the transformed kernel matrix.
6. The system of claim 1, wherein the convolutional layer comprises a plurality of kernel matrices, and wherein the one or more processors are configured to: generate a corresponding transformed kernel matrix for each kernel matrix of the plurality of kernel matrices; and store each transformed kernel matrix in a separately addressable region of the at least one array of resistive processing unit cells.
7. The system of claim 6, wherein in utilizing the analog resistive processing unit system to perform the analog convolution operation, the one or more processors are configured perform a pipeline parallel process by applying each input vector of the transformed data matrix to a corresponding one of the transformed kernel matrices stored in the separately addressable regions of the at least one array of resistive processing unit cells.
8. The system of claim 1, wherein in performing the second transformation process in the analog domain, the one or more processors are configured to: program the at least one array of resistive processing unit cells to store the second predefined transformation matrix; and perform analog matrix-vector multiplication operations by inputting vectors of the data matrix to the stored second predefined transformation matrix.
9. The system of claim 1, wherein in performing the second transformation process in the analog domain, the one or more processors are configured to: program the at least one array of resistive processing unit cells to store the second predefined transformation matrix; program the at least one array of resistive processing unit cells to store a transpose of the second predefined transformation matrix; perform analog matrix-vector multiplication operations by sequentially inputting vectors of the data matrix to the stored second predefined transformation matrix to generate a corresponding sequence of intermediate output vectors; and perform analog matrix-vector multiplication operations by inputting the sequence of intermediate output vectors to the stored transpose of the second predefined transformation matrix to thereby generate plurality of output vectors which are combined to generate the transformed data matrix.
10. The system of claim 1, wherein the one or more processors are configured to utilize the analog resistive processing unit system to perform an analog inverse transformation process using a third transformation matrix stored in the at least one array of resistive processing unit cells to thereby convert the transformed convolution output matrix to a convolution output matrix in a spatial domain.
11. The system of claim 1, wherein the one or more processors are configured to utilize the analog resistive processing unit system to generate a transformed error gradient matrix which is backpropagated through layers of the convolutional neural network implemented on the analog resistive processing unit system, wherein the transformed error gradient matrix is utilized to update transformed kernel weight of the transformed kernel matrix stored in the at least one array of resistive processing unit cells.
12. A computer program product, comprising: one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions comprising: program instructions to configure an analog resistive processing unit system to implement a convolutional neural network comprising a convolutional layer, wherein the convolutional layer comprises at least one kernel matrix, wherein the analog resistive processing unit system comprises at least one array of resistive processing unit cells; and program instructions to program the at least one array of resistive processing unit cells to store a transformed kernel matrix, wherein the transformed kernel matrix is generated by applying a first transformation process to the at least one kernel matrix using a first predefined transformation matrix; program instructions to utilize the analog resistive processing unit system to perform an analog convolution operation by performing analog matrix-vector multiplication operations using the transformed kernel matrix and input vectors of a transformed data matrix, to thereby generate a transformed convolution output matrix, wherein the transformed data matrix is generated by applying a second transformation process to a data matrix using a second predefined transformation matrix; and program instructions to generate the transformed data matrix by utilizing the analog resistive processing unit system to perform the second transformation process in an analog domain.
13. The computer program product of claim 12, wherein the first and second transformation processes and the analog convolution operation are implemented according to a Winograd filtering function.
14. The computer program product of claim 12, wherein the convolutional layer comprises a plurality of kernel matrices, and further comprising: program instructions to generate a corresponding transformed kernel matrix for each kernel matrix of the plurality of kernel matrices; and program instructions to store each transformed kernel matrix in a separately addressable region o
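Claim 11 describes backpropagating a transformed error gradient and using it to update the transformed kernel weights directly. The following is a minimal digital sketch of one way such a gradient can be formed, assuming the common Winograd F(2x2, 3x3) output matrix and a plain gradient-descent step. The names `forward`, `transformed_error_gradient`, `update_transformed_kernel`, and the learning rate `lr` are hypothetical, and the sketch does not model the analog update mechanism of an RPU array.

```python
import numpy as np

# Assumed F(2x2, 3x3) output transformation matrix (standard published values).
AT = np.array([[1.0, 1.0, 1.0, 0.0],
               [0.0, 1.0, -1.0, -1.0]])


def forward(U, V):
    """Spatial 2x2 output from transformed kernel U and transformed data V (both 4x4)."""
    return AT @ (U * V) @ AT.T


def transformed_error_gradient(dY):
    """Map a 2x2 spatial-domain error gradient dY into the 4x4 transformed domain."""
    return AT.T @ dY @ AT


def update_transformed_kernel(U, V, dY, lr=0.1):
    """One gradient-descent step applied to the transformed kernel weights."""
    dU = transformed_error_gradient(dY) * V  # chain rule through the elementwise product
    return U - lr * dU
```

Because the update acts on `U` itself, the kernel never has to be mapped back to its 3x3 spatial form during training in this sketch.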
for multiplication or division {(G06G7/19 and G06G7/24 take precedence; measuring electric power G01R21/00)} · CPC title
Backpropagation, e.g. using gradient descent · CPC title
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
Analogue means · CPC title
Convolutional networks [CNN, ConvNet] · CPC title