Granular neural network architecture search over low-level primitives
US-2024428071-A1 · Dec 26, 2024 · US
US11556764B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11556764-B2 |
| Application number | US-201916290117-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 1, 2019 |
| Priority date | Mar 1, 2019 |
| Publication date | Jan 17, 2023 |
| Grant date | Jan 17, 2023 |
Official abstract text for this publication.
Systems and methods for deriving a concordant software neural network layer are provided. A method includes receiving first instructions configured to, using a neural network processor (NNP), process a first set of data corresponding to a neural network layer, where the NNP is configured to quantize the first set of the data to generate a set of quantized data and then perform matrix-vector multiply operations on the set of quantized data using a matrix-vector-multiplier incorporated within hardware associated with the NNP to generate a first set of results. The method further includes processing the first instructions to automatically generate second instructions configured for use with at least one processor, different from the NNP, such that the second instructions, when executed by the at least one processor to perform matrix multiply operations, generate a second set of results that are concordant with the first set of results.
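The abstract's central idea, that software on a different processor reproduces the quantize-then-multiply behaviour of the NNP, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the helper names `quantize_bfp`, `matvec`, and `concordant_layer`, the per-row quantization blocks, and the choice of a shared-exponent (block floating point) format with a small mantissa. The patent does not disclose this code; it only states that the NNP quantizes data before its matrix-vector multiply and that the generated software yields concordant results.

```python
import math


def quantize_bfp(values, mantissa_bits=5):
    """Quantize one block of floats to a shared-exponent form.

    Hypothetical helper: the whole block shares one exponent and each
    value keeps `mantissa_bits` bits of magnitude.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0.0] * len(values)
    # frexp gives max_abs = m * 2**shared_exp with 0.5 <= m < 1.
    _, shared_exp = math.frexp(max_abs)
    scale = 2.0 ** (shared_exp - mantissa_bits)
    limit = 2 ** mantissa_bits - 1
    quantized = []
    for v in values:
        q = round(v / scale)
        q = max(-limit, min(limit, q))  # clamp to the mantissa range
        quantized.append(q * scale)
    return quantized


def matvec(matrix, vector):
    """Plain matrix-vector multiply on lists of floats."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def concordant_layer(matrix, vector, mantissa_bits=5):
    """Software layer that quantizes its inputs before multiplying.

    Assumption: each matrix row and the input vector form one block.
    """
    q_vector = quantize_bfp(vector, mantissa_bits)
    q_matrix = [quantize_bfp(row, mantissa_bits) for row in matrix]
    return matvec(q_matrix, q_vector)
```

In this sketch, a software layer that skipped `quantize_bfp` would compute on full-precision inputs and could drift from the hardware result, which is the mismatch the concordant code is meant to avoid.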
Opening claim text (preview).
What is claimed:

1. A method comprising: receiving firmware code corresponding to a neural network layer, wherein a neural network processor is configured to quantize a first set of the data to generate a set of quantized data and then perform matrix-vector multiply operations on the set of quantized data using a matrix-vector-multiplier incorporated within hardware associated with the neural network processor to generate a first set of results; and using concordance conversion code, converting the firmware code corresponding to the neural network layer into concordant software code configured for use with at least one processor, different from the neural network processor, such that the concordant software code, when executed by the at least one processor to perform matrix multiply operations corresponding to the neural network layer, generate a second set of results that are concordant with the first set of results.
2. The method of claim 1, wherein the converting the firmware code corresponding to the neural network layer into concordant software code further comprises extracting information concerning dependencies between the matrix-vector multiply operations and operations selected from among a softmax operation, a ReLU operation, or an addition operation.
3. The method of claim 1, wherein the first set of data is represented in a first precision format having a first precision and the set of quantized data is represented in a second precision format having a second precision lower than the first precision.
4. The method of claim 3, wherein the first precision format comprises floating point format, and wherein the second precision format comprises a precision format selected from one of an integer format, a reduced floating point precision format, or a block floating point format.
5. The method of claim 1, wherein the set of quantized data comprises a set of quantized training data for use with operations associated with the concordant software code.
6. The method of claim 1, wherein the first set of data is organized in an N by N matrix form, and wherein N is an integer greater than 1 and N is a native dimension associated with the matrix-vector-multiplier, and wherein the converting the firmware code corresponding to the neural network layer into concordant software code comprises transforming the first set of data from the N by N matrix form to another form suitable for use with the concordant software code.
7. The method of claim 1, wherein the converting the firmware code corresponding to the neural network layer into concordant software code comprises transforming a form of the first set of data to another form suitable for use with the concordant software code.
8. A system comprising: at least one processor; and a memory comprising: firmware code corresponding to a neural network layer configured to, using a neural network processor having a matrix-vector-multiplier incorporated within hardware associated with the neural network processor and a multi-function unit incorporated within the hardware associated with the neural network processor, quantize a first set of data to generate a first set of quantized data and then: (1) perform matrix operations on the first set of quantized data, using the matrix-vector-multiplier incorporated within the hardware associated with the neural network processor, to generate a first set of output data, (2) quantize the first set of output data to generate a first set of quantized output data, and (3) perform scalar operations, using the multi-function unit incorporated within the hardware associated with the neural network processor, on the first set of quantized output data to generate a second set of output data; and concordance conversion code configured to process the firmware code to generate concordant software code configured for use with the at least one processor, different from the neural network processor, wherein the concordant software code comprises instructions for performing matrix multiply operations and instructions for performing scalar operations to process the neural network layer.
9. The system of claim 8, wherein the concordance conversion code is further configured to extract information concerning dependencies between the matrix-vector multiply operations and operations selected from among a softmax operation, a ReLU operation, or an addition operation.
10. The system of claim 8, wherein the first set of data is represented in a first precision format having a first precision, and wherein each of the first set of quantized data and the first set of quantized output data is represented in a second precision format having a second precision lower than the first precision.
11. The system of claim 10, wherein the first precision format comprises floating point format, and wherein the second precision format comprises a precision format selected from one of an integer format, a reduced floating point precision format, or a block floating point format.
12. The system of claim 8, wherein the set of quantized data comprises a set of quantized training data for use with operations associated with the concordant software code.
13. The system of claim 8, wherein the first set of data is organized in an N by N matrix form, and wherein N is an integer greater than 1 and N is a native dimension associated with the matrix-vector-multiplier, and wherein the concordance conversion code further comprises instructions configured to transform the first set of data from the N by N matrix form to another form suitable for use with the concordant software code.
14. The system of claim 8, wherein the concordance conversion code further comprises instructions configured to transform a form of the first set of data to another form suitable for use with the concordant software code.
15. A non-transitory computer-readable medium comprising code corresponding to a method, the method comprising: receiving firmware code corresponding to a neural network layer, wherein a neural network processor is configured to quantize a first set of the data to generate a set of quantized data and then perform matrix-vector multiply operations on the set of quantized data using a matrix-vector-multiplier incorporated within hardware associated with the neural network processor to generate a first set of results; and using concordance conversion code, converting the firmware code corresponding to the neural network layer into concordant software code configured for use with at least one processor, different from the neural network processor, such that the concordant software code, when executed by the at least one processor to perform matrix multiply operations corresponding to the neural network layer, generate a second set of results that are concordant with the first set of results.
16. The non-transitory computer-readable medium of claim 15, wherein the converting the firmware code corresponding to the neural network layer into concordant software code further comprises extracting information concerning dependencies between the matrix-vector multiply operations and operations selected from among a softmax operation, a ReLU operation, or an addition operation.
17. The non-transitory computer-readable medium of claim 15, wherein the first set of data is represented in a first precision format having a first precision and the set of quantized data is represented in a second precision format having a second precision lower than the first precision.
18. The non-transitory computer-readable medium of claim 17, wherein the first pr
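
Claims 6 and 13 describe transforming data held in the multiplier's native N by N matrix form into another form suitable for the concordant software code. A minimal sketch of one such transformation follows, assuming (this is not stated in the claims) that the native form is a grid of N by N tiles and the target form is an ordinary row-major matrix. The function names `matrix_to_tiles` and `tiles_to_matrix` are hypothetical.

```python
def matrix_to_tiles(matrix, n):
    """Split a row-major matrix into a grid of n x n tiles.

    Assumption: both dimensions are exact multiples of n.
    """
    rows, cols = len(matrix), len(matrix[0])
    if rows % n or cols % n:
        raise ValueError("matrix dimensions must be multiples of n")
    return [
        [
            [matrix[tr * n + r][tc * n:(tc + 1) * n] for r in range(n)]
            for tc in range(cols // n)
        ]
        for tr in range(rows // n)
    ]


def tiles_to_matrix(tiles, n):
    """Reassemble a grid of n x n tiles into one row-major matrix."""
    matrix = []
    for tile_row in tiles:
        for r in range(n):
            row = []
            for tile in tile_row:
                row.extend(tile[r])  # row r of each tile, left to right
            matrix.append(row)
    return matrix
```

Round-tripping a matrix through both functions returns the original, which is a simple way to check that a layout change of this kind does not alter the data that the software layer multiplies.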
Recurrent networks, e.g. Hopfield networks · CPC title
Shells for specifying net layout · CPC title
Activation functions · CPC title
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
Learning methods · CPC title