Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06N3/08. Mapped technology areas include Physics.

When was this patent published?

Publication date Jan 17, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Deriving a concordant software neural network layer from a quantized firmware neural network layer

US11556764B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11556764-B2
Application number	US-201916290117-A
Country	US
Kind code	B2
Filing date	Mar 1, 2019
Priority date	Mar 1, 2019
Publication date	Jan 17, 2023
Grant date	Jan 17, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for deriving a concordant software neural network layer are provided. A method includes receiving first instructions configured to, using a neural network processor (NNP), process a first set of data corresponding to a neural network layer, where the NNP is configured to quantize the first set of the data to generate a set of quantized data and then perform matrix-vector multiply operations on the set of quantized data using a matrix-vector-multiplier incorporated within hardware associated with the NNP to generate a first set of results. The method further includes processing the first instructions to automatically generate second instructions configured for use with at least one processor, different from the NNP, such that the second instructions, when executed by the at least one processor to perform matrix multiply operations, generate a second set of results that are concordant with the first set of results.
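The quantize-then-multiply behavior described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the function names (`quantize_bfp`, `matvec`, `quantized_layer`), the choice of a block floating point format with one shared exponent per row, and the 2-bit mantissa are all assumptions made for illustration. The point is only that software which replays the same quantization step reproduces the quantized result, while a plain floating point multiply does not.

```python
import math

def quantize_bfp(block, mantissa_bits=5):
    """Quantize floats to an assumed block floating point format:
    one shared exponent per block, clamped signed integer mantissas."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0.0] * len(block)
    # frexp gives max_abs = m * 2**exp with 0.5 <= m < 1.
    shared_exp = math.frexp(max_abs)[1]
    scale = 2.0 ** (shared_exp - mantissa_bits)
    limit = 2 ** mantissa_bits - 1
    quantized = []
    for x in block:
        mantissa = max(-limit, min(limit, round(x / scale)))
        quantized.append(mantissa * scale)
    return quantized

def matvec(matrix, vector):
    """Plain matrix-vector multiply with no quantization."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def quantized_layer(matrix, vector, mantissa_bits=5):
    """Software stand-in for the hardware path: quantize first, then multiply."""
    q_rows = [quantize_bfp(row, mantissa_bits) for row in matrix]
    q_vec = quantize_bfp(vector, mantissa_bits)
    return matvec(q_rows, q_vec)

# Hypothetical weights and input, chosen only for the demonstration.
weights = [[1.0, 0.3], [0.0, -0.75]]
inputs = [0.5, 0.26]
concordant = quantized_layer(weights, inputs, mantissa_bits=2)
unquantized = matvec(weights, inputs)
```

With these inputs, `concordant` and `unquantized` differ, which is the gap a concordant software layer is meant to close.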

First claim

Opening claim text (preview).

What is claimed:
1. A method comprising: receiving firmware code corresponding to a neural network layer, wherein a neural network processor is configured to quantize a first set of the data to generate a set of quantized data and then perform matrix-vector multiply operations on the set of quantized data using a matrix-vector-multiplier incorporated within hardware associated with the neural network processor to generate a first set of results; and using concordance conversion code, converting the firmware code corresponding to the neural network layer into concordant software code configured for use with at least one processor, different from the neural network processor, such that the concordant software code, when executed by the at least one processor to perform matrix multiply operations corresponding to the neural network layer, generate a second set of results that are concordant with the first set of results.
2. The method of claim 1, wherein the converting the firmware code corresponding to the neural network layer into concordant software code further comprises extracting information concerning dependencies between the matrix-vector multiply operations and operations selected from among a softmax operation, a ReLU operation, or an addition operation.
3. The method of claim 1, wherein the first set of data is represented in a first precision format having a first precision and the set of quantized data is represented in a second precision format having a second precision lower than the first precision.
4. The method of claim 3, wherein the first precision format comprises floating point format, and wherein the second precision format comprises a precision format selected from one of an integer format, a reduced floating point precision format, or a block floating point format.
5. The method of claim 1, wherein the set of quantized data comprises a set of quantized training data for use with operations associated with the concordant software code.
6. The method of claim 1, wherein the first set of data is organized in an N by N matrix form, and wherein N is an integer greater than 1 and N is a native dimension associated with the matrix-vector-multiplier, and wherein the converting the firmware code corresponding to the neural network layer into concordant software code comprises transforming the first set of data from the N by N matrix form to another form suitable for use with the concordant software code.
7. The method of claim 1, wherein the converting the firmware code corresponding to the neural network layer into concordant software code comprises transforming a form of the first set of data to another form suitable for use with the concordant software code.
8. A system comprising: at least one processor; and a memory comprising: firmware code corresponding to a neural network layer configured to, using a neural network processor having a matrix-vector-multiplier incorporated within hardware associated with the neural network processor and a multi-function unit incorporated within the hardware associated with the neural network processor, quantize a first set of data to generate a first set of quantized data and then: (1) perform matrix operations on the first set of quantized data, using the matrix-vector-multiplier incorporated within the hardware associated with the neural network processor, to generate a first set of output data, (2) quantize the first set of output data to generate a first set of quantized output data, and (3) perform scalar operations, using the multi-function unit incorporated within the hardware associated with the neural network processor, on the first set of quantized output data to generate a second set of output data; and concordance conversion code configured to process the firmware code to generate concordant software code configured for use with the at least one processor, different from the neural network processor, wherein the concordant software code comprises instructions for performing matrix multiply operations and instructions for performing scalar operations to process the neural network layer.
9. The system of claim 8, wherein the concordance conversion code is further configured to extract information concerning dependencies between the matrix-vector multiply operations and operations selected from among a softmax operation, a ReLU operation, or an addition operation.
10. The system of claim 8, wherein the first set of data is represented in a first precision format having a first precision, and wherein each of the first set of quantized data and the first set of quantized output data is represented in a second precision format having a second precision lower than the first precision.
11. The system of claim 10, wherein the first precision format comprises floating point format, and wherein the second precision format comprises a precision format selected from one of an integer format, a reduced floating point precision format, or a block floating point format.
12. The system of claim 8, wherein the set of quantized data comprises a set of quantized training data for use with operations associated with the concordant software code.
13. The system of claim 8, wherein the first set of data is organized in an N by N matrix form, and wherein N is an integer greater than 1 and N is a native dimension associated with the matrix-vector-multiplier, and wherein the concordance conversion code further comprises instructions configured to transform the first set of data from the N by N matrix form to another form suitable for use with the concordant software code.
14. The system of claim 8, wherein the concordance conversion code further comprises instructions configured to transform a form of the first set of data to another form suitable for use with the concordant software code.
15. A non-transitory computer-readable medium comprising code corresponding to a method, the method comprising: receiving firmware code corresponding to a neural network layer, wherein a neural network processor is configured to quantize a first set of the data to generate a set of quantized data and then perform matrix-vector multiply operations on the set of quantized data using a matrix-vector-multiplier incorporated within hardware associated with the neural network processor to generate a first set of results; and using concordance conversion code, converting the firmware code corresponding to the neural network layer into concordant software code configured for use with at least one processor, different from the neural network processor, such that the concordant software code, when executed by the at least one processor to perform matrix multiply operations corresponding to the neural network layer, generate a second set of results that are concordant with the first set of results.
16. The non-transitory computer-readable medium of claim 15, wherein the converting the firmware code corresponding to the neural network layer into concordant software code further comprises extracting information concerning dependencies between the matrix-vector multiply operations and operations selected from among a softmax operation, a ReLU operation, or an addition operation.
17. The non-transitory computer-readable medium of claim 15, wherein the first set of data is represented in a first precision format having a first precision and the set of quantized data is represented in a second precision format having a second precision lower than the first precision.
18. The non-transitory computer-readable medium of claim 17, wherein the first pr…
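Claims 6 and 13 refer to transforming data from an N by N matrix form, where N is the native dimension of the matrix-vector-multiplier, into another form usable by the software code. The page does not say what either layout looks like, so the following is only a sketch under stated assumptions: the native form is taken to be a row-major grid of zero-padded N by N tiles, the software form is an ordinary list-of-rows matrix, and the function names `matrix_to_tiles` and `tiles_to_matrix` are hypothetical.

```python
def matrix_to_tiles(matrix, n):
    """Split a rows x cols matrix into a grid of n x n tiles.
    Edge tiles are zero-padded (an assumed convention)."""
    rows, cols = len(matrix), len(matrix[0])
    tile_rows = -(-rows // n)  # ceiling division
    tile_cols = -(-cols // n)
    grid = []
    for tr in range(tile_rows):
        grid_row = []
        for tc in range(tile_cols):
            tile = [
                [
                    matrix[tr * n + i][tc * n + j]
                    if tr * n + i < rows and tc * n + j < cols
                    else 0.0
                    for j in range(n)
                ]
                for i in range(n)
            ]
            grid_row.append(tile)
        grid.append(grid_row)
    return grid

def tiles_to_matrix(grid, n, rows, cols):
    """Reassemble the rows x cols matrix from n x n tiles, dropping padding."""
    return [
        [grid[r // n][c // n][r % n][c % n] for c in range(cols)]
        for r in range(rows)
    ]

# Hypothetical 3 x 3 weight matrix and a native dimension of 2.
original = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
tiles = matrix_to_tiles(original, 2)
restored = tiles_to_matrix(tiles, 2, 3, 3)
```

Because the padding is dropped on the way back, `restored` equals `original`, so the layout change alone does not alter the values the software layer multiplies.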

Assignees

Microsoft Technology Licensing, LLC

Classifications

G06N3/044
Recurrent networks, e.g. Hopfield networks · CPC title
G06N3/105
Shells for specifying net layout · CPC title
G06N3/048
Activation functions · CPC title
G06F17/16
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
G06N3/08 · Primary
Learning methods · CPC title

Patent family

Related publications grouped by family.

View patent family 69846568

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11556764B2 cover?: Systems and methods for deriving a concordant software neural network layer are provided. A method includes receiving first instructions configured to, using a neural network processor (NNP), process a first set of data corresponding to a neural network layer, where the NNP is configured to quantize the first set of the data to generate a set of quantized data and then perform matrix-vector mul…