Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06F17/16. Mapped technology areas include Physics.

When was this patent published?

Publication date: Nov 27, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Hardware node with matrix-vector multiply tiles for neural network processing

US10140252B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10140252-B2
Application number	US-201715637608-A
Country	US
Kind code	B2
Filing date	Jun 29, 2017
Priority date	Feb 28, 2017
Publication date	Nov 27, 2018
Grant date	Nov 27, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Hardware and methods for neural network processing are provided. A method in a system comprising a plurality of nodes, where each node comprises a plurality of tiles, is provided. The method includes receiving an N by M matrix of coefficients configured to control a neural network model. The method includes storing a first row and a second row of the N by M matrix of coefficients in a first and a second on-chip memories incorporated within a first and a second of the plurality of tiles. The method includes processing the first row of the coefficients and a first set of input vectors using a first compute unit incorporated within the first of the plurality of tiles. The method includes processing the second row of the coefficients and a second set of input vectors using a second compute unit incorporated within the second of the plurality of tiles.
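The row-per-tile arrangement described in the abstract can be illustrated with a small software sketch. This is not the patented hardware. The `Tile` class, the `evaluate_rows` function, and the choice of a plain dot product as the "processing" step are all assumptions made for illustration; the abstract only states that each tile stores one row in its own on-chip memory and processes it with its own set of input vectors.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tile:
    """Hypothetical tile: a local memory holding one row, plus a compute unit."""
    row: List[float] = field(default_factory=list)  # stands in for on-chip memory

    def store(self, row: List[float]) -> None:
        # Keep a private copy, as a tile-local memory would.
        self.row = list(row)

    def process(self, vectors: List[List[float]]) -> List[float]:
        # Assumed processing step: dot product of the stored row with each vector.
        return [sum(w * x for w, x in zip(self.row, v)) for v in vectors]


def evaluate_rows(matrix: List[List[float]],
                  vector_sets: List[List[List[float]]]) -> List[List[float]]:
    """Store row i in tile i, then let tile i process vector set i."""
    if len(matrix) != len(vector_sets):
        raise ValueError("need one set of input vectors per matrix row")
    tiles = [Tile() for _ in matrix]
    for tile, row in zip(tiles, matrix):
        tile.store(row)
    return [tile.process(vs) for tile, vs in zip(tiles, vector_sets)]
```

In this sketch the tiles are evaluated one after another in a loop; in hardware each tile's compute unit would presumably work on its own row concurrently, which is the point of keeping the coefficients in tile-local memory.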

First claim

Opening claim text (preview).

What is claimed:

1. A method for evaluating a neural network model in a system comprising a plurality of nodes interconnected via a network, wherein each node comprises a plurality of tiles, the method comprising: receiving an N by M matrix of coefficients via an ingress tree, wherein the N by M matrix of coefficients is configured to control the neural network model, wherein N is an integer equal to or greater than 8 and M is an integer equal to or greater than 8; storing a first row of the N by M matrix of coefficients in a first on-chip memory incorporated within a first of the plurality of tiles and storing a second row of the N by M matrix of coefficients in a second on-chip memory incorporated within a second of the plurality of tiles; processing the first row of the N by M matrix of coefficients and a first set of input vectors, received via the ingress tree, using a first compute unit incorporated within the first of the plurality of tiles; and processing the second row of the N by M matrix of coefficients and a second set of input vectors, received via the ingress tree, using a second compute unit incorporated within the second of the plurality of tiles.

2. The method of claim 1, wherein the processing the first row further comprises performing a first point-wise dot product operation on the first row of the N by M matrix of coefficients and the first set of input vectors.

3. The method of claim 2 further comprising outputting a first set of output values generated by the first point-wise dot product operation via an egress tree coupled to each one of the plurality of tiles.

4. The method of claim 1, wherein the processing the second row further comprises performing a second point-wise dot product operation on the second row of the N by M matrix of coefficients and the second set of input vectors.

5. The method of claim 4 further comprising outputting a second set of output values generated by the second point-wise dot product operation via an egress tree coupled to each one of the plurality of tiles.

6. The method of claim 1, wherein the N by M matrix of coefficients comprises weights corresponding to the neural network model.

7. The method of claim 1, wherein each of the first set of input vectors and the second set of input vectors comprises runtime values of input vectors and past values of input vectors.

8. A hardware node including a plurality of tiles, the hardware node comprising: an ingress tree configured to receive an N by M matrix of coefficients, wherein the N by M matrix of coefficients is configured to control a neural network model, wherein N is an integer equal to or greater than 8 and M is an integer equal to or greater than 8; a first on-chip memory incorporated within a first of the plurality of tiles configured to store a first row of the N by M matrix of coefficients; a second on-chip memory incorporated within a second of the plurality of tiles configured to store a second row of the N by M matrix of coefficients; a first compute unit incorporated within the first of the plurality of tiles configured to process the first row of N by M matrix of coefficients and a first set of input vectors received via the ingress tree; and a second compute unit incorporated within the second of the plurality of tiles configured to process the second row of the N by M matrix of coefficients and a second set of input vectors received via the ingress tree.

9. The hardware node of claim 8, wherein the first compute unit is further configured to perform a first point-wise dot product operation on the first row of the N by M matrix of coefficients and the first set of input vectors.

10. The hardware node of claim 9 further comprising an egress tree coupled to each one of the plurality of trees and further configured to output a first set of output values generated by the first point-wise dot product operation.

11. The hardware node of claim 8, wherein the second compute unit is further configured to perform a second point-wise dot product operation on the second row of the N by M matrix of coefficients and the second set of input vectors.

12. The hardware node of claim 11 further comprising an egress tree coupled to each one of the plurality of trees and further configured to output a second set of output values generated by the second point-wise dot product operation.

13. The hardware node of claim 8, wherein the N by M matrix of coefficients comprises weights corresponding to the neural network model.

14. The hardware node of claim 8, wherein each of the first set of input vectors and the second set of input vectors comprises both runtime values of input vectors and past values of input vectors.

15. A hardware node including a plurality of tiles, the hardware node comprising: an ingress tree configured to receive an N by M matrix of coefficients, wherein the N by M matrix of coefficients is configured to control a neural network model, wherein N is an integer equal to or greater than 8 and M is an integer equal to or greater than 8, and wherein the ingress tree comprises a first ingress tree register that fans out to a second ingress tree register and a third ingress tree register; a first on-chip memory incorporated within a first of the plurality of tiles configured to store a first row of the N by M matrix of coefficients; a second on-chip memory incorporated within a second of the plurality of tiles configured to store a second row of the N by M matrix of coefficients; a first compute unit incorporated within the first of the plurality of tiles configured to process the first row of N by M matrix of coefficients and a first set of input vectors received via the ingress tree; and a second compute unit incorporated within the second of the plurality of tiles configured to process the second row of the N by M matrix of coefficients and a second set of input vectors received via the ingress tree.

16. The hardware node of claim 15, wherein the first compute unit is further configured to perform a first point-wise dot product operation on the first row of the N by M matrix of coefficients and the first set of input vectors.

17. The hardware node of claim 15, wherein the second compute unit is further configured to perform a second point-wise dot product operation on the second row of the N by M matrix of coefficients and the second set of input vectors.

18. The hardware node of claim 17 further comprising an egress tree coupled to each one of the plurality of trees and further configured to output a first set of output values generated by the first point-wise dot product operation.

19. The hardware node of claim 17 further comprising an egress tree coupled to each one of the plurality of trees and further configured to output a second set of output values generated by the second point-wise dot product operation.

20. The hardware node of claim 15, wherein the N by M matrix of coefficients comprises weights corresponding to the neural network model.
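Claim 15 recites an ingress tree in which a first register fans out to a second and a third register. A minimal software sketch of that fan-out follows. The `Register` class, its `load` method, and the `leaves` helper are hypothetical names introduced here, and the sketch propagates a value immediately rather than modelling clocked pipeline stages, which the claim text does not describe.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Register:
    """Hypothetical ingress-tree register that forwards data downstream."""
    name: str
    children: List["Register"] = field(default_factory=list)
    value: Any = None

    def load(self, value: Any) -> None:
        # Latch the value, then fan it out to every downstream register.
        self.value = value
        for child in self.children:
            child.load(value)


def leaves(reg: Register) -> List[Register]:
    """Return the registers at the bottom of the tree (those feeding tiles)."""
    if not reg.children:
        return [reg]
    found: List[Register] = []
    for child in reg.children:
        found.extend(leaves(child))
    return found


# The three-register shape named in claim 15: first fans out to second and third.
second = Register("second")
third = Register("third")
first = Register("first", [second, third])
```

Calling `first.load(row)` leaves the same data in `second.value` and `third.value`, which mirrors how a single input could reach several tiles through the tree.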

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

G06N3/045
Combinations of networks · CPC title
G06N3/044
Recurrent networks, e.g. Hopfield networks · CPC title
G06N3/048
Activation functions · CPC title
G06F9/3867
using instruction pipelines · CPC title
G06N3/08
Learning methods · CPC title

Patent family

Related publications grouped by family.

View patent family 63245365

External sources



Related patents

In memory matrix multiplication and its usage in neural networks

Neural network compute tile

Deep neural network processing on hardware accelerators with stacked memory

Convolutional neural networks on hardware accelerators

Method for performing random read access to a block of data using parallel lut read instruction in vector processors

Convolutional, long short-term memory, fully connected deep neural networks
