What technology area does this patent fall under?

Primary CPC classification G06N3/045. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jun 15, 2021 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Apparatuses, methods, and systems for memory interface circuit arbitration in a configurable spatial accelerator

US11037050B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11037050-B2
Application number	US-201916458020-A
Country	US
Kind code	B2
Filing date	Jun 29, 2019
Priority date	Jun 29, 2019
Publication date	Jun 15, 2021
Grant date	Jun 15, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and apparatuses relating to arbitration among a plurality of memory interface circuits in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator (CSA) includes a plurality of processing elements; a plurality of request address file (RAF) circuits, and a circuit switched interconnect network between the plurality of processing elements and the RAF circuits. As a dataflow architecture, embodiments of CSA have a unique memory architecture where memory accesses are decoupled into an explicit request and response phase allowing pipelining through memory. Certain embodiments herein provide for improved memory sub-system design via arbitration and the improvements to arbitration discussed herein.
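To illustrate what "decoupled into an explicit request and response phase allowing pipelining through memory" can mean in practice, here is a minimal Python sketch. It is not taken from the patent: the class name `DecoupledMemory`, the fixed `latency`, and the cycle-based `tick` model are all assumptions made for illustration. The point is only that a requester can issue a new request every cycle without waiting for earlier responses.

```python
from collections import deque


class DecoupledMemory:
    """Toy memory model (hypothetical) with separate request and response phases."""

    def __init__(self, data, latency=3):
        self.data = data            # backing store, indexed by address
        self.latency = latency      # assumed fixed latency in cycles
        self.in_flight = deque()    # (ready_cycle, value) pairs, in issue order
        self.cycle = 0

    def request(self, address):
        # Request phase: record the access and return immediately.
        self.in_flight.append((self.cycle + self.latency, self.data[address]))

    def tick(self):
        # Advance one cycle; response phase: return every value that is now ready.
        self.cycle += 1
        ready = []
        while self.in_flight and self.in_flight[0][0] <= self.cycle:
            ready.append(self.in_flight.popleft()[1])
        return ready


mem = DecoupledMemory([10, 20, 30], latency=3)
responses = []
for address in range(3):
    mem.request(address)            # one new request per cycle
    responses.append(mem.tick())
for _ in range(3):
    responses.append(mem.tick())    # drain the remaining responses
```

In this toy model the three requests overlap in flight, so all responses arrive within five cycles rather than the nine a strictly serialized request-then-wait loop would take at the same latency.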

First claim

Opening claim text (preview).

What is claimed is:
1. An apparatus comprising: a spatial array of processing elements; a plurality of cache banks each having a plurality of input queues coupled to an input to cache storage; a first plurality of memory interface circuits and a second plurality of memory interface circuits each having an input queue to store data for memory requests from the spatial array of processing elements; a first arbitrator circuit coupled to the input queues of the first plurality of memory interface circuits and to a first input queue of the plurality of input queues of each of the plurality of cache banks, wherein the first arbitrator circuit is to compare a cache bank identification value for a memory request from each of the input queues of the first plurality of memory interface circuits, and issue only one memory request for a plurality of the cache bank identification values that match; and a second arbitrator circuit coupled to the input queues of the second plurality of memory interface circuits and to a second input queue of the plurality of input queues of each of the plurality of cache banks, wherein the second arbitrator circuit is to compare a cache bank identification value for a memory request from each of the input queues of the second plurality of memory interface circuits, and issue only one memory request for a plurality of the cache bank identification values that match.
2. The apparatus of claim 1, wherein the first arbitrator circuit is to issue only one memory request for a first plurality of the cache bank identification values that match, and concurrently issue only one memory request for a second, different plurality of the cache bank identification values that match.
3. The apparatus of claim 2, wherein the first arbitrator circuit issues the only one memory request for the first plurality of the cache bank identification values that match according to a first arbitration policy, and concurrently issues the only one memory request for the second, different plurality of the cache bank identification values that match according to a second, different arbitration policy.
4. The apparatus of claim 3, wherein the first arbitration policy is a round robin arbitration policy, and the second, different arbitration policy is a find first arbitration policy.
5. The apparatus of claim 1, wherein the first arbitrator circuit comprises a plurality of comparator circuits to compare the cache bank identification value in parallel.
6. The apparatus of claim 1, wherein the first arbitrator circuit issuing the one memory request for the plurality of the cache bank identification values that match causes a dependency token to be output for that one memory request.
7. The apparatus of claim 1, wherein the plurality of cache banks comprises an age tracker to ensure memory requests are serviced in order arriving at the first arbitrator circuit and the second arbitrator circuit.
8. The apparatus of claim 1, further comprising a tile manager circuit coupled to the first plurality of memory interface circuits and the second plurality of memory interface circuits, and a third arbitrator circuit to arbitrate tile manager communications between the tile manager circuit and the first plurality of memory interface circuits and the second plurality of memory interface circuits.
9. A method comprising: sending data for memory requests from a spatial array of processing elements to input queues of a first plurality of memory interface circuits and a second plurality of memory interface circuits; comparing, by a first arbitrator circuit coupled to a plurality of cache banks, a cache bank identification value for a memory request from each of the input queues of the first plurality of memory interface circuits; issuing, by the first arbitrator circuit, only one memory request to a cache bank for a plurality of the cache bank identification values that match; comparing, by a second arbitrator circuit coupled to the plurality of cache banks, a cache bank identification value for a memory request from each of the input queues of the second plurality of memory interface circuits; and issuing, by the second arbitrator circuit, only one memory request to a cache bank for a plurality of the cache bank identification values that match.
10. The method of claim 9, wherein the issuing, by the first arbitrator circuit, comprises issuing only one memory request for a first plurality of the cache bank identification values that match, and concurrently issuing only one memory request for a second, different plurality of the cache bank identification values that match.
11. The method of claim 10, wherein the issuing, by the first arbitrator circuit, comprises issuing the only one memory request for the first plurality of the cache bank identification values that match according to a first arbitration policy, and concurrently issuing the only one memory request for the second, different plurality of the cache bank identification values that match according to a second, different arbitration policy.
12. The method of claim 11, wherein the first arbitration policy is a round robin arbitration policy, and the second, different arbitration policy is a find first arbitration policy.
13. The method of claim 9, wherein the comparing, by the first arbitrator circuit, comprises performing a plurality of comparisons in parallel with a plurality of comparator circuits of the first arbitrator circuit.
14. The method of claim 9, further comprising outputting a dependency token for that one memory request when the first arbitrator circuit issues the one memory request for the plurality of the cache bank identification values that match.
15. The method of claim 9, further comprising ensuring, by an age tracker of the plurality of cache banks, that memory requests are serviced in order arriving at the first arbitrator circuit and the second arbitrator circuit.
16. The method of claim 9, further comprising coupling a tile manager circuit to the first plurality of memory interface circuits and the second plurality of memory interface circuits; and arbitrating tile manager communications between the tile manager circuit and the first plurality of memory interface circuits and the second plurality of memory interface circuits with a third arbitrator circuit.
17. A non-transitory machine readable medium that stores code that when executed by a machine causes the machine to perform a method comprising: sending data for memory requests from a spatial array of processing elements to input queues of a first plurality of memory interface circuits and a second plurality of memory interface circuits; comparing, by a first arbitrator circuit coupled to a plurality of cache banks, a cache bank identification value for a memory request from each of the input queues of the first plurality of memory interface circuits; issuing, by the first arbitrator circuit, only one memory request to a cache bank for a plurality of the cache bank identification values that match; comparing, by a second arbitrator circuit coupled to the plurality of cache banks, a cache bank identification value for a memory request from each of the input queues of the second plurality of memory interface circuits; and issuing, by the second arbitrator circuit, only one memory request to a cache bank for a plurality of the cache bank identification values that match.
18. The non-transitory machine readable medium of claim 17, wherein the issuing, by the first arbitrator circuit, comprises issuing only one memory request for a first pl
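The arbitration step recited in claims 1 to 4 (compare the cache bank identification values at the heads of the input queues, then issue only one request per group of matching values) can be sketched in software as follows. This is an illustrative model, not the patented circuit: the function name `arbitrate`, the `policy_for_bank` mapping, and the `rr_pointer` state are hypothetical, and choosing the policy per bank ID is one assumed reading of "a first arbitration policy" and "a second, different arbitration policy".

```python
from collections import defaultdict


def arbitrate(heads, policy_for_bank, rr_pointer):
    """Pick at most one winning input queue per cache bank.

    heads[i] is the cache bank ID requested at the head of input queue i,
    or None if that queue is empty. policy_for_bank maps a bank ID to
    "round_robin" or "find_first" (default). rr_pointer holds per-bank
    round-robin state and is updated in place. All names are hypothetical.
    """
    # Group queue indices by the bank ID they request (the "compare" step).
    contenders = defaultdict(list)
    for queue_index, bank_id in enumerate(heads):
        if bank_id is not None:
            contenders[bank_id].append(queue_index)

    winners = {}
    for bank_id, queues in contenders.items():
        if policy_for_bank.get(bank_id, "find_first") == "round_robin":
            start = rr_pointer.get(bank_id, 0)
            # First contender at or after the pointer, wrapping around.
            winner = min(queues, key=lambda q: (q - start) % len(heads))
            rr_pointer[bank_id] = (winner + 1) % len(heads)
        else:
            # Find first: the lowest-numbered contending queue always wins.
            winner = queues[0]
        winners[bank_id] = winner  # only one request issued per bank
    return winners


rr_state = {}
policies = {2: "round_robin"}          # bank 0 falls back to find_first
first = arbitrate([2, 0, 2, 2], policies, rr_state)
second = arbitrate([2, 0, 2, 2], policies, rr_state)
```

With three queues contending for bank 2, the round-robin pointer rotates the winner from one call to the next, while bank 0 (find first) is served in the same call. That loosely mirrors the concurrent issue for different groups of matching values described in claim 2.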

Assignees

Intel Corp

Inventors

Classifications

G06N3/045 (Primary)
Combinations of networks · CPC title
G06F12/0207
with multidimensional access, e.g. row/column, matrix · CPC title
H04L49/90
Buffering arrangements · CPC title
G06F12/0895
of parts of caches, e.g. directory or tag array · CPC title
G06F12/0875
with dedicated cache, e.g. instruction or stack · CPC title

Patent family

Related publications grouped by family.

View patent family 69846250

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11037050B2 cover?: Systems, methods, and apparatuses relating to arbitration among a plurality of memory interface circuits in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator (CSA) includes a plurality of processing elements; a plurality of request address file (RAF) circuits, and a circuit switched interconnect network between the plurality of processing el…
Who is the assignee on this patent?: Intel Corp