Processors, methods, and systems for a memory fence in a configurable spatial accelerator

US10496574B2 · US · B2

Patent metadata
Publication number: US-10496574-B2
Application number: US-201715719285-A
Country: US
Kind code: B2
Filing date: Sep 28, 2017
Priority date: Sep 28, 2017
Publication date: Dec 3, 2019
Grant date: Dec 3, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and apparatuses relating to a memory fence mechanism in a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform a plurality of operations, each by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements. The processor also includes a fence manager to manage a memory fence between a first operation and a second operation of the plurality of operations.
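The firing rule in the abstract, where each operation is performed when its incoming operand set arrives at a dataflow operator, can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `DataflowOperator`, the function `run`, and the FIFO-per-input representation are hypothetical names chosen for illustration and are not taken from the patent, which describes hardware processing elements rather than Python objects.

```python
from collections import deque


class DataflowOperator:
    """Hypothetical model of one graph node mapped onto a processing element."""

    def __init__(self, name, func, num_inputs):
        self.name = name
        self.func = func
        # One FIFO per input operand; the operator may fire only when
        # every FIFO holds at least one value (a complete operand set).
        self.inputs = [deque() for _ in range(num_inputs)]
        # (consumer operator, input port) pairs fed by this operator's result.
        self.consumers = []

    def connect(self, consumer, port):
        self.consumers.append((consumer, port))

    def ready(self):
        # An empty deque is falsy, so this is True only for a full operand set.
        return all(self.inputs)

    def fire(self):
        operands = [queue.popleft() for queue in self.inputs]
        result = self.func(*operands)
        for consumer, port in self.consumers:
            consumer.inputs[port].append(result)
        return result


def run(operators):
    """Fire ready operators until none is ready; return the firing log."""
    log = []
    progress = True
    while progress:
        progress = False
        for op in operators:
            if op.ready():
                log.append((op.name, op.fire()))
                progress = True
    return log


# Two-node graph computing (a + b) * c.
add = DataflowOperator("add", lambda x, y: x + y, 2)
mul = DataflowOperator("mul", lambda x, y: x * y, 2)
add.connect(mul, 0)

mul.inputs[1].append(4)  # c arrives first, but mul still lacks its other operand
add.inputs[0].append(2)  # a
add.inputs[1].append(3)  # b
firing_log = run([mul, add])
```

Although `mul` is listed first and already holds one operand, it does not fire until `add` has produced the missing value. Execution order follows data arrival rather than program order, which is the reason a dataflow machine needs an explicit mechanism such as a memory fence to order memory operations.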

First claim

Opening claim text (preview).

What is claimed is:

1. A processor comprising:
    a plurality of processing elements;
    an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as one of a plurality of dataflow operators in the plurality of processing elements, and the plurality of processing elements are to perform a plurality of operations, each by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements;
    a fence manager to manage a memory fence between a first operation and a second operation of the plurality of operations;
    a plurality of request address file (RAF) circuits including a first RAF circuit to request the memory fence by sending a fence-request message to the fence manager;
    a plurality of cache banks; and
    an accelerator cache interconnect (ACI) to connect the plurality of RAF circuits to the plurality of cache banks;
    wherein, in response to the fence-request message, the fence manager is to send a first fence-open message to the plurality of RAF circuits to open the fence operation;
    in response to the first fence-open message, each of the plurality of RAF circuits is to complete outstanding memory operations and send a first fence-acknowledge message to the fence manager;
    in response to the first fence-acknowledge message from each of the plurality of RAF circuits, the fence manager is to send a second fence-open message to each of the plurality of cache banks;
    in response to the second fence-open message, each of the plurality of cache banks is to complete outstanding memory operations and send a second fence-acknowledge message to the fence manager; and
    in response to the second fence-acknowledge message from each of the plurality of cache banks, the fence manager is to send a fence-close message to each of a plurality of RAF circuits.

2. The processor of claim 1, wherein the ACI is to carry the fence-open message from the fence manager to the plurality of RAF circuits and the plurality of cache banks.

3. The processor of claim 1, wherein the fence manager includes a state machine.

4. The processor of claim 1, wherein the fence manager is also to buffer fence requests.

5. A system comprising:
    a system memory; and
    a processor including:
    a plurality of processing elements;
    an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as one of a plurality of dataflow operators in the plurality of processing elements, and the plurality of processing elements are to perform a plurality of operations, each by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements;
    a fence manager to manage a memory fence between a first operation and a second operation of the plurality of operations, wherein the first operation and the second operation are to access the system memory;
    a plurality of request address file (RAF) circuits including a first RAF circuit to request the memory fence by sending a fence-request message to the fence manager;
    a plurality of cache banks; and
    an accelerator cache interconnect (ACI) to connect the plurality of RAF circuits to the plurality of cache banks;
    wherein, in response to the fence-request message, the fence manager is to send a first fence-open message to the plurality of RAF circuits to open the fence operation;
    in response to the first fence-open message, each of the plurality of RAF circuits is to complete outstanding memory operations and send a first fence-acknowledge message to the fence manager;
    in response to the first fence-acknowledge message from each of the plurality of RAF circuits, the fence manager is to send a second fence-open message to each of the plurality of cache banks;
    in response to the second fence-open message, each of the plurality of cache banks is to retire outstanding memory operations to the system memory and send a second fence-acknowledge message to the fence manager; and
    in response to the second fence-acknowledge message from each of the plurality of cache banks, the fence manager is to send a fence-close message to each of a plurality of RAF circuits.
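The message sequence recited in claim 1 (fence-request, first fence-open to the RAF circuits, first fence-acknowledge, second fence-open to the cache banks, second fence-acknowledge, fence-close) can be traced with a small software model. The following is a minimal sketch, not the patented hardware: the classes `CacheBank`, `Raf`, and `FenceManager`, the `State` names, the synchronous message delivery, and the `outstanding_ops` counter are all assumptions made for illustration. The request queue loosely mirrors the buffering of claim 4 and the `state` field the state machine of claim 3.

```python
from collections import deque
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    WAIT_RAF_ACKS = auto()
    WAIT_BANK_ACKS = auto()


class CacheBank:
    """Hypothetical cache bank: drains its work, then acknowledges."""

    def __init__(self, name, outstanding_ops=0):
        self.name = name
        self.outstanding_ops = outstanding_ops

    def handle_fence_open(self):
        # Model "complete outstanding memory operations" by zeroing a counter.
        self.outstanding_ops = 0
        return self.name  # stands in for a fence-acknowledge message


class Raf(CacheBank):
    """Hypothetical RAF circuit: also tracks whether a fence is open."""

    def __init__(self, name, outstanding_ops=0):
        super().__init__(name, outstanding_ops)
        self.fenced = False

    def handle_fence_open(self):
        self.fenced = True
        return super().handle_fence_open()

    def handle_fence_close(self):
        self.fenced = False


class FenceManager:
    def __init__(self, rafs, banks):
        self.rafs = rafs
        self.banks = banks
        self.state = State.IDLE
        self.requests = deque()  # buffered fence requests
        self.trace = []          # ordered log of protocol steps

    def fence_request(self, raf):
        self.requests.append(raf.name)

    def service_next(self):
        """Run one complete fence; return False if no request is buffered."""
        if not self.requests:
            return False
        self.trace.append(f"fence-request from {self.requests.popleft()}")

        self.state = State.WAIT_RAF_ACKS
        self.trace.append("first fence-open to RAFs")
        acks = {raf.handle_fence_open() for raf in self.rafs}
        assert acks == {raf.name for raf in self.rafs}  # every RAF answered
        self.trace.append("first fence-acknowledge from all RAFs")

        self.state = State.WAIT_BANK_ACKS
        self.trace.append("second fence-open to cache banks")
        acks = {bank.handle_fence_open() for bank in self.banks}
        assert acks == {bank.name for bank in self.banks}  # every bank answered
        self.trace.append("second fence-acknowledge from all cache banks")

        for raf in self.rafs:
            raf.handle_fence_close()
        self.trace.append("fence-close to RAFs")
        self.state = State.IDLE
        return True


rafs = [Raf(f"raf{i}", outstanding_ops=i + 1) for i in range(3)]
banks = [CacheBank(f"bank{i}", outstanding_ops=2) for i in range(2)]
manager = FenceManager(rafs, banks)
manager.fence_request(rafs[0])
manager.service_next()
```

The two-phase ordering matters in this model: the cache banks are asked to drain only after every RAF circuit has acknowledged, so no request still in flight from a RAF can reach a bank after that bank has reported itself drained.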

Assignees

Intel Corp

Inventors

Classifications

  • with a network or matrix configuration · CPC title

  • of parts of caches, e.g. directory or tag array · CPC title

  • G06F13/28 · Primary

    using burst mode transfer, e.g. direct memory access {DMA}, cycle steal (G06F13/32 takes precedence) · CPC title

  • Plural cache memories · CPC title

  • Coherency control relating to peripheral accessing, e.g. from DMA or I/O device · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10496574B2 cover?
Systems, methods, and apparatuses relating to a memory fence mechanism in a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into th…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F13/28. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 3, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).