US10496574B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10496574-B2 |
| Application number | US-201715719285-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 28, 2017 |
| Priority date | Sep 28, 2017 |
| Publication date | Dec 3, 2019 |
| Grant date | Dec 3, 2019 |
Official abstract text for this publication.
Systems, methods, and apparatuses relating to a memory fence mechanism in a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform a plurality of operations, each by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements. The processor also includes a fence manager to manage a memory fence between a first operation and a second operation of the plurality of operations.
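The abstract's firing rule, in which each operation is performed when its incoming operand set arrives at a dataflow operator, can be illustrated with a small software model. This is only a sketch under stated assumptions: the `DataflowOperator` class, its input queues, and the `run` scheduler loop are hypothetical names invented for illustration, and the patent describes hardware processing elements and an interconnect network, not Python objects.

```python
from collections import deque


class DataflowOperator:
    """Hypothetical node: fires once every input queue holds an operand."""

    def __init__(self, name, fn, n_inputs):
        self.name = name
        self.fn = fn
        self.inputs = [deque() for _ in range(n_inputs)]
        self.consumers = []  # (operator, input port) pairs fed by this node

    def connect(self, consumer, port):
        self.consumers.append((consumer, port))

    def ready(self):
        # An empty deque is falsy, so this is True only when the whole
        # operand set has arrived.
        return all(self.inputs)

    def fire(self):
        operands = [q.popleft() for q in self.inputs]
        result = self.fn(*operands)
        for consumer, port in self.consumers:
            consumer.inputs[port].append(result)
        return result


def run(operators):
    """Repeatedly fire any ready operator until nothing can make progress."""
    trace = []
    progress = True
    while progress:
        progress = False
        for op in operators:
            if op.ready():
                trace.append((op.name, op.fire()))
                progress = True
    return trace


def demo():
    add = DataflowOperator("add", lambda a, b: a + b, 2)
    mul = DataflowOperator("mul", lambda a, b: a * b, 2)
    add.connect(mul, 0)          # the result of add feeds port 0 of mul
    add.inputs[0].append(2)
    add.inputs[1].append(3)
    mul.inputs[1].append(4)
    # mul is listed first, but it cannot fire until add has produced a value.
    return run([mul, add])


if __name__ == "__main__":
    print(demo())
```

Execution order here is set by operand availability rather than by the order in which the operators are listed, which is the property that makes an explicit memory fence necessary when two operations must be ordered through memory.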
Opening claim text (preview).
What is claimed is:

1. A processor comprising:
a plurality of processing elements;
an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as one of a plurality of dataflow operators in the plurality of processing elements, and the plurality of processing elements are to perform a plurality of operations, each by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements;
a fence manager to manage a memory fence between a first operation and a second operation of the plurality of operations;
a plurality of request address file (RAF) circuits including a first RAF circuit to request the memory fence by sending a fence-request message to the fence manager;
a plurality of cache banks; and
an accelerator cache interconnect (ACI) to connect the plurality of RAF circuits to the plurality of cache banks;
wherein, in response to the fence-request message, the fence manager is to send a first fence-open message to the plurality of RAF circuits to open the fence operation;
in response to the first fence-open message, each of the plurality of RAF circuits is to complete outstanding memory operations and send a first fence-acknowledge message to the fence manager;
in response to the first fence-acknowledge message from each of the plurality of RAF circuits, the fence manager is to send a second fence-open message to each of the plurality of cache banks;
in response to the second fence-open message, each of the plurality of cache banks is to complete outstanding memory operations and send a second fence-acknowledge message to the fence manager; and
in response to the second fence-acknowledge message from each of the plurality of cache banks, the fence manager is to send a fence-close message to each of a plurality of RAF circuits.

2. The processor of claim 1, wherein the ACI is to carry the fence-open message from the fence manager to the plurality of RAF circuits and the plurality of cache banks.

3. The processor of claim 1, wherein the fence manager includes a state machine.

4. The processor of claim 1, wherein the fence manager is also to buffer fence requests.

5. A system comprising:
a system memory; and
a processor including:
a plurality of processing elements;
an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as one of a plurality of dataflow operators in the plurality of processing elements, and the plurality of processing elements are to perform a plurality of operations, each by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements;
a fence manager to manage a memory fence between a first operation and a second operation of the plurality of operations, wherein the first operation and the second operation are to access the system memory;
a plurality of request address file (RAF) circuits including a first RAF circuit to request the memory fence by sending a fence-request message to the fence manager;
a plurality of cache banks; and
an accelerator cache interconnect (ACI) to connect the plurality of RAF circuits to the plurality of cache banks;
wherein, in response to the fence-request message, the fence manager is to send a first fence-open message to the plurality of RAF circuits to open the fence operation;
in response to the first fence-open message, each of the plurality of RAF circuits is to complete outstanding memory operations and send a first fence-acknowledge message to the fence manager;
in response to the first fence-acknowledge message from each of the plurality of RAF circuits, the fence manager is to send a second fence-open message to each of the plurality of cache banks;
in response to the second fence-open message, each of the plurality of cache banks is to retire outstanding memory operations to the system memory and send a second fence-acknowledge message to the fence manager; and
in response to the second fence-acknowledge message from each of the plurality of cache banks, the fence manager is to send a fence-close message to each of a plurality of RAF circuits.
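The message sequence recited in claim 1, together with the state machine of claim 3 and the request buffering of claim 4, can be traced with a small simulation. This is a minimal sketch under several assumptions: the class names, the state names, and the message labels `fence-open-1`, `fence-ack-1`, `fence-open-2`, and `fence-ack-2` are invented here to stand for the first and second fence-open and fence-acknowledge messages; draining outstanding memory operations is modeled as an instantaneous reset of a counter; and the ACI transport is not modeled at all.

```python
from collections import deque
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    WAIT_RAF_ACKS = auto()
    WAIT_BANK_ACKS = auto()


class Unit:
    """Hypothetical stand-in for a RAF circuit or a cache bank."""

    def __init__(self, name, outstanding=0):
        self.name = name
        self.outstanding = outstanding  # in-flight memory operations

    def drain_and_ack(self):
        self.outstanding = 0  # assumption: completion is instantaneous here
        return self.name      # the returned name plays the role of the ack


class FenceManager:
    def __init__(self, rafs, banks):
        self.rafs, self.banks = rafs, banks
        self.state = State.IDLE
        self.pending = deque()  # buffered fence requests (claim 4)
        self.acks = set()
        self.log = []

    def request(self, raf_name):
        self.log.append(("fence-request", raf_name))
        self.pending.append(raf_name)

    def _open(self, units, open_msg, ack_msg):
        self.acks = set()
        for unit in units:
            self.log.append((open_msg, unit.name))
            self.acks.add(unit.drain_and_ack())
            self.log.append((ack_msg, unit.name))

    def step(self):
        """Advance one transition; return False when there is nothing to do."""
        if self.state is State.IDLE:
            if not self.pending:
                return False
            self.pending.popleft()
            self._open(self.rafs, "fence-open-1", "fence-ack-1")
            self.state = State.WAIT_RAF_ACKS
        elif self.state is State.WAIT_RAF_ACKS:
            # Proceed only once every RAF circuit has acknowledged.
            if self.acks == {r.name for r in self.rafs}:
                self._open(self.banks, "fence-open-2", "fence-ack-2")
                self.state = State.WAIT_BANK_ACKS
        else:
            # Proceed only once every cache bank has acknowledged.
            if self.acks == {b.name for b in self.banks}:
                for raf in self.rafs:
                    self.log.append(("fence-close", raf.name))
                self.state = State.IDLE
        return True


def demo():
    rafs = [Unit("raf0", outstanding=3), Unit("raf1", outstanding=1)]
    banks = [Unit("bank0", outstanding=2)]
    mgr = FenceManager(rafs, banks)
    mgr.request("raf0")
    while mgr.step():
        pass
    return mgr, rafs, banks


if __name__ == "__main__":
    for entry in demo()[0].log:
        print(entry)
```

Splitting the fence into a RAF phase followed by a cache-bank phase mirrors the claim language: the cache banks are only asked to complete their outstanding operations after every RAF circuit has acknowledged, and the fence-close message is only sent after every cache bank has acknowledged.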
with a network or matrix configuration · CPC title
of parts of caches, e.g. directory or tag array · CPC title
using burst mode transfer, e.g. direct memory access {DMA}, cycle steal (G06F13/32 takes precedence) · CPC title
Plural cache memories · CPC title
Coherency control relating to peripheral accessing, e.g. from DMA or I/O device · CPC title