Hardware processors and methods for tightly-coupled heterogeneous computing

US2016378715A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2016378715-A1
Application number: US-201514752047-A
Country: US
Kind code: A1
Filing date: Jun 26, 2015
Priority date: Jun 26, 2015
Publication date: Dec 29, 2016
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatuses relating to tightly-coupled heterogeneous computing are described. In one embodiment, a hardware processor includes a plurality of execution units in parallel, a switch to connect inputs of the plurality of execution units to outputs of a first buffer and a plurality of memory banks and connect inputs of the plurality of memory banks and a plurality of second buffers in parallel to outputs of the first buffer, the plurality of memory banks, and the plurality of execution units, and an offload engine with inputs connected to outputs of the plurality of second buffers.
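The connectivity in the abstract can be read as a routing table for the switch. The following is a minimal Python sketch of that reading, not the patent's implementation: the names `SWITCH_ROUTES`, `can_route`, and `OFFLOAD_ENGINE_SOURCE`, and the string labels for each component kind, are hypothetical and chosen only for illustration.

```python
# Illustrative model of the switch connectivity described in the abstract.
# All identifiers here are assumptions; the publication does not define them.

# For each kind of sink the switch drives, the kinds of source it may connect.
SWITCH_ROUTES = {
    # Execution unit inputs connect to outputs of the first buffer and memory banks.
    "exec_unit": frozenset({"first_buffer", "mem_bank"}),
    # Memory bank inputs and second buffer inputs connect to outputs of the
    # first buffer, the memory banks, and the execution units.
    "mem_bank": frozenset({"first_buffer", "mem_bank", "exec_unit"}),
    "second_buffer": frozenset({"first_buffer", "mem_bank", "exec_unit"}),
}

# The offload engine is fed by the second buffers rather than by the switch.
OFFLOAD_ENGINE_SOURCE = "second_buffer"


def can_route(source, sink):
    """Return True if the modeled switch may connect `source` to `sink`."""
    if sink not in SWITCH_ROUTES:
        raise ValueError(f"the modeled switch does not drive sink kind {sink!r}")
    return source in SWITCH_ROUTES[sink]


if __name__ == "__main__":
    print(can_route("first_buffer", "exec_unit"))   # allowed by the abstract
    print(can_route("exec_unit", "second_buffer"))  # allowed by the abstract
    print(can_route("exec_unit", "exec_unit"))      # not listed in the abstract
```

In this reading, an execution unit output reaches the offload engine only by way of a second buffer, which matches the abstract's statement that the offload engine's inputs are connected to outputs of the second buffers.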

First claim

Opening claim text (preview).

What is claimed is:

1. A hardware processor comprising: a plurality of execution units in parallel; a switch to connect inputs of the plurality of execution units to outputs of a first buffer and a plurality of memory banks and connect inputs of the plurality of memory banks and a plurality of second buffers in parallel to outputs of the first buffer, the plurality of memory banks, and the plurality of execution units; and an offload engine with inputs connected to outputs of the plurality of second buffers.

2. The hardware processor of claim 1, wherein an output of the offload engine connects to an input of the first buffer.

3. The hardware processor of claim 1, further comprising data hazard resolution logic to simultaneously read from the output of the first buffer and write to the inputs of the plurality of second buffers.

4. The hardware processor of claim 3, wherein the data hazard resolution logic is to not insert a stall.

5. The hardware processor of claim 1, wherein the plurality of execution units are to execute at a first clock speed and the offload engine is to execute at a second, slower clock speed.

6. The hardware processor of claim 1, wherein the plurality of execution units each includes a shift register.

7. The hardware processor of claim 1, wherein the first buffer and the plurality of second buffers are first in first out (FIFO) buffers.

8. The hardware processor of claim 1, wherein the plurality of memory banks are four or more memory banks and each memory bank includes an input port and an output port separate from input ports and output ports of the other memory banks.

9. A method comprising: connecting inputs of a plurality of execution units in parallel of a hardware processor to outputs of a first buffer and a plurality of memory banks and connecting inputs of the plurality of memory banks and a plurality of second buffers in parallel to outputs of the first buffer, the plurality of memory banks, and the plurality of execution units with a switch based on a control signal; and providing data to inputs of an offload engine from outputs of the plurality of second buffers.

10. The method of claim 9, further comprising providing data from an output of the offload engine to an input of the first buffer.

11. The method of claim 9, further comprising simultaneously reading from the output of the first buffer and writing to the inputs of the plurality of second buffers.

12. The method of claim 11, further comprising not inserting a stall.

13. The method of claim 9, further comprising the plurality of execution units executing at a first clock speed and the offload engine executing at a second, slower clock speed.

14. The method of claim 9, wherein the plurality of execution units each includes a shift register.

15. The method of claim 9, wherein the plurality of memory banks are four or more memory banks and each memory bank includes an input port and an output port separate from input ports and output ports of the other memory banks.

16. The method of claim 9, wherein the first buffer and the plurality of second buffers are first in first out (FIFO) buffers.

17. A hardware processor comprising: a hardware decoder to decode an instruction; a hardware execution unit to execute the instruction to: connect inputs of a plurality of execution units in parallel of the hardware processor to outputs of a first buffer and a plurality of memory banks and connecting inputs of the plurality of memory banks and a plurality of second buffers in parallel to outputs of the first buffer, the plurality of memory banks, and the plurality of execution units based on a control signal; and provide data to inputs of an offload engine from outputs of the plurality of second buffers.

18. The hardware processor of claim 17, wherein an output of the offload engine connects to an input of the first buffer.

19. The hardware processor of claim 17, wherein the hardware execution unit is to execute the instruction to cause a simultaneous read from the output of the first buffer and write to the inputs of the plurality of second buffers.

20. The hardware processor of claim 19, wherein the hardware execution unit is to execute the instruction without inserting a stall.

21. The hardware processor of claim 17, wherein the plurality of execution units are to execute at a first clock speed and the offload engine is to execute at a second, slower clock speed.

22. The hardware processor of claim 17, wherein the plurality of execution units each includes a shift register.

23. The hardware processor of claim 17, wherein the first buffer and the plurality of second buffers are first in first out (FIFO) buffers.

24. The hardware processor of claim 17, wherein the plurality of memory banks are four or more memory banks and each memory bank includes an input port and an output port separate from input ports and output ports of the other memory banks.
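Claims 5 and 7 together suggest FIFO buffers that sit between a faster domain (the execution units) and a slower one (the offload engine). The following Python sketch is a toy tick-based illustration of that idea under stated assumptions: the function `simulate`, the `ratio` parameter, and the doubling used as a stand-in for execution unit work are all hypothetical, and the data hazard resolution logic of claims 3 and 4 is not modeled.

```python
from collections import deque


def simulate(inputs, ratio=2, ticks=8):
    """Toy model: a fast producer and a slower consumer joined by a FIFO.

    `ratio` is an assumed number of fast ticks per offload engine tick.
    Returns (values consumed by the offload engine, values still queued).
    """
    first_buffer = deque(inputs)   # FIFO feeding the execution units
    second_buffer = deque()        # FIFO feeding the offload engine
    offloaded = []

    for tick in range(ticks):
        # Fast domain: within one tick, read the first buffer and write the
        # result to the second buffer. Doubling is a placeholder operation.
        if first_buffer:
            second_buffer.append(first_buffer.popleft() * 2)

        # Slow domain: the offload engine takes one item every `ratio` ticks.
        if tick % ratio == ratio - 1 and second_buffer:
            offloaded.append(second_buffer.popleft())

    return offloaded, list(second_buffer)


if __name__ == "__main__":
    print(simulate([1, 2, 3, 4], ratio=2, ticks=8))
    print(simulate([1, 2, 3, 4], ratio=2, ticks=4))
```

With four inputs and a ratio of 2, the second buffer briefly holds a backlog while the fast side runs ahead, and the slower side drains it in first in first out order once the inputs are exhausted.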

Assignees

Intel Corp

Inventors

Classifications

  • Details on data memory access · CPC title

  • G06F13/16 · Primary

    for access to memory bus (G06F13/28 takes precedence) · CPC title

  • where the synchronisation uses buffers, e.g. for speed matching between buses · CPC title

  • Dependency mechanisms, e.g. register scoreboarding · CPC title

  • Special arrangements thereof, e.g. mask or switch · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016378715A1 cover?
Methods and apparatuses relating to tightly-coupled heterogeneous computing are described. In one embodiment, a hardware processor includes a plurality of execution units in parallel, a switch to connect inputs of the plurality of execution units to outputs of a first buffer and a plurality of memory banks and connect inputs of the plurality of memory banks and a plurality of second buffers in …
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F15/8061. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 29, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).