Systems, methods, and apparatuses for heterogeneous computing

US11416281B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11416281-B2
Application number: US-201616474978-A
Country: US
Kind code: B2
Filing date: Dec 31, 2016
Priority date: Dec 31, 2016
Publication date: Aug 16, 2022
Grant date: Aug 16, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.

First claim

Opening claim text (preview).

What is claimed is:

  1. A system comprising: a plurality of heterogeneous processing elements; a heterogeneous scheduler circuitry to dispatch instructions for execution on one or more of the plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements, wherein the heterogeneous scheduler circuitry is to support multiple code types including three or more of compiled, intrinsics, assembly, libraries, intermediate, and offload.

  2. The system of claim 1, wherein the plurality of heterogeneous processing elements comprises an in-order processor core, an out-of-order processor core, and a packed data processor core.

  3. The system of claim 2, wherein the plurality of heterogeneous processing elements further comprises an accelerator.

  4. The system of claim 2, wherein for a serial program phase the selected type of processing element is an out-of-order core.

  5. The system of claim 2, wherein for a data parallel program phase the selected type of processing element is a processing core to execute single instruction, multiple data (SIMD) instructions.

  6. The system of claim 1, wherein the heterogeneous scheduler circuitry further comprises: a program phase detector to detect a program phase of the code fragment; wherein the plurality of heterogeneous processing elements includes a first processing element having a first microarchitecture and a second processing element having a second microarchitecture different from the first microarchitecture; wherein the program phase is one of a plurality of program phases, including a first phase and a second phase and the dispatch of instructions is based in part on the detected program phase.

  7. The system of claim 1, wherein the heterogeneous scheduler circuitry further comprises: a selector to select a type of processing element of the plurality of processing elements to execute the received code fragment and schedule the code fragment on a processing element of the selected type of processing elements via dispatch.

  8. The system of claim 7, wherein for a data parallel program phase the selected type of processing element is circuitry to support dense arithmetic primitives.

  9. The system of claim 7, wherein for a data parallel program phase the selected type of processing element is an accelerator.

  10. The system of claim 7, wherein a data parallel program phase comprises data elements that are processed simultaneously using a same control flow.

  11. The system of claim 7, wherein for a thread parallel program phase the selected type of processing element is a scalar processing core.

  12. The system of claim 7, wherein a thread parallel program phase comprises data dependent branches that use unique control flows.

  13. The system of claim 7, wherein the selection of a type of processing element of the plurality of heterogeneous processing elements is transparent to a user.

  14. The system of claim 7, wherein the selection of a type of processing element of the plurality of heterogeneous processing elements is transparent to an operating system.

  15. The system of claim 7, wherein a default selection of a type of processing element of the plurality of heterogeneous processing elements is a latency optimized core.

  16. The system of claim 1, wherein the code fragment is one or more instructions associated with a software thread.

  17. The system of claim 16, wherein for a data parallel program phase the selected type of processing element is a processing core to execute single instruction, multiple data (SIMD) instructions.

  18. The system of claim 1, wherein the heterogeneous scheduler circuitry is to emulate functionality when the selected type of processing element cannot natively handle the code fragment.

  19. The system of claim 1, wherein heterogeneous scheduler circuitry is to emulate functionality when a number of hardware threads available is oversubscribed.

  20. The system of claim 1, wherein the heterogeneous scheduler circuitry is to present a homogeneous multiprocessor programming model to make each thread appear to a programmer as if it is executing on a scalar core.

  21. The system of claim 20, wherein the presented homogeneous multiprocessor programming model is to present an appearance of support for a full instruction set.

  22. The system of claim 1, wherein the plurality of heterogeneous processing elements is to share a memory address space.

  23. The system of claim 1, wherein the heterogeneous scheduler circuitry includes a binary translator.

  24. The system of claim 1, wherein the heterogeneous scheduler circuitry to select a protocol to use on a multi-protocol bus interface for the dispatched instructions.

  25. The system of claim 24, wherein a first protocol supported by a multi-protocol bus interface comprises a memory interface protocol to be used to access a system memory address space.

  26. The system of claim 25, wherein a second protocol supported by the multi-protocol bus interface comprises a cache coherency protocol to maintain coherency between data stored in a local memory of the accelerator and a memory subsystem of a host processor including a host cache hierarchy and a system memory.

  27. The system of claim 26, wherein a third protocol supported by the multi-protocol bus interface comprises a serial link protocol supporting device discovery, register access, configuration, initialization, interrupts, direct memory access, and address translation services.

  28. The system of claim 27, wherein the third protocol comprises the Peripheral Component Interface Express (PCIe) protocol.
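
Illustrative sketch (not part of the patent text). The claims describe a selector that picks a type of processing element according to the detected program phase: an out-of-order core for a serial phase, a SIMD core or an accelerator for a data parallel phase, a scalar core for a thread parallel phase, and a latency optimized core by default. The Python model below is only one way to picture that mapping in software. The claimed scheduler is hardware circuitry, and every name here (Phase, PEType, select_pe_type, prefer_accelerator) is a hypothetical chosen for illustration, not something taken from the patent.

```python
from enum import Enum, auto
from typing import Optional


class Phase(Enum):
    """Program phases named in the claims."""
    SERIAL = auto()
    DATA_PARALLEL = auto()
    THREAD_PARALLEL = auto()


class PEType(Enum):
    """Processing element types named in the claims."""
    OUT_OF_ORDER_CORE = auto()       # serial phase (claim 4)
    SIMD_CORE = auto()               # data parallel phase (claims 5, 17)
    ACCELERATOR = auto()             # data parallel phase, alternative (claim 9)
    SCALAR_CORE = auto()             # thread parallel phase (claim 11)
    LATENCY_OPTIMIZED_CORE = auto()  # default selection (claim 15)


def select_pe_type(phase: Optional[Phase],
                   prefer_accelerator: bool = False) -> PEType:
    """Map a detected program phase to a processing element type.

    `prefer_accelerator` is a hypothetical knob: the claims list both a
    SIMD core and an accelerator as options for a data parallel phase
    but do not say how to choose between them.
    """
    if phase is None:
        # No phase detected: fall back to the default selection.
        return PEType.LATENCY_OPTIMIZED_CORE
    if phase is Phase.SERIAL:
        return PEType.OUT_OF_ORDER_CORE
    if phase is Phase.DATA_PARALLEL:
        return PEType.ACCELERATOR if prefer_accelerator else PEType.SIMD_CORE
    # Only THREAD_PARALLEL remains.
    return PEType.SCALAR_CORE


if __name__ == "__main__":
    for p in (None, Phase.SERIAL, Phase.DATA_PARALLEL, Phase.THREAD_PARALLEL):
        print(p, "->", select_pe_type(p))
```

The sketch leaves out emulation (claims 18 and 19), binary translation (claim 23), and bus protocol selection (claims 24 to 28), which the claims also attribute to the scheduler circuitry.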

Assignees

Intel Corp

Inventors

Classifications

  • Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title

  • Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title

  • controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title

  • the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title

  • Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11416281B2 cover?
Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein …
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/3001. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 16, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).