Enhanced application request based scheduling on heterogeneous elements of information technology infrastructure
US-10042673-B1 · Aug 7, 2018 · US
US11416281B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11416281-B2 |
| Application number | US-201616474978-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 31, 2016 |
| Priority date | Dec 31, 2016 |
| Publication date | Aug 16, 2022 |
| Grant date | Aug 16, 2022 |
Official abstract text for this publication.
Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more of a plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
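The dispatch condition in the abstract (instructions must be native to at least one target element) can be illustrated with a small software model. This is only a sketch under stated assumptions: the patent describes hardware scheduler circuitry, and the names `ProcessingElement`, `CodeFragment`, `dispatch`, and the instruction-set labels below are hypothetical and do not come from the publication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingElement:
    """Hypothetical model of one heterogeneous processing element."""
    name: str
    native_isas: frozenset  # instruction sets this element executes natively


@dataclass(frozen=True)
class CodeFragment:
    """Hypothetical model of a code fragment awaiting dispatch."""
    label: str
    isa: str  # instruction set the fragment's instructions are encoded in


def dispatch(fragment, elements):
    """Return the elements for which the fragment's instructions are native.

    An empty list means no element can run the fragment natively.
    """
    return [pe for pe in elements if fragment.isa in pe.native_isas]


# Illustrative element pool; the ISA labels are assumptions, not patent terms.
ELEMENTS = [
    ProcessingElement("out_of_order_core", frozenset({"scalar"})),
    ProcessingElement("simd_core", frozenset({"scalar", "packed"})),
    ProcessingElement("accelerator", frozenset({"accel"})),
]

targets = dispatch(CodeFragment("loop_body", "packed"), ELEMENTS)
print([pe.name for pe in targets])
```

In this model a fragment with no native target returns an empty list; the claims separately describe the scheduler emulating functionality in that situation, which the sketch does not attempt.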
Opening claim text (preview).
What is claimed is:
1. A system comprising: a plurality of heterogeneous processing elements; a heterogeneous scheduler circuitry to dispatch instructions for execution on one or more of the plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements, wherein the heterogeneous scheduler circuitry is to support multiple code types including three or more of compiled, intrinsics, assembly, libraries, intermediate, and offload.
2. The system of claim 1, wherein the plurality of heterogeneous processing elements comprises an in-order processor core, an out-of-order processor core, and a packed data processor core.
3. The system of claim 2, wherein the plurality of heterogeneous processing elements further comprises an accelerator.
4. The system of claim 2, wherein for a serial program phase the selected type of processing element is an out-of-order core.
5. The system of claim 2, wherein for a data parallel program phase the selected type of processing element is a processing core to execute single instruction, multiple data (SIMD) instructions.
6. The system of claim 1, wherein the heterogeneous scheduler circuitry further comprises: a program phase detector to detect a program phase of the code fragment; wherein the plurality of heterogeneous processing elements includes a first processing element having a first microarchitecture and a second processing element having a second microarchitecture different from the first microarchitecture; wherein the program phase is one of a plurality of program phases, including a first phase and a second phase and the dispatch of instructions is based in part on the detected program phase.
7. The system of claim 1, wherein the heterogeneous scheduler circuitry further comprises: a selector to select a type of processing element of the plurality of processing elements to execute the received code fragment and schedule the code fragment on a processing element of the selected type of processing elements via dispatch.
8. The system of claim 7, wherein for a data parallel program phase the selected type of processing element is circuitry to support dense arithmetic primitives.
9. The system of claim 7, wherein for a data parallel program phase the selected type of processing element is an accelerator.
10. The system of claim 7, wherein a data parallel program phase comprises data elements that are processed simultaneously using a same control flow.
11. The system of claim 7, wherein for a thread parallel program phase the selected type of processing element is a scalar processing core.
12. The system of claim 7, wherein a thread parallel program phase comprises data dependent branches that use unique control flows.
13. The system of claim 7, wherein the selection of a type of processing element of the plurality of heterogeneous processing elements is transparent to a user.
14. The system of claim 7, wherein the selection of a type of processing element of the plurality of heterogeneous processing elements is transparent to an operating system.
15. The system of claim 7, wherein a default selection of a type of processing element of the plurality of heterogeneous processing elements is a latency optimized core.
16. The system of claim 1, wherein the code fragment is one or more instructions associated with a software thread.
17. The system of claim 16, wherein for a data parallel program phase the selected type of processing element is a processing core to execute single instruction, multiple data (SIMD) instructions.
18. The system of claim 1, wherein the heterogeneous scheduler circuitry is to emulate functionality when the selected type of processing element cannot natively handle the code fragment.
19. The system of claim 1, wherein heterogeneous scheduler circuitry is to emulate functionality when a number of hardware threads available is oversubscribed.
20. The system of claim 1, wherein the heterogeneous scheduler circuitry is to present a homogeneous multiprocessor programming model to make each thread appear to a programmer as if it is executing on a scalar core.
21. The system of claim 20, wherein the presented homogeneous multiprocessor programming model is to present an appearance of support for a full instruction set.
22. The system of claim 1, wherein the plurality of heterogeneous processing elements is to share a memory address space.
23. The system of claim 1, wherein the heterogeneous scheduler circuitry includes a binary translator.
24. The system of claim 1, wherein the heterogeneous scheduler circuitry to select a protocol to use on a multi-protocol bus interface for the dispatched instructions.
25. The system of claim 24, wherein a first protocol supported by a multi-protocol bus interface comprises a memory interface protocol to be used to access a system memory address space.
26. The system of claim 25, wherein a second protocol supported by the multi-protocol bus interface comprises a cache coherency protocol to maintain coherency between data stored in a local memory of the accelerator and a memory subsystem of a host processor including a host cache hierarchy and a system memory.
27. The system of claim 26, wherein a third protocol supported by the multi-protocol bus interface comprises a serial link protocol supporting device discovery, register access, configuration, initialization, interrupts, direct memory access, and address translation services.
28. The system of claim 27, wherein the third protocol comprises the Peripheral Component Interface Express (PCIe) protocol.
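The phase-to-element pairings recited in the dependent claims (serial phase to an out-of-order core, data parallel phase to a SIMD core, thread parallel phase to a scalar core, with a latency optimized core as the default) can be sketched as a lookup table. This is a minimal illustration, not the claimed circuitry: the enum and function names are hypothetical, and the claims also list alternatives for the data parallel phase (dense arithmetic circuitry, an accelerator) that this sketch leaves out.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Program phases named in the claims."""
    SERIAL = "serial"
    DATA_PARALLEL = "data_parallel"
    THREAD_PARALLEL = "thread_parallel"


class ElementType(Enum):
    """Processing element types named in the claims (labels are illustrative)."""
    OUT_OF_ORDER_CORE = "out_of_order_core"
    SIMD_CORE = "simd_core"
    SCALAR_CORE = "scalar_core"
    LATENCY_OPTIMIZED_CORE = "latency_optimized_core"


# One pairing per phase; the claims allow other choices for DATA_PARALLEL.
PHASE_TO_TYPE = {
    Phase.SERIAL: ElementType.OUT_OF_ORDER_CORE,
    Phase.DATA_PARALLEL: ElementType.SIMD_CORE,
    Phase.THREAD_PARALLEL: ElementType.SCALAR_CORE,
}

DEFAULT_TYPE = ElementType.LATENCY_OPTIMIZED_CORE


def select_element_type(phase: Optional[Phase]) -> ElementType:
    """Pick an element type for a detected phase, or the default if none is detected."""
    if phase is None:
        return DEFAULT_TYPE
    return PHASE_TO_TYPE.get(phase, DEFAULT_TYPE)


for detected in (Phase.SERIAL, Phase.DATA_PARALLEL, Phase.THREAD_PARALLEL, None):
    print(detected, "->", select_element_type(detected).value)
```

Treating "no phase detected" as the trigger for the default is an assumption of this sketch; the claims state only that a latency optimized core is the default selection.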
Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title
Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title
controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title
the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title
Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators · CPC title