Methods and apparatus for scheduling instructions using pre-decode data

US9798548B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9798548-B2
Application number: US-201113333879-A
Country: US
Kind code: B2
Filing date: Dec 21, 2011
Priority date: Dec 21, 2011
Publication date: Oct 24, 2017
Grant date: Oct 24, 2017

Abstract

Official abstract text for this publication.

Systems and methods for scheduling instructions using pre-decode data corresponding to each instruction. In one embodiment, a multi-core processor includes a scheduling unit in each core for selecting instructions from two or more threads each scheduling cycle for execution on that particular core. As threads are scheduled for execution on the core, instructions from the threads are fetched into a buffer without being decoded. The pre-decode data is determined by a compiler and is extracted by the scheduling unit during runtime and used to control selection of threads for execution. The pre-decode data may specify a number of scheduling cycles to wait before scheduling the instruction. The pre-decode data may also specify a scheduling priority for the instruction. Once the scheduling unit selects an instruction to issue for execution, a decode unit fully decodes the instruction.
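
The flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify an encoding, so everything concrete here is an assumption made for illustration: the 64-bit word layout (a 4-bit wait count and a 2-bit priority in the top bits), the names `pack`, `extract_predecode`, `select`, and `run`, and the tie-breaking rule. The sketch only shows the general idea that the scheduler reads the pre-decode bits of buffered instructions, uses them to pick a thread, and leaves full decoding until after selection.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical layout, not taken from the patent: the top 6 bits of a 64-bit
# instruction word hold the compiler-generated pre-decode data.
WAIT_SHIFT, WAIT_MASK = 60, 0xF   # scheduling cycles to wait before issue
PRIO_SHIFT, PRIO_MASK = 58, 0x3   # scheduling priority (higher wins)
PAYLOAD_MASK = (1 << 58) - 1      # remaining bits: the undecoded instruction


def pack(wait: int, priority: int, payload: int) -> int:
    """Stand-in for the compiler: embed pre-decode data in the word."""
    return (((wait & WAIT_MASK) << WAIT_SHIFT)
            | ((priority & PRIO_MASK) << PRIO_SHIFT)
            | (payload & PAYLOAD_MASK))


def extract_predecode(word: int) -> Tuple[int, int]:
    """Read only the pre-decode bits; the payload is not decoded here."""
    return (word >> WAIT_SHIFT) & WAIT_MASK, (word >> PRIO_SHIFT) & PRIO_MASK


@dataclass
class Thread:
    tid: int
    buffer: List[int]     # fetched, still-undecoded instruction words
    wait_left: int = -1   # -1: wait count of the head not loaded yet


def select(threads: List[Thread]) -> Optional[Thread]:
    """One scheduling cycle: pick the ready thread with the highest priority."""
    ready = []
    for t in threads:
        if not t.buffer:
            continue
        wait, prio = extract_predecode(t.buffer[0])
        if t.wait_left < 0:
            t.wait_left = wait            # first look at this instruction
        if t.wait_left == 0:
            ready.append((prio, -t.tid, t))
        else:
            t.wait_left -= 1              # keep stalling this thread
    if not ready:
        return None
    # Assumed tie-break: among equal priorities, the lower thread id wins.
    return max(ready, key=lambda r: (r[0], r[1]))[2]


def run(threads: List[Thread], cycles: int) -> List[Tuple[int, int, int]]:
    """Return (cycle, thread id, payload) for every issued instruction."""
    trace = []
    for cycle in range(cycles):
        t = select(threads)
        if t is not None:
            word = t.buffer.pop(0)
            t.wait_left = -1
            # Full decode would happen here, after selection; masking out
            # the payload is only a placeholder for that step.
            trace.append((cycle, t.tid, word & PAYLOAD_MASK))
    return trace
```

With one thread whose head instruction carries a wait count of 2 and another whose instructions carry a wait count of 0, `run` issues from the second thread during the first two cycles and from the first thread on the third.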

First claim

What is claimed is:

1. A computer-implemented method for scheduling instructions within a parallel computing machine, the method comprising: fetching instructions corresponding to two or more thread groups from an instruction cache unit; receiving pre-decode data encoded in each one of the instructions, wherein the pre-decode data is determined when the instructions are compiled; partially decoding a first instruction to decode only the pre-decode data in the first instruction; selecting, at runtime, the first instruction to issue for execution by a parallel processing unit based at least in part on the pre-decode data, the pre-decode data comprising information utilized for scheduling of the execution of the first instruction relative to execution of the other instructions; completing the decoding of the first instruction; and dispatching the first instruction to the parallel processing unit for execution.

2. The method of claim 1, wherein the pre-decode data encodes a wait scheduling hint comprising a number of scheduling cycles that transpire before the first instruction is issued for execution.

3. The method of claim 2, wherein the wait scheduling hint specifies a scheduling priority option that changes the scheduling priority for a first thread group of the two or more thread groups that is associated with the first instruction.

4. The method of claim 1, wherein the pre-decode data specifies that a default scheduling hint is used to schedule the first instruction.

5. The method of claim 1, wherein the pre-decode data encodes a hold scheduling hint that configures a scheduling unit to select the first instruction to issue over an earlier issued instruction that failed to execute and is a reissue instruction available to be issued.

6. The method of claim 1, wherein the pre-decode data encodes a hold scheduling hint that configures a scheduling unit to select to issue, over the first instruction, an earlier issued instruction that failed to execute and is a reissue instruction available to be issued.

7. The method of claim 1, wherein the pre-decode data encodes a pair scheduling hint that configures a scheduling unit to select to issue the first instruction and a second instruction in a single scheduling cycle, and wherein the first instruction and the second instruction are associated with a first thread group of the two or more thread groups.

8. A scheduling unit, comprising: an instruction cache fetch unit that is configured to route instructions corresponding to two or more thread groups to a first buffer and route pre-decode data associated with each one of the instructions to a second buffer; a macro-scheduler unit that is coupled to the instruction cache fetch unit and configured to receive pre-decode data, wherein the pre-decode data is determined when the instructions are compiled; a micro-scheduler arbiter that is coupled to the macro-scheduler unit and the second buffer and configured to select, at runtime, a first instruction for execution by a processing unit based at least in part on the pre-decode data, the pre-decode data comprising information utilized for scheduling the execution of the first instruction relative to execution of the other instructions; a decode unit coupled to the first buffer and configured to decode the first instruction by partially decoding the first instruction to decode only the pre-decode data in the first instruction, and subsequently completing the decoding of the first instruction; and a dispatch unit coupled to the decode unit and configured to dispatch the first instruction to a processing unit for execution.

9. The scheduling unit of claim 8, wherein the pre-decode data encodes a wait scheduling hint comprising a number of scheduling cycles that transpire before the first instruction is issued for execution.

10. The scheduling unit of claim 9, wherein the wait scheduling hint specifies a scheduling priority option that changes the scheduling priority for a first thread group of the two or more thread groups that is associated with the first instruction.

11. The scheduling unit of claim 8, wherein the pre-decode data specifies that a default scheduling hint is used to schedule the first instruction.

12. The scheduling unit of claim 8, wherein the pre-decode data encodes a hold scheduling hint that configures a scheduling unit to select the first instruction to issue over an earlier issued instruction that failed to execute and is a reissue instruction available to be issued.

13. The scheduling unit of claim 8, wherein the pre-decode data encodes a hold scheduling hint that configures a scheduling unit to select to issue, over the first instruction, an earlier issued instruction that failed to execute and is a reissue instruction available to be issued.

14. A computing device comprising: a parallel processing unit that includes a scheduling unit configured to: fetch instructions corresponding to two or more thread groups from an instruction cache unit; receive pre-decode data encoded in each one of the instructions, where the pre-decode data is determined when the instructions are compiled; partially decode a first instruction to decode only the pre-decode data in the first instruction; select, at runtime, the first instruction for execution by a processing unit based at least in part on the pre-decode data, the pre-decode data comprising information utilized for scheduling of the execution of the first instruction relative to execution of the other instructions; complete the decoding of the first instruction; and dispatch the instruction to the parallel processing unit for execution.

15. The computing device of claim 14, wherein the pre-decode data encodes a wait scheduling hint comprising a number of scheduling cycles that transpire before the first instruction is issued for execution.

16. The computing device of claim 15, wherein the wait scheduling hint specifies a scheduling priority option that changes the scheduling priority for a first thread group of the two or more thread groups that are associated with the first instruction.

17. The computing device of claim 14, wherein the pre-decode data specifies that a default scheduling hint is used to schedule the first instruction.

18. The computing device of claim 14, wherein the pre-decode data encodes a hold scheduling hint that configures a scheduling unit to select the first instruction to issue over an earlier issued instruction that failed to execute and is a reissue instruction available to be issued.

19. The computing device of claim 14, wherein the pre-decode data encodes a hold scheduling hint that configures a scheduling unit to select to issue, over the first instruction, an earlier issued instruction that failed to execute and is a reissue instruction available to be issued.

20. The computing device of claim 14, wherein the pre-decode data encodes a pair scheduling hint that configures a scheduling unit to select to issue the first instruction and a second instruction in a single scheduling cycle, and wherein the first instruction and the second instruction are associated with a first thread group of the two or more thread groups.
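
The four scheduling hints named in the dependent claims (default, wait, hold, pair) can be compared side by side in a small Python sketch. The claims do not define data structures or a default policy, so the `Hint` enum, the `PreDecode` record, the `prefer_new` flag, the `issue_slot` function, and the choice to issue the head instruction under the default hint are all hypothetical names and assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Hint(Enum):
    DEFAULT = "default"  # claim 4: use the scheduler's default behavior
    WAIT = "wait"        # claim 2: wait N scheduling cycles before issue
    HOLD = "hold"        # claims 5 and 6: arbitrate against a pending reissue
    PAIR = "pair"        # claim 7: issue two instructions in one cycle


@dataclass
class PreDecode:
    hint: Hint
    wait_cycles: int = 0     # meaningful for WAIT only
    prefer_new: bool = True  # HOLD only: True as in claim 5, False as in claim 6


def issue_slot(group_queue: List[str], pre: PreDecode,
               reissue: Optional[str], waited: int) -> List[str]:
    """Return what one thread group issues in this scheduling cycle.

    group_queue: pending instructions of the thread group, head first.
    reissue:     an earlier issued instruction that failed and can be reissued.
    waited:      scheduling cycles already spent waiting on the head.
    """
    head = group_queue[0]
    if pre.hint is Hint.WAIT and waited < pre.wait_cycles:
        return []                            # still waiting, nothing issues
    if pre.hint is Hint.HOLD and reissue is not None:
        return [head] if pre.prefer_new else [reissue]
    if pre.hint is Hint.PAIR and len(group_queue) >= 2:
        return group_queue[:2]               # both from the same thread group
    # Assumption: the default policy simply issues the head instruction.
    return [head]
```

For example, `issue_slot(["a", "b"], PreDecode(Hint.PAIR), None, 0)` returns both instructions, while the same queue with `PreDecode(Hint.HOLD, prefer_new=False)` and a pending reissue returns only the reissue instruction.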

Classifications

  • Instruction prefetching · CPC title

  • Pipelined decoding, e.g. using predecoding · CPC title

  • G06F9/3851 · Primary

    from multiple instruction streams, e.g. multistreaming · CPC title

  • controlled by a single instruction for multiple threads [SIMT] in parallel · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Choquette Jack Hilaire, Stoll Robert J, Giroux Olivier, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F9/3851. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 24, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).