Thread context preservation in a multithreading computer system
US-9804847-B2 · Oct 31, 2017 · US
US11126587B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11126587-B2 |
| Application number | US-201916399672-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 30, 2019 |
| Priority date | May 7, 2018 |
| Publication date | Sep 21, 2021 |
| Grant date | Sep 21, 2021 |
Official abstract text for this publication.
Representative apparatus, method, and system embodiments are disclosed for a self-scheduling processor which also provides additional functionality. Representative embodiments include a self-scheduling processor, comprising: a processor core adapted to execute a received instruction; and a core control circuit adapted to automatically schedule an instruction for execution by the processor core in response to a received work descriptor data packet. In another embodiment, the core control circuit is also adapted to schedule a fiber create instruction for execution by the processor core, to reserve a predetermined amount of memory space in a thread control memory to store return arguments, and to generate one or more work descriptor data packets to another processor or hybrid threading fabric circuit for execution of a corresponding plurality of execution threads. Event processing, data path management, system calls, memory requests, and other new instructions are also disclosed.
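
The self-scheduling behavior summarized in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the class names `WorkDescriptor` and `CoreControl`, the two-instruction toy program format (`("add", n)` and `("ret",)`), and the dictionary-based thread control memory are all hypothetical and chosen for illustration. It shows a work descriptor packet arriving, a free thread identifier being assigned, and the execution queue being serviced one instruction at a time.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkDescriptor:
    """Hypothetical stand-in for a work descriptor data packet."""
    initial_pc: int   # initial program count carried by the packet
    first_arg: int    # first argument carried by the packet


class CoreControl:
    """Toy software model of a core control circuit (illustrative only)."""

    def __init__(self, program, num_tids=4):
        self.program = program                   # toy instructions: ("add", n) or ("ret",)
        self.tid_pool = deque(range(num_tids))   # available thread identifiers
        self.pc = {}                             # per-TID program count
        self.reg = {}                            # per-TID general purpose register
        self.exec_queue = deque()                # TIDs currently in the valid state
        self.trace = []                          # (tid, pc) for each instruction executed
        self.returns = []                        # (tid, value) for each completed thread

    def receive(self, packet):
        """Assign a free TID, store pc and argument indexed by TID, enqueue the TID."""
        if not self.tid_pool:
            raise RuntimeError("no thread identifier available")
        tid = self.tid_pool.popleft()
        self.pc[tid] = packet.initial_pc
        self.reg[tid] = packet.first_arg
        self.exec_queue.append(tid)
        return tid

    def step(self):
        """Select the next TID and execute exactly one instruction of its thread."""
        tid = self.exec_queue.popleft()
        pc = self.pc[tid]
        op = self.program[pc]
        self.trace.append((tid, pc))
        if op[0] == "add":
            self.reg[tid] += op[1]
            self.pc[tid] = pc + 1
            self.exec_queue.append(tid)          # still valid: back into the queue
        elif op[0] == "ret":
            self.returns.append((tid, self.reg[tid]))
            self.tid_pool.append(tid)            # thread finished: release its TID
        else:
            raise ValueError(f"unknown instruction {op!r}")

    def run(self):
        """Keep selecting TIDs until no valid thread remains."""
        while self.exec_queue:
            self.step()
```

In this model, delivering two packets and calling `run()` interleaves the two threads instruction by instruction, because each selected thread identifier goes to the back of the queue after a single instruction. A real core would do this in hardware and would also handle events, memory requests, and fiber creation, none of which are modeled here.
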
Opening claim text (preview).
It is claimed:

1. A processor, comprising: a processor core adapted to execute a plurality of instructions; and a core control circuit coupled to the processor core, the core control circuit comprising: an interconnection network interface coupleable to an interconnection network to receive a work descriptor data packet, to receive an event data packet, and the interconnection network interface adapted to decode the received event data packet into an event identifier; a thread control memory comprising a plurality of registers, the plurality of registers comprising a thread identifier pool register storing a plurality of thread identifiers, a program count register storing an initial program count, a data cache storing cached data, and a general purpose register storing a received first argument; an execution queue coupled to the thread control memory, the execution queue storing all thread identifiers of the plurality of thread identifiers having a valid state; and a control logic and thread selection circuit coupled to the execution queue, the control logic and thread selection circuit adapted, in response to receiving the work descriptor data packet having the initial program count and the received first argument, to automatically assign an available thread identifier of the plurality of thread identifiers to a corresponding execution thread of a plurality of execution threads, to automatically place each assigned thread identifier of the plurality of thread identifiers in the execution queue, and to automatically and periodically select each thread identifier of the plurality of thread identifiers in the execution queue for execution by the processor core of a single instruction of the corresponding execution thread, of the plurality of instructions, the processor core automatically commencing execution of the single instruction corresponding to the initial program count and using the received first argument stored in the general purpose register.
2. The processor of claim 1, wherein the interconnection network interface is further adapted to decode the received work descriptor data packet into the initial program count and the received first argument.
3. The processor of claim 1, wherein the interconnection network interface is further adapted to generate a return work descriptor packet for a selected execution thread of the plurality of execution threads in response to an execution of a return instruction by the processor core to complete execution of the selected execution thread.
4. The processor of claim 1, wherein the control logic and thread selection circuit is further adapted to automatically and repeatedly select each thread identifier of the plurality of thread identifiers in the execution queue for single instruction execution by the processor core of each execution thread while its thread identifier is in the execution queue.
5. The processor of claim 1, wherein the interconnection network interface is further adapted to decode the received event data packet into a received second argument, and the control logic and thread selection circuit is further adapted, in response to receiving the event data packet, to automatically commence execution by the processor core of a corresponding single instruction, of the plurality of instructions, using the received second argument.
6. The processor of claim 1, wherein the interconnection network interface is further adapted to store the initial program count and the received first argument in the thread control memory, for each execution thread of the plurality of an execution threads, using the assigned thread identifier as an index to the thread control memory.
7. The processor of claim 1, wherein the interconnection network interface is further adapted to generate and to receive a point-to-point event data message and a broadcast event data message.
8. The processor of claim 1, wherein the thread control memory further comprises: an event state register storing a plurality of event receive states, the plurality of event receive states comprising a receive mode, a counter or channel number, and an event data value; and an event mask register storing at least one event mask.
9. The processor of claim 8, wherein the control logic and thread selection circuit is further adapted to use the at least one event mask stored in the event mask register to respond to the received event data packet to trigger execution or wake up a selected execution thread of the plurality of execution threads.
10. The processor of claim 1, wherein the control logic and thread selection circuit is further adapted to determine an event number corresponding to the received event data packet.
11. The processor of claim 10, wherein the control logic and thread selection circuit is further adapted to change the status of a selected thread identifier of the plurality of thread identifiers from a pause state to the valid state in response to the event number of the received event data packet to return the selected thread identifier to the execution queue to resume execution of the corresponding execution thread.
12. The processor of claim 1, wherein the control logic and thread selection circuit is further adapted to change the status of a selected thread identifier of the plurality of thread identifiers from a pause state to the valid state in response to the received event data packet decrementing an event count to return the selected thread identifier to the execution queue to resume execution of the corresponding execution thread.
13. The processor of claim 1, wherein the processor core is further adapted to execute a fiber create instruction and wherein the core control circuit is further adapted to generate one or more work descriptor data packets to another processor or hybrid threading fabric circuit for execution of a corresponding plurality of execution threads, each work descriptor packet of the one or more work descriptor data packets having a program count and one or more arguments or memory addresses.
14. The processor of claim 13, wherein the control logic and thread selection circuit is further adapted to reserve a predetermined amount of memory space in the thread control memory to store return arguments.
15. The processor of claim 1, wherein the control logic and thread selection circuit is further adapted to assign the valid state or a pause state to each assigned thread identifier of the plurality of thread identifiers, and for as long as the valid state remains, to return the valid state thread identifier to the execution queue for continued single instruction execution by the processor core, and to pause thread execution by not returning the pause state thread identifier to the execution queue until its state has returned to valid.
16. The processor of claim 1, wherein the thread control memory further comprises a register selected from the group consisting of: a thread state register; a pending fiber return count register; a return argument buffer or register; a return argument link list register; a custom atomic transaction identifier register; an event received mask register; an event state register; and combinations thereof.
17. The processor of claim 15, wherein the control logic and thread selection circuit is further adapted to assign the pause state to a thread identifier, of the plurality of thread identifiers, when the processor core has executed a memory load instruction or a memory store instruction for the corresponding execution thread.
18. The processor of claim 15, wherein the control logi
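
The pause and valid states and the event-driven wake-up recited in the claims can be sketched in software as follows. This is a hedged illustration rather than the claimed hardware: the `EventScheduler` class, its method names, and the use of Python sets and dictionaries in place of the event mask and event state registers are assumptions made for the example. A paused thread identifier stays out of the execution queue, each matching event decrements its event count, and the identifier returns to the queue once the count reaches zero.

```python
from collections import deque
from enum import Enum


class State(Enum):
    VALID = "valid"
    PAUSE = "pause"


class EventScheduler:
    """Toy model of pause/valid thread states with event wake-up (illustrative only)."""

    def __init__(self):
        self.state = {}            # TID -> State
        self.event_mask = {}       # TID -> set of event numbers the thread waits on
        self.event_count = {}      # TID -> matching events still needed before wake-up
        self.exec_queue = deque()  # only TIDs in the valid state

    def add_thread(self, tid):
        """Register a thread in the valid state and place its TID in the queue."""
        self.state[tid] = State.VALID
        self.exec_queue.append(tid)

    def pause(self, tid, event_numbers, count=1):
        """Pause a thread until `count` events matching its mask have arrived."""
        self.state[tid] = State.PAUSE
        self.event_mask[tid] = set(event_numbers)
        self.event_count[tid] = count
        if tid in self.exec_queue:
            self.exec_queue.remove(tid)   # a paused TID is not kept in the queue

    def receive_event(self, event_number):
        """Apply one event; return the TIDs that changed from pause to valid."""
        woken = []
        for tid, st in self.state.items():
            if st is State.PAUSE and event_number in self.event_mask[tid]:
                self.event_count[tid] -= 1
                if self.event_count[tid] == 0:
                    self.state[tid] = State.VALID
                    self.exec_queue.append(tid)   # resume: back into the queue
                    woken.append(tid)
        return woken
```

Keeping paused identifiers out of the queue, instead of skipping them at selection time, mirrors the claim language about not returning a pause-state thread identifier to the execution queue until its state is valid again.
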
Instruction code · CPC title
data or demand driven · CPC title
with dedicated cache, e.g. instruction or stack · CPC title
Message passing systems or structures, e.g. queues · CPC title
Event management; Broadcasting; Multicasting; Notifications · CPC title