Memory performance when speculation control is enabled, and instruction therefor
US-2015378915-A1 · Dec 31, 2015 · US
US10067765B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10067765-B2 |
| Application number | US-201213470386-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 14, 2012 |
| Priority date | May 14, 2012 |
| Publication date | Sep 4, 2018 |
| Grant date | Sep 4, 2018 |
Official abstract text for this publication.
Mechanisms are provided, in a processor, for executing instructions that are younger than a previously dispatched synchronization (sync) instruction. An instruction sequencer unit of the processor dispatches a sync instruction. The sync instruction is sent to a nest of one or more devices outside of the processor. The instruction sequencer unit dispatches a subsequent instruction after dispatching the sync instruction. The dispatching of the subsequent instruction after dispatching the sync instruction is performed prior to receiving a sync acknowledgement response from the nest. The instruction sequencer unit performs a completion of the subsequent instruction based on whether completion of the subsequent instruction is dependent upon receiving the sync acknowledgement from the nest and completion of the sync instruction.
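As a rough illustration of the completion policy the abstract describes, the following Python sketch models a sequencer that keeps dispatching after a sync and decides at completion time whether an instruction must wait for the acknowledgement. It follows the store/load distinction drawn in claim 1. The class and method names (`Sequencer`, `receive_sync_ack`, `complete_ready`), the single outstanding sync, and strict in-order completion are assumptions made for the example, not details taken from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    SYNC = "sync"
    LOAD = "load"
    STORE = "store"


@dataclass
class Instr:
    tag: int
    kind: Kind
    after_sync: bool = False   # dispatched while a sync was still unacknowledged
    completed: bool = False


class Sequencer:
    """Toy model (hypothetical names): one outstanding sync, in-order completion."""

    def __init__(self):
        self.queue = []          # dispatched instructions in program order
        self.sync_acked = True   # no sync outstanding at start
        self.log = []            # tags in the order they completed

    def dispatch(self, instr):
        # Dispatch never waits for the acknowledgement.
        instr.after_sync = not self.sync_acked
        if instr.kind is Kind.SYNC:
            self.sync_acked = False   # sync sent to the nest, ack now pending
        self.queue.append(instr)

    def receive_sync_ack(self):
        self.sync_acked = True

    def complete_ready(self):
        """Complete instructions in order until one has to wait for the ack."""
        for instr in self.queue:
            if instr.completed:
                continue
            # A load younger than an unacknowledged sync must wait.
            if instr.kind is Kind.LOAD and instr.after_sync and not self.sync_acked:
                break
            instr.completed = True
            self.log.append(instr.tag)
        return list(self.log)


seq = Sequencer()
for tag, kind in enumerate([Kind.SYNC, Kind.STORE, Kind.LOAD, Kind.STORE]):
    seq.dispatch(Instr(tag, kind))

before_ack = seq.complete_ready()   # sync and the younger store complete
seq.receive_sync_ack()
after_ack = seq.complete_ready()    # the load and everything behind it complete
```

In this model the store with tag 1 completes before the acknowledgement arrives, while the load with tag 2 (and, because completion is assumed to be in order, the store behind it) waits until `receive_sync_ack` is called.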
Opening claim text (preview).
What is claimed is:
1. A method, in a processor, for executing instructions that are younger than a previously dispatched synchronization (sync) instruction, comprising: dispatching, by an instruction sequencer unit of the processor, a sync instruction; sending the sync instruction to a nest of one or more devices outside of the processor; dispatching, by the instruction sequencer unit, a subsequent instruction after dispatching the sync instruction, wherein the dispatching of the subsequent instruction after dispatching the sync instruction is performed prior to receiving a sync acknowledgement response from the nest; performing, by the instruction sequencer unit, a completion of execution of the subsequent instruction and update of an architected state by committing results of the execution of the subsequent instruction to memory, based on whether completion of the subsequent instruction is dependent upon receiving the sync acknowledgement from the nest and completion of the sync instruction, wherein the subsequent instruction is one of a store instruction or a load instruction, and wherein: in response to the subsequent instruction being a store instruction, performing completion of execution of the subsequent instruction and update of the architected state comprises completing execution of the store instruction and updating the architected state by committing results of the execution of the store instruction to memory after completion of the sync instruction but prior to receiving the sync acknowledgement from the nest, and in response to the subsequent instruction being a load instruction, performing completion of execution of the subsequent instruction and update of the architected state comprises delaying completion of the execution of the subsequent instruction until after completion of the sync instruction and after receiving the sync acknowledgement from the nest; and in response to dispatching the sync instruction, setting a sync_pending status bit to indicate that there is an outstanding sync instruction that has not been acknowledged, wherein performing a completion of execution of the subsequent instruction and update of the architected state further comprises: determining whether the sync_pending status bit is set; determining whether a younger_load_dispatched bit is set or not set, wherein the younger_load_dispatched bit is set in response to a load instruction being dispatched after the sync instruction is dispatched by the instruction sequencer unit; and in response to determining that the sync_pending status bit is set, and the younger_load_dispatched bit is not set, setting a sync_dependent bit in a global completion table in association with the subsequent instruction to thereby indicate that the subsequent instruction is dependent upon the sync instruction acknowledgement.
2. The method of claim 1, wherein, in response to the subsequent instruction being a load instruction and in response to receiving the sync acknowledgement from the nest, performing completion of execution of the subsequent instruction and update of the architected state further comprises: checking, by a load store unit of the processor, for a snoop hit on the load instruction; in response to there not being a snoop hit on the load instruction, completing execution of the load instruction; and in response to there being a snoop hit on the load instruction, requesting, by the load store unit, a flush of a processor pipeline at a point of an oldest snooped load instruction that is younger than the sync instruction.
3. The method of claim 1, wherein all subsequent instructions that are non-load/non-sync instructions younger than the sync instruction are permitted to complete execution and update the architected state without waiting to receive the sync acknowledgement.
4. The method of claim 1, wherein: dispatching, by an instruction sequencer unit of the processor, a sync instruction further comprises setting a sync pending status bit; and dispatching, by the instruction sequencer unit, a subsequent instruction after dispatching the sync instruction further comprises setting a status bit corresponding to the dispatched subsequent instruction.
5. The method of claim 4, wherein: performing a completion of execution of the subsequent instruction and update of the architected state comprises determining if a next to complete instruction is a load instruction or a sync instruction; in response to a next to complete instruction being a load instruction or a sync instruction, determining if the sync pending status bit is set or not; and completing, or stalling completion, of execution of the subsequent instruction and updating the architected state based on results of determining if the sync pending status bit is set or not.
6. The method of claim 5, wherein if the sync pending status bit is set, then completion of execution of the subsequent instruction and update of the architected state is stalled until the sync acknowledgement pending bit is reset.
7. The method of claim 5, wherein if the sync pending status bit is not set, then completion of execution of the subsequent instruction and update of the architected state is performed and the sync acknowledgement bit is reset.
8. A processor, comprising: first hardware logic, in an instruction sequencer unit of the processor, configured to dispatch a sync instruction; second hardware logic, in the instruction sequencer unit, configured to send the sync instruction to a nest of one or more devices outside of the processor; third hardware logic, in the instruction sequencer unit, configured to dispatch a subsequent instruction after dispatching the sync instruction, wherein the dispatching of the subsequent instruction after dispatching the sync instruction is performed prior to receiving a sync acknowledgement response from the nest; fourth hardware logic, in the instruction sequencer unit, configured to perform a completion of execution of the subsequent instruction and update of an architected state by committing results of the execution of the subsequent instruction to memory, based on whether completion of the subsequent instruction is dependent upon receiving the sync acknowledgement from the nest and completion of the sync instruction, wherein the subsequent instruction is one of a store instruction or a load instruction, and wherein: in response to the subsequent instruction being a store instruction, the fourth hardware logic is configured to perform completion of execution of the subsequent instruction and update of the architected state comprises completing execution of the store instruction and updating the architected state by committing results of the execution of the store instruction to memory after completion of the sync instruction but prior to receiving the sync acknowledgement from the nest, and in response to the subsequent instruction being a load instruction, the fourth hardware logic is configured to perform completion of execution of the subsequent instruction and update of the architected state comprises delaying completion of the execution of the subsequent instruction until after completion of the sync instruction and after receiving the sync acknowledgement from the nest; and in response to dispatching the sync instruction, the third hardware logic is configured to set a sync_pending status bit to indicate that there is an outstanding sync instruction that has not been acknowledged, wherein performing a completion of execution of the subsequent instruction and update of the architected state further comprises: determining whether the sync_pending status bit is set; determining whether a younger_load_dispatched bit is set or not set, wherein the younger_load_dispatched bit
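The status-bit bookkeeping recited in claim 1, together with the snoop-hit check of claim 2, can be sketched as a small Python model. The bit names `sync_pending`, `younger_load_dispatched`, and `sync_dependent` come from the claims; everything else (`GctEntry`, `DispatchState`, the `snooped` flag, `sync_index`) is hypothetical. The sketch also assumes one reading of the claim language: only the first load dispatched after the sync is marked `sync_dependent`, on the theory that in-order completion holds later loads behind it. Other readings are possible.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GctEntry:
    """One global completion table entry (illustrative layout only)."""
    tag: int
    kind: str                      # "sync", "load", "store", or "other"
    sync_dependent: bool = False   # completion must wait for the sync ack
    snooped: bool = False          # hypothetical: LSU recorded a snoop hit


@dataclass
class DispatchState:
    sync_pending: bool = False
    younger_load_dispatched: bool = False
    sync_index: Optional[int] = None   # position of the outstanding sync
    gct: list = field(default_factory=list)

    def dispatch(self, tag, kind):
        entry = GctEntry(tag, kind)
        if kind == "sync":
            self.sync_pending = True
            self.younger_load_dispatched = False
            self.sync_index = len(self.gct)
        elif kind == "load" and self.sync_pending:
            # Assumed reading: mark only the first load after the sync.
            if not self.younger_load_dispatched:
                entry.sync_dependent = True
            self.younger_load_dispatched = True
        self.gct.append(entry)
        return entry

    def sync_ack(self):
        """Handle the ack; return the tag to flush from, or None."""
        if self.sync_index is None:
            return None
        younger = self.gct[self.sync_index + 1:]
        for entry in younger:
            entry.sync_dependent = False
        self.sync_pending = False
        self.younger_load_dispatched = False
        self.sync_index = None
        # Oldest snooped load younger than the sync, if there is one.
        return next(
            (e.tag for e in younger if e.kind == "load" and e.snooped), None
        )


state = DispatchState()
state.dispatch(0, "sync")
store = state.dispatch(1, "store")
first_load = state.dispatch(2, "load")
second_load = state.dispatch(3, "load")
marked = (store.sync_dependent, first_load.sync_dependent,
          second_load.sync_dependent)
second_load.snooped = True          # pretend the LSU saw a snoop hit
flush_tag = state.sync_ack()
```

Here the store is never marked, the first younger load carries the `sync_dependent` bit, and the acknowledgement handler reports the oldest snooped younger load as the flush point. With no snoop hit it returns `None` and the loads are free to complete.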
Synchronisation or serialisation instructions · CPC title
Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00) · CPC title
Maintaining memory consistency · CPC title
Operand accessing · CPC title
Physics · mapped topic