MFENCE and LFENCE micro-architectural implementation method and system
US-8959314-B2 · Feb 17, 2015 · US
US9342310B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9342310-B2 |
| Application number | US-201313838229-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 15, 2013 |
| Priority date | Dec 30, 1999 |
| Publication date | May 17, 2016 |
| Grant date | May 17, 2016 |
Official abstract text for this publication.
A system and method for fencing memory accesses. Memory loads can be fenced, or all memory access can be fenced. The system receives a fencing instruction that separates memory access instructions into older accesses and newer accesses. A buffer within the memory ordering unit is allocated to the instruction. The access instructions newer than the fencing instruction are stalled. The older access instructions are gradually retired. When all older memory accesses are retired, the fencing instruction is dispatched from the buffer.
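The sequence in the abstract (allocate a buffer to the fence, stall newer accesses, retire older accesses, then dispatch the fence) can be sketched as a toy timing model. This is a minimal illustration, not the patented micro-architecture: the `Op` record, the `latency` values, the `simulate` function, and the rule that fences dispatch in program order are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str
    kind: str         # "load", "store", "lfence", or "mfence"
    latency: int = 0  # hypothetical cycles until an access retires; unused for fences


# Which access kinds each fence orders, following the abstract:
# loads only (LFENCE) or all memory accesses (MFENCE).
FENCED_KINDS = {"lfence": {"load"}, "mfence": {"load", "store"}}


def simulate(program):
    """Return (name, time) pairs in the order the events complete."""
    events = []    # (time, program index, name)
    fences = []    # (fence kind, dispatch time) for fences seen so far
    finished = []  # (access kind, retire time) for accesses seen so far
    for index, op in enumerate(program):
        if op.kind in FENCED_KINDS:
            covered = FENCED_KINDS[op.kind]
            # The fence waits in its buffer until every older access it covers has retired.
            older = [t for kind, t in finished if kind in covered]
            older += [t for _, t in fences]  # assumption: fences dispatch in program order
            dispatch = max(older, default=0)
            fences.append((op.kind, dispatch))
            events.append((dispatch, index, op.name))
        else:
            # A newer access is stalled until every older fence covering its kind has dispatched.
            start = max(
                (t for kind, t in fences if op.kind in FENCED_KINDS[kind]),
                default=0,
            )
            retire = start + op.latency
            finished.append((op.kind, retire))
            events.append((retire, index, op.name))
    events.sort()
    return [(name, time) for time, _, name in events]


def example(fence_kind):
    # Two older accesses, one fence, two newer accesses (all names are illustrative).
    return [
        Op("ld_a", "load", 5),
        Op("st_b", "store", 2),
        Op("fence", fence_kind),
        Op("ld_c", "load", 1),
        Op("st_d", "store", 1),
    ]


for fence_kind in ("mfence", "lfence"):
    print(fence_kind, simulate(example(fence_kind)))
```

In this model the MFENCE variant holds both `ld_c` and `st_d` until the fence dispatches at time 5, while the LFENCE variant lets `st_d` retire at time 1 because only loads are fenced.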
Opening claim text (preview).
What is claimed is:
1. A processor comprising: an instruction prefetch unit to prefetch a cache line of data responsive to a PREFETCH instruction and to store the cache line in a data cache; an instruction fetch unit to fetch a memory load fence (LFENCE) instruction that does not use a mask field thereof, a memory fence (MFENCE) instruction that does not use a mask field thereof, and a cache line flush (CLFLUSH) instruction; a first memory ordering portion of the processor responsive to the LFENCE instruction to prevent newer memory load instructions occurring after the LFENCE instruction in program order from being globally visible before older memory load instructions occurring before the LFENCE instruction in the program order are globally visible without causing the processor to stall dispatch of a newer memory store instruction occurring after the LFENCE instruction in the program order; and a second memory ordering portion of the processor responsive to the MFENCE instruction to prevent a CLFLUSH instruction which follows the MFENCE instruction in program order from being globally visible until a PREFETCH instruction preceding the MFENCE instruction in program order has become globally visible.
2. The processor of claim 1, wherein the LFENCE instruction and MFENCE instructions are treated as non-operations (NOPs) by the processor after being dispatched once the older memory load instructions are globally visible, and wherein the older memory load instructions are globally visible but not necessarily completed.
3. The processor of claim 1, wherein the LFENCE instruction and MFENCE instructions comprise macroinstructions.
4. The processor of claim 1, wherein the MFENCE instruction is to cause the processor to ensure that all older memory load instructions and all older memory store instructions, which are each older than the MFENCE instruction in the program order, are globally visible, before all newer memory load instructions and all newer memory store instructions, which are each newer than the MFENCE instruction in the program order, are globally visible.
5. The processor of claim 1, implemented in a computer system also including a graphics processor.
6. The processor of claim 1, wherein the LFENCE instruction does not use a data field thereof.
7. The processor of claim 1, wherein the MFENCE instruction does not use a data field thereof.
8. A processor comprising: an instruction prefetch unit to prefetch a cache line of data responsive to a PREFETCH instruction and to store the cache line in a data cache; and a decoder to decode instructions including a memory fence (MFENCE) instruction wherein the MFENCE instruction does not use a mask field thereof and a cache line flush (CLFLUSH) instruction, the MFENCE instruction to cause the processor to ensure that all older memory load instructions and all older memory store instructions, which are each older than the MFENCE instruction in program order, are globally visible, before all newer memory load instructions and all newer memory store instructions, which are each newer than the MFENCE instruction in the program order, are globally visible, the MFENCE instruction further to prevent a CLFLUSH instruction which follows the MFENCE instruction in program order from becoming globally visible until a PREFETCH instruction preceding the MFENCE instruction in program order has become globally visible.
9. The processor of claim 8, wherein the MFENCE instruction does not use a mask field thereof, and wherein the older memory load instructions are globally visible but not necessarily completed.
10. The processor of claim 8, wherein the MFENCE instruction is treated as a non-operation (NOP) by the processor after being dispatched after all of the older memory load instructions and memory store instructions are globally visible.
11. The processor of claim 8, wherein the MFENCE instruction is also to cause the processor to ensure that an older CLFLUSH instruction that is older than the MFENCE instruction in the program order is globally visible before a newer CLFLUSH instruction that is newer than the MFENCE instruction in the program order is globally visible.
12. The processor of claim 8, wherein the MFENCE instruction comprises a macroinstruction.
13. The processor of claim 8, implemented in a computer system also including a graphics processor.
14. The processor of claim 8, wherein the MFENCE instruction does not use a data field thereof.
15. A processor comprising: instruction prefetch circuitry to prefetch a cache line of data responsive to a PREFETCH instruction and to store the cache line in a data cache; an instruction fetch unit to fetch a load fence (LFENCE) instruction that does not use a mask field thereof and a memory fence (MFENCE) instruction that does not use a mask field thereof; and a decoder to decode the LFENCE instruction and to decode the MFENCE instruction, wherein a portion of the processor is responsive to the LFENCE instruction to prevent newer memory load instructions occurring after the LFENCE instruction in program order from being globally visible before older memory load instructions occurring before the LFENCE instruction in the program order are globally visible, without causing the processor to stall dispatch of a newer memory store instruction occurring after the LFENCE instruction in the program order, and the portion of the processor responsive to the MFENCE instruction is to ensure that all older memory load instructions and all older memory store instructions, which are each older than the MFENCE instruction in the program order, are globally visible, before all newer memory load instructions and all newer memory store instructions, which are each newer than the MFENCE instruction in the program order, are globally visible, the portion of the processor responsive to the MFENCE instruction further to prevent a CLFLUSH instruction which follows the MFENCE instruction in program order from becoming globally visible until a PREFETCH instruction preceding the MFENCE instruction in program order has become globally visible.
16. The processor of claim 15, wherein the LFENCE instruction and the MFENCE instruction are each treated as a non-operation (NOP) after being dispatched.
17. The processor of claim 15, wherein the MFENCE instruction is to guarantee strong ordering with respect to a cache line flush instruction but the LFENCE instruction is not to guarantee strong ordering with respect to the cache line flush instruction.
18. The processor of claim 15, wherein the LFENCE instruction does not use a data field thereof.
19. The processor of claim 15, wherein the MFENCE instruction does not use a data field thereof.
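The ordering guarantees recited in the claims can be read as constraints on the order in which instructions become globally visible. The sketch below checks a proposed visibility order against those constraints. It is a simplified reading, not a statement of claim scope: the `ORDERED_BY` table, the `violations` function, and the choice to treat MFENCE as ordering loads, stores, PREFETCH, and CLFLUSH uniformly against each other are assumptions made for illustration, and instruction names are assumed to be unique.

```python
# Access kinds each fence orders, as loosely read from claims 1, 4, 11, and 17.
# Assumption: MFENCE is modeled as ordering every listed kind against every other.
ORDERED_BY = {
    "lfence": {"load"},
    "mfence": {"load", "store", "prefetch", "clflush"},
}


def violations(program, visible_order):
    """List (older, newer) name pairs whose global-visibility order breaks a fence.

    program:       (name, kind) tuples in program order, fences included.
    visible_order: names of non-fence instructions in the order they became visible.
    """
    position = {name: i for i, name in enumerate(visible_order)}
    found = []
    for f, (_, fence_kind) in enumerate(program):
        kinds = ORDERED_BY.get(fence_kind)
        if kinds is None:
            continue  # not a fence
        older = [n for n, k in program[:f] if k in kinds]
        newer = [n for n, k in program[f + 1:] if k in kinds]
        for o in older:
            for n in newer:
                # A newer instruction became visible before an older one it is fenced from.
                if position[n] < position[o]:
                    found.append((o, n))
    return found


# MFENCE: a CLFLUSH after the fence must not become visible before an older PREFETCH.
mfence_program = [
    ("pf", "prefetch"),
    ("ld1", "load"),
    ("fence", "mfence"),
    ("flush", "clflush"),
    ("ld2", "load"),
]
print(violations(mfence_program, ["pf", "ld1", "flush", "ld2"]))
print(violations(mfence_program, ["flush", "pf", "ld1", "ld2"]))

# LFENCE: a newer store may pass the fence, a newer load may not.
lfence_program = [
    ("ld1", "load"),
    ("fence", "lfence"),
    ("st", "store"),
    ("ld2", "load"),
]
print(violations(lfence_program, ["st", "ld1", "ld2"]))
print(violations(lfence_program, ["ld2", "ld1", "st"]))
```

The LFENCE case mirrors the claim language about not stalling a newer store: the order with `st` first is accepted, while the order with `ld2` ahead of `ld1` is flagged.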
Physics · mapped topic
Prefetch instructions; cache control instructions · CPC title
Instruction analysis, e.g. decoding, instruction word fields · CPC title
LOAD or STORE instructions; Clear instruction · CPC title
Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title