MFENCE and LFENCE micro-architectural implementation method and system

US9342310B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9342310-B2
Application number: US-201313838229-A
Country: US
Kind code: B2
Filing date: Mar 15, 2013
Priority date: Dec 30, 1999
Publication date: May 17, 2016
Grant date: May 17, 2016

Abstract

Official abstract text for this publication.

A system and method for fencing memory accesses. Memory loads can be fenced, or all memory access can be fenced. The system receives a fencing instruction that separates memory access instructions into older accesses and newer accesses. A buffer within the memory ordering unit is allocated to the instruction. The access instructions newer than the fencing instruction are stalled. The older access instructions are gradually retired. When all older memory accesses are retired, the fencing instruction is dispatched from the buffer.
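The sequence in the abstract (allocate a buffer to the fence, stall newer accesses, retire older ones, then dispatch the fence) can be sketched as a toy cycle-stepped model. This is an illustrative simplification, not the patented micro-architecture: the function name `simulate_fence`, the per-access latencies, and the single `"FENCE"` marker are all assumptions made for the example.

```python
def simulate_fence(program):
    """Step a toy memory ordering unit cycle by cycle (illustrative only).

    program: list of (name, latency) tuples in program order. Exactly one
    entry is named "FENCE"; its latency is ignored. Names are assumed to be
    unique and latencies to be at least 1 cycle.
    Returns a list of (cycle, event, name) tuples sorted by cycle.
    """
    fence_index = next(i for i, (name, _) in enumerate(program) if name == "FENCE")
    older = dict(program[:fence_index])         # in flight: name -> cycles remaining
    stalled = list(program[fence_index + 1:])   # newer accesses, held back by the fence
    fence_buffer = "FENCE"                      # buffer entry allocated to the fence

    events = []
    cycle = 0
    # Older accesses retire gradually, each after its own latency.
    while older:
        cycle += 1
        for name in list(older):
            older[name] -= 1
            if older[name] <= 0:
                events.append((cycle, "retire", name))
                del older[name]

    # Every older access has retired, so the fence leaves its buffer.
    events.append((cycle, "dispatch", fence_buffer))

    # Newer accesses may only start once the fence has been dispatched.
    for name, latency in stalled:
        events.append((cycle + latency, "retire", name))

    events.sort(key=lambda event: event[0])     # stable sort keeps same-cycle order
    return events


if __name__ == "__main__":
    demo = [("load A", 3), ("store B", 1), ("FENCE", 0), ("load C", 1)]
    for event in simulate_fence(demo):
        print(event)
```

In this model `load C` would finish after one cycle if nothing held it back; with the fence in place it cannot finish before cycle 4, because the fence waits for `load A` to retire at cycle 3.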

First claim

Opening claim text (preview).

What is claimed is:

1. A processor comprising: an instruction prefetch unit to prefetch a cache line of data responsive to a PREFETCH instruction and to store the cache line in a data cache; an instruction fetch unit to fetch a memory load fence (LFENCE) instruction that does not use a mask field thereof, a memory fence (MFENCE) instruction that does not use a mask field thereof, and a cache line flush (CLFLUSH) instruction; a first memory ordering portion of the processor responsive to the LFENCE instruction to prevent newer memory load instructions occurring after the LFENCE instruction in program order from being globally visible before older memory load instructions occurring before the LFENCE instruction in the program order are globally visible without causing the processor to stall dispatch of a newer memory store instruction occurring after the LFENCE instruction in the program order; and a second memory ordering portion of the processor responsive to the MFENCE instruction to prevent a CLFLUSH instruction which follows the MFENCE instruction in program order from being globally visible until a PREFETCH instruction preceding the MFENCE instruction in program order has become globally visible.

2. The processor of claim 1, wherein the LFENCE instruction and MFENCE instructions are treated as non-operations (NOPs) by the processor after being dispatched once the older memory load instructions are globally visible, and wherein the older memory load instructions are globally visible but not necessarily completed.

3. The processor of claim 1, wherein the LFENCE instruction and MFENCE instructions comprise macroinstructions.

4. The processor of claim 1, wherein the MFENCE instruction is to cause the processor to ensure that all older memory load instructions and all older memory store instructions, which are each older than the MFENCE instruction in the program order, are globally visible, before all newer memory load instructions and all newer memory store instructions, which are each newer than the MFENCE instruction in the program order, are globally visible.

5. The processor of claim 1, implemented in a computer system also including a graphics processor.

6. The processor of claim 1, wherein the LFENCE instruction does not use a data field thereof.

7. The processor of claim 1, wherein the MFENCE instruction does not use a data field thereof.

8. A processor comprising: an instruction prefetch unit to prefetch a cache line of data responsive to a PREFETCH instruction and to store the cache line in a data cache; and a decoder to decode instructions including a memory fence (MFENCE) instruction wherein the MFENCE instruction does not use a mask field thereof and a cache line flush (CLFLUSH) instruction, the MFENCE instruction to cause the processor to ensure that all older memory load instructions and all older memory store instructions, which are each older than the MFENCE instruction in program order, are globally visible, before all newer memory load instructions and all newer memory store instructions, which are each newer than the MFENCE instruction in the program order, are globally visible, the MFENCE instruction further to prevent a CLFLUSH instruction which follows the MFENCE instruction in program order from becoming globally visible until a PREFETCH instruction preceding the MFENCE instruction in program order has become globally visible.

9. The processor of claim 8, wherein the MFENCE instruction does not use a mask field thereof, and wherein the older memory load instructions are globally visible but not necessarily completed.

10. The processor of claim 8, wherein the MFENCE instruction is treated as a non-operation (NOP) by the processor after being dispatched after all of the older memory load instructions and memory store instructions are globally visible.

11. The processor of claim 8, wherein the MFENCE instruction is also to cause the processor to ensure that an older CLFLUSH instruction that is older than the MFENCE instruction in the program order is globally visible before a newer CLFLUSH instruction that is newer than the MFENCE instruction in the program order is globally visible.

12. The processor of claim 8, wherein the MFENCE instruction comprises a macroinstruction.

13. The processor of claim 8, implemented in a computer system also including a graphics processor.

14. The processor of claim 8, wherein the MFENCE instruction does not use a data field thereof.

15. A processor comprising: instruction prefetch circuitry to prefetch a cache line of data responsive to a PREFETCH instruction and to store the cache line in a data cache; an instruction fetch unit to fetch a load fence (LFENCE) instruction that does not use a mask field thereof and a memory fence (MFENCE) instruction that does not use a mask field thereof; and a decoder to decode the LFENCE instruction and to decode the MFENCE instruction, wherein a portion of the processor is responsive to the LFENCE instruction to prevent newer memory load instructions occurring after the LFENCE instruction in program order from being globally visible before older memory load instructions occurring before the LFENCE instruction in the program order are globally visible, without causing the processor to stall dispatch of a newer memory store instruction occurring after the LFENCE instruction in the program order, and the portion of the processor responsive to the MFENCE instruction is to ensure that all older memory load instructions and all older memory store instructions, which are each older than the MFENCE instruction in the program order, are globally visible, before all newer memory load instructions and all newer memory store instructions, which are each newer than the MFENCE instruction in the program order, are globally visible, the portion of the processor responsive to the MFENCE instruction further to prevent a CLFLUSH instruction which follows the MFENCE instruction in program order from becoming globally visible until a PREFETCH instruction preceding the MFENCE instruction in program order has become globally visible.

16. The processor of claim 15, wherein the LFENCE instruction and the MFENCE instruction are each treated as a non-operation (NOP) after being dispatched.

17. The processor of claim 15, wherein the MFENCE instruction is to guarantee strong ordering with respect to a cache line flush instruction but the LFENCE instruction is not to guarantee strong ordering with respect to the cache line flush instruction.

18. The processor of claim 15, wherein the LFENCE instruction does not use a data field thereof.

19. The processor of claim 15, wherein the MFENCE instruction does not use a data field thereof.
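The difference between the two fences in the claims above can be illustrated with a small timing model. This is a sketch under stated assumptions, not the claimed hardware: the function `visibility_times`, the table `FENCED_BY`, and the latency numbers are hypothetical, every access is assumed to start as soon as no fence holds it back, and treating LFENCE as ordering loads only (leaving stores, PREFETCH, and CLFLUSH unordered) is a simplification of the claim language.

```python
ACCESS_KINDS = ("load", "store", "prefetch", "clflush")

# Which kinds of access each fence orders in this toy model (an assumption).
FENCED_BY = {
    "lfence": ("load",),      # loads only; newer stores are not stalled
    "mfence": ACCESS_KINDS,   # loads, stores, PREFETCH and CLFLUSH
}


def visibility_times(program):
    """Return {name: cycle} for when each op becomes globally visible.

    program: list of (name, kind, latency) tuples in program order, where
    kind is one of ACCESS_KINDS, "lfence" or "mfence". For a fence, the
    value is the cycle at which it is dispatched; its latency is ignored.
    """
    start = dict.fromkeys(ACCESS_KINDS, 0)  # earliest cycle a new access may begin
    done = dict.fromkeys(ACCESS_KINDS, 0)   # cycle by which all older accesses are visible
    times = {}
    for name, kind, latency in program:
        if kind in ACCESS_KINDS:
            finish = start[kind] + latency
            done[kind] = max(done[kind], finish)
            times[name] = finish
        elif kind in FENCED_BY:
            fenced = FENCED_BY[kind]
            # The fence waits until every older access it orders is visible.
            dispatch = max(max(done[k], start[k]) for k in fenced)
            # Newer accesses of the fenced kinds cannot begin before that.
            for k in fenced:
                start[k] = dispatch
            times[name] = dispatch
        else:
            raise ValueError(f"unknown kind: {kind}")
    return times


if __name__ == "__main__":
    body = [("ld1", "load", 5), ("st1", "store", 2)]
    tail = [("ld2", "load", 1), ("st2", "store", 1)]
    print(visibility_times(body + [("F", "lfence", 0)] + tail))
    print(visibility_times(body + [("F", "mfence", 0)] + tail))
```

With the LFENCE variant, the newer store `st2` becomes visible at cycle 1 while the newer load `ld2` waits until cycle 6. With the MFENCE variant both newer accesses wait for the fence to dispatch at cycle 5. A PREFETCH followed by MFENCE and then CLFLUSH is ordered the same way in this model.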

Assignees

Intel Corp

Inventors

Classifications

  • Physics · mapped topic

  • Prefetch instructions; cache control instructions · CPC title

  • Instruction analysis, e.g. decoding, instruction word fields · CPC title

  • LOAD or STORE instructions; Clear instruction · CPC title

  • G06F9/3836 · Primary

    Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9342310B2 cover?
A system and method for fencing memory accesses. Memory loads can be fenced, or all memory access can be fenced. The system receives a fencing instruction that separates memory access instructions into older accesses and newer accesses. A buffer within the memory ordering unit is allocated to the instruction. The access instructions newer than the fencing instruction are stalled. The older acce…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/30145. Mapped technology areas include Physics.
When was this patent published?
Publication date May 17, 2016 (B2). Legal status and post-grant events are not shown on this page.