Efficient store-forwarding with partitioned FIFO store-reorder queue in out-of-order processor

US10579387B2 · US · B2

Patent metadata
Publication number: US-10579387-B2
Application number: US-201715726575-A
Country: US
Kind code: B2
Filing date: Oct 6, 2017
Priority date: Oct 6, 2017
Publication date: Mar 3, 2020
Grant date: Mar 3, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Technical solutions are described for executing one or more out-of-order (OoO) instructions by a processing unit. The execution includes detecting, by a load-store unit (LSU), a load-hit-store (LHS) in an out-of-order execution of the instructions, the detecting based only on effective addresses. The detecting includes determining an effective address associated with an operand of a load instruction. The detecting further includes determining whether a store instruction entry using said effective address to store a data value is present in a store reorder queue, and indicating that an LHS has been detected based at least in part on determining that store instruction entry using said effective address is present in the store reorder queue. In response to detecting the LHS, a store forwarding is performed that includes forwarding data from the store instruction to the load instruction.
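
The detection and forwarding steps in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the names `StoreEntry`, `detect_lhs`, and `execute_load` are hypothetical, the address comparison is an exact match (real hardware would also consider byte overlap and operand size), the queue is assumed to hold stores in program order, and memory is modeled as a plain dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class StoreEntry:
    """One store reorder queue entry (field names are illustrative)."""
    effective_address: int
    data: int


def detect_lhs(srq: List[StoreEntry], load_ea: int) -> Optional[StoreEntry]:
    """Return the most recently queued store whose effective address
    equals the load's effective address, or None if there is no hit."""
    # Assumption: entries were appended in program order, so scanning
    # from the tail finds the youngest matching store first.
    for entry in reversed(srq):
        if entry.effective_address == load_ea:
            return entry
    return None


def execute_load(
    srq: List[StoreEntry], load_ea: int, memory: Dict[int, int]
) -> Tuple[int, bool]:
    """Return (value, forwarded). On a load-hit-store the value comes
    from the queued store; otherwise it is read from memory."""
    hit = detect_lhs(srq, load_ea)
    if hit is not None:
        return hit.data, True          # store forwarding
    return memory.get(load_ea, 0), False


srq = [StoreEntry(0x1000, 11), StoreEntry(0x2000, 22), StoreEntry(0x1000, 33)]
print(execute_load(srq, 0x1000, {0x1000: 5}))   # forwarded from the queue
print(execute_load(srq, 0x3000, {0x3000: 7}))   # no hit, read from memory
```

Note that the comparison uses only effective addresses, as the abstract states; no real (physical) address is consulted in this model.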

First claim

Opening claim text (preview).

What is claimed is:

1. A processing unit for executing one or more instructions, the processing unit comprising: a load-store unit (LSU) for transferring data between memory and registers, the LSU configured to execute instructions from an out-of-order (OoO) instructions window, the execution comprising: detecting a load-hit-store (LHS) in an out-of-order execution of the instructions, the detecting based only on effective addresses, the detecting comprising: determining an effective address associated with an operand of a load instruction; determining whether a store instruction entry using said effective address to store a data value is present in a store reorder queue; determining whether a thread identifier from the store instruction entry and the load instruction match each other, in response to the processing unit operating in simultaneous multi-threaded mode; and indicating that an LHS has been detected based at least in part on determining that store instruction entry using said effective address is present in the store reorder queue; and in response to detecting the LHS, performing a store forwarding comprising forwarding data from the store instruction to the load instruction.

2. The processing unit of claim 1, wherein the LSU is further configured to determine, in response to the LHS detection, that the store instruction is older than the load instruction in program order.

3. The processing unit of claim 1, wherein determining whether the store instruction entry using said effective address is present in the store reorder queue further comprises: determining whether an effective real translation table index for the store instruction entry and the load instruction match each other.

4. The processing unit of claim 1, wherein entries in the store reorder queue are added and executed in first-in-first-out (FIFO) order.

5. The processing unit of claim 1, wherein the store instruction entry comprises a thread identifier, an effective address, and an effective real translation table identifier associated with the store instruction issued by the LSU.

6. The processing unit of claim 1, wherein the store reorder queue comprises a number of partitions, one partition for each store instruction issued concurrently by the load-store unit.

7. A computer-implemented method for executing one or more out-of-order (OoO) instructions by a processing unit, the method comprising: detecting, by a load-store unit (LSU), a load-hit-store (LHS) in an out-of-order execution of the instructions, the detecting based only on effective addresses, the detection comprising: determining an effective address associated with an operand of a load instruction; determining whether a store instruction entry using said effective address to store a data value is present in a store reorder queue; determining whether a thread identifier from the store instruction entry and the load instruction match each other, in response to the processing unit operating in simultaneous multi-threaded mode; and indicating that an LHS has been detected based at least in part on determining that store instruction entry using said effective address is present in the store reorder queue; and in response to detecting the LHS, performing a store forwarding comprising forwarding data from the store instruction to the load instruction.

8. The computer-implemented method of claim 7, further comprising determining, in response to the LHS detection, that the store instruction is older than the load instruction in program order.

9. The computer-implemented method of claim 7, wherein determining whether the store instruction entry using said effective address is present in the store reorder queue further comprises: determining whether an effective real translation table index for the store instruction entry and the load instruction match each other.

10. The computer-implemented method of claim 7, wherein entries in the store reorder queue are added and executed in first-in-first-out (FIFO) order.

11. The computer-implemented method of claim 7, wherein the store instruction entry comprises a thread identifier, an effective address, and an effective real translation table identifier associated with the store instruction issued by the LSU.

12. The computer-implemented method of claim 7, wherein the store reorder queue comprises a number of partitions, one partition for each store instruction issued concurrently by the load-store unit.

13. A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform operations comprising: detecting, by a load-store unit (LSU), a load-hit-store (LHS) in an out-of-order execution of the instructions, the detecting based only on effective addresses by, the detecting comprising: determining an effective address associated with an operand of a load instruction; determining whether a store instruction entry using said effective address to store a data value is present in a store reorder queue; determining whether a thread identifier from the store instruction entry and the load instruction match each other, in response to the processor operating in simultaneous multi-threaded mode; and indicating that an LHS has been detected based at least in part on determining that store instruction entry using said effective address is present in the store reorder queue; and in response to detecting the LHS, performing a store forwarding comprising forwarding data from the store instruction to the load instruction.

14. The computer program product of claim 13, the operations further comprising determining, in response to the LHS detection, that the store instruction is older than the load instruction in program order.

15. The computer program product of claim 13, wherein determining whether the store instruction entry using said effective address is present in the store reorder queue further comprises: determining whether an effective real translation table index for the store instruction entry and the load instruction match each other.

16. The computer program product of claim 13, wherein the store instruction entry comprises a thread identifier, an effective address, and an effective real translation table identifier associated with the store instruction issued by the LSU.

17. The computer program product of claim 13, wherein the store reorder queue comprises a number of partitions, one partition for each store instruction issued concurrently by the load-store unit.
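
The additional conditions in the claims (thread identifier match in simultaneous multi-threaded mode, effective real translation table index match, store older than load, FIFO ordering, and one partition per concurrently issued store) can be combined in one software model. The following is a rough sketch under assumptions rather than the claimed hardware: `SrqEntry`, `PartitionedSrq`, the integer `age` tag used to express program order, the default of two partitions, and the policy of picking the youngest qualifying store are all hypothetical choices made for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class SrqEntry:
    thread_id: int
    effective_address: int
    ert_index: int   # effective real translation table index
    age: int         # hypothetical program-order tag; smaller means older
    data: int


class PartitionedSrq:
    """Store reorder queue split into FIFO partitions (illustrative)."""

    def __init__(self, num_partitions: int = 2) -> None:
        self.partitions: List[Deque[SrqEntry]] = [
            deque() for _ in range(num_partitions)
        ]

    def issue(self, partition: int, entry: SrqEntry) -> None:
        # FIFO: new entries are added at the tail of their partition.
        self.partitions[partition].append(entry)

    def drain(self, partition: int) -> SrqEntry:
        # FIFO: the oldest entry of the partition leaves first.
        return self.partitions[partition].popleft()

    def find_forwarding_store(
        self, thread_id: int, ea: int, ert_index: int,
        load_age: int, smt_mode: bool,
    ) -> Optional[SrqEntry]:
        best: Optional[SrqEntry] = None
        for partition in self.partitions:
            for entry in partition:
                if entry.effective_address != ea:
                    continue
                if entry.ert_index != ert_index:
                    continue
                # Thread identifiers are compared only in SMT mode.
                if smt_mode and entry.thread_id != thread_id:
                    continue
                # The store must be older than the load in program order.
                if entry.age >= load_age:
                    continue
                # Assumption: forward from the youngest qualifying store.
                if best is None or entry.age > best.age:
                    best = entry
        return best


srq = PartitionedSrq(num_partitions=2)
srq.issue(0, SrqEntry(thread_id=0, effective_address=0x40, ert_index=3, age=1, data=100))
srq.issue(1, SrqEntry(thread_id=1, effective_address=0x40, ert_index=3, age=2, data=200))
hit = srq.find_forwarding_store(thread_id=0, ea=0x40, ert_index=3, load_age=4, smt_mode=True)
print(hit.data if hit else None)
```

In this model a load from thread 0 ignores the matching store from thread 1 when `smt_mode` is true, which mirrors the thread identifier check recited in claims 1, 7, and 13.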

Assignees

IBM

Inventors

Classifications

  • Dependency mechanisms, e.g. register scoreboarding · CPC title

  • Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00) · CPC title

  • G06F9/3834 · Primary

    Maintaining memory consistency · CPC title

  • Pipeline control instructions, e.g. multicycle NOP · CPC title

  • G06F9/3836 · Primary

    Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10579387B2 cover?
Technical solutions are described for executing one or more out-of-order (OoO) instructions by a processing unit. The execution includes detecting, by a load-store unit (LSU), a load-hit-store (LHS) in an out-of-order execution of the instructions, the detecting based only on effective addresses. The detecting includes determining an effective address associated with an operand of a load instru…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F9/3834. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 3, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).