System and method for load and store queue allocations at address generation time

US11086628B2 · US · B2

Patent metadata
Publication number: US-11086628-B2
Application number: US-201615236882-A
Country: US
Kind code: B2
Filing date: Aug 15, 2016
Priority date: Aug 15, 2016
Publication date: Aug 10, 2021
Grant date: Aug 10, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method for load queue (LDQ) and store queue (STQ) entry allocations at address generation time that maintains age-order of instructions is described. In particular, writing LDQ and STQ entries are delayed until address generation time. This allows the load and store operations to dispatch, and younger operations (which may not be store and load operations) to also dispatch and execute their instructions. The address generation of the load or store operation is held at an address generation scheduler queue (AGSQ) until a load or store queue entry is available for the operation. The tracking of load queue entries or store queue entries is effectively being done in the AGSQ instead of at the decode engine. The LDQ and STQ depth is not visible from a decode engine's perspective, and increases the effective processing and queue depth.
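The mechanism in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the names `MicroOp`, `ToyAGSQ`, `issue_ready`, and `commit_oldest` are hypothetical, only a store queue is modelled, stores are assumed to commit strictly in program order, and "address generation" is reduced to writing the operation into its queue slot.

```python
from dataclasses import dataclass


@dataclass
class MicroOp:
    name: str
    tag: int                  # program-order STQ tag handed out at dispatch
    sources_ready: bool = True


class ToyAGSQ:
    """Toy model: stores wait in the scheduler queue, not in the STQ."""

    def __init__(self, stq_depth: int):
        self.stq_depth = stq_depth
        self.next_tag = 0            # next program-order tag to assign
        self.oldest_uncommitted = 0  # tag of the oldest store not yet committed
        self.waiting = []            # ops held in the scheduler queue, oldest first
        self.stq = {}                # physical slot -> op name

    def dispatch(self, name: str, sources_ready: bool = True) -> MicroOp:
        # Dispatch never stalls on STQ occupancy; it only assigns a tag.
        op = MicroOp(name, self.next_tag, sources_ready)
        self.next_tag += 1
        self.waiting.append(op)
        return op

    def entry_available(self, op: MicroOp) -> bool:
        # The slot is free once every store that used it earlier has committed.
        return op.tag - self.oldest_uncommitted < self.stq_depth

    def issue_ready(self) -> list:
        # Issue, oldest first, every op whose slot is free and sources are ready.
        issued = []
        for op in list(self.waiting):
            if op.sources_ready and self.entry_available(op):
                self.stq[op.tag % self.stq_depth] = op.name  # write STQ at AGEN time
                self.waiting.remove(op)
                issued.append(op.name)
        return issued

    def commit_oldest(self) -> None:
        # Assumes the oldest store has already issued; frees its slot.
        self.stq.pop(self.oldest_uncommitted % self.stq_depth, None)
        self.oldest_uncommitted += 1
```

With a two-entry store queue, dispatching three stores succeeds immediately; the third simply stays in `waiting` until `commit_oldest()` frees its slot, at which point the next call to `issue_ready()` writes it into the queue.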

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing micro-operations, the method comprising: fetching a plurality of micro-operations, wherein the plurality of micro-operations includes a first micro-operation; in response to a queue entry in a load/store queue not being available for the first micro-operation, dispatching the first micro-operation to an age-ordered scheduler queue without dispatching the first micro-operation to the load/store queue, wherein the age-ordered scheduler queue holds a dispatch payload associated with the first micro-operation; and in response to a queue entry in the load/store queue identified by the first micro-operation becoming available and source information required by the first micro-operation becoming ready: performing address generation for the first micro-operation; reading the dispatch payload including the first micro-operation; and sending the dispatch payload including data to be stored as specified by the first micro-operation to the available queue entry identified by the first micro-operation in the load/store queue.

2. The method of claim 1, further comprising: associating, at dispatch time, each micro-operation with a queue entry in the load/store queue in program order to maintain age-order.

3. The method of claim 1, further comprising: updating an oldest uncommitted micro-operation queue entry number based on input from the load/store queue.

4. The method of claim 3, further comprising: comparing the oldest uncommitted micro-operation queue entry to one or more associated load/store queue entries for each of the micro-operations to determine if the micro-operation's queue entry is available.

5. The method of claim 3, wherein the micro-operation is a store micro-operation.

6. The method of claim 1, wherein a dispatch window size is based on a depth of the load/store queue.

7. The method of claim 1, wherein different epochs of the load/store queue are tracked to enable dispatching more than a depth of the load/store queue, wherein each epoch represents a cycle of the load/store queue.

8. The method of claim 1, further comprising: fetching a plurality of micro-operations, wherein the plurality of micro-operations includes a second micro-operation; in response to a queue entry in the load/store queue being available for the second micro-operation: performing address generation for the second micro-operation; reading the dispatch payload including the second micro-operation; and sending the dispatch payload including data to be stored as specified by the second micro-operation to the available queue entry identified by the second micro-operation in the load/store queue.

9. A processor for processing micro-operations, comprising: a load/store queue; an age-ordered scheduler queue; an address generation scheduler; and a decoder, wherein: the decoder is configured to fetch a plurality of micro-operations, wherein the plurality of micro-operations includes a first micro-operation; in response to a queue entry in a load/store queue not being available for the first micro-operation, the decoder is configured to dispatch the first micro-operation to the age-ordered scheduler queue without dispatching the first micro-operation to the load/store queue; the age-ordered scheduler queue configured to hold a dispatch payload associated with the first micro-operation; and the address generation scheduler is configured to determine a queue entry in the load/store queue identified by the first micro-operation becoming available and source information required by the first micro-operation becoming ready, in response to the address generation scheduler determining that the queue entry in the load/store queue identified by the first micro-operation becoming available and the source information required by the first micro-operation becoming ready, the address generation scheduler is further configured to: perform address generation for the first micro-operation, read the dispatch payload including the first micro-operation, and send the dispatch payload including data to be stored as specified by the first micro-operation to the available queue entry identified by the first micro-operation in the load/store queue.

10. The processor of claim 9, wherein the decoder is further configured to associate each micro-operation with a queue entry in the load/store queue in program order to maintain age-order of the micro-operations.

11. The processor of claim 10, wherein the load/store queue is configured to notify the age-ordered scheduler queue that the associated queue entry is available.

12. The processor of claim 9, wherein the scheduler is configured to: update an oldest uncommitted micro-operation queue entry number based on input from the load/store queue; and compare an oldest uncommitted micro-operation queue entry to one or more associated load/store queue entries for each of the micro-operations to determine if the micro-operation's queue entry is available.

13. The processor of claim 9, wherein the micro-operation is a store micro-operation.

14. The processor of claim 9, wherein a dispatch window size is based on at least one of a depth of the load/store queue and a depth of the age-ordered scheduler queue.

15. The processor of claim 9, wherein different epochs of the load/store queue are tracked to enable dispatching more than a depth of the load/store queue, each epoch represents a cycle of the load/store queue.

16. The processor of claim 10: wherein the decoder is configured to fetch a plurality of micro-operations, wherein the plurality of micro-operations includes a second micro-operation; in response to the address generation scheduler determining that the queue entry in the load/store queue identified by the second micro-operation becoming available and the source information required by the second micro-operation becoming ready, the address generation scheduler is further configured to: perform address generation for the second micro-operation, read the dispatch payload including the second micro-operation, and send the dispatch payload including data to be stored as specified by the second micro-operation to the available queue entry identified by the second micro-operation in the load/store queue.

17. A method for processing micro-operations, the method comprising: in response to a queue entry in a load/store queue not being available for a first micro-operation, dispatching the first micro-operation to a scheduler queue for holding a dispatch payload associated with a plurality of micro-operations without dispatching the first micro-operation to the load/store queue, wherein the plurality of micro-operations include a first micro-operation; storing the dispatch payload in the scheduler queue; determining whether a queue entry in the load/store queue identified by the first micro-operation is available and whether source information required by the first micro-operation is ready; and in response to the queue entry in the load/store queue becoming available and source information required by the first micro-operation becoming ready: performing address generation for the first micro-operation; reading the dispatch payload including the first micro-operation; and sending the dispatch payload including data to be stored as specified by the first micro-operation to the available queue entry identified by the first micro-operation in the load/store queue.

18. The method of claim 17, further comprising: assigning a queue entry in program order to each m
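The claims mention tracking queue epochs and comparing each micro-operation's queue entry against the oldest uncommitted entry. One way such a comparison could work is sketched below. The `QueueTag` encoding, `make_tag`, and `entry_available` are assumptions made for illustration and do not come from the patent text; in particular, the epoch is modelled as an unbounded integer, whereas hardware would use a small number of wrap bits.

```python
from typing import NamedTuple


class QueueTag(NamedTuple):
    epoch: int   # how many times the queue has wrapped (hypothetical encoding)
    index: int   # physical entry number, 0 <= index < depth


def make_tag(seq: int, depth: int) -> QueueTag:
    """Map a program-order sequence number onto (epoch, index)."""
    return QueueTag(seq // depth, seq % depth)


def entry_available(op: QueueTag, oldest: QueueTag) -> bool:
    """True if op's entry is not still owned by an older, uncommitted op.

    Assumes op is not older than the oldest uncommitted micro-operation.
    """
    if op.epoch == oldest.epoch:
        # Same pass over the queue: entries at or after the oldest are usable.
        return op.index >= oldest.index
    if op.epoch == oldest.epoch + 1:
        # Next pass: only entries the oldest op has already moved beyond.
        return op.index < oldest.index
    # Two or more passes ahead: the entry is still owned by an earlier epoch.
    return False
```

For a queue of depth 4 whose oldest uncommitted operation has sequence number 2, the operations numbered 3 and 5 may proceed, while number 6 maps to the same physical entry as the oldest operation and has to wait in the scheduler queue.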

Assignees

Advanced Micro Devices Inc

Inventors

Classifications

  • G06F9/3824 (Primary)

    Operand accessing · CPC title

  • Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title

  • Maintaining memory consistency · CPC title

  • G06F9/3836 (Primary)

    Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title

  • G06F9/3802 (Primary)

    Instruction prefetching · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11086628B2 cover?
A system and method for load queue (LDQ) and store queue (STQ) entry allocations at address generation time that maintains age-order of instructions is described. In particular, writing LDQ and STQ entries are delayed until address generation time. This allows the load and store operations to dispatch, and younger operations (which may not be store and load operations) to also dispatch and exec…
Who is the assignee on this patent?
Advanced Micro Devices Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/3824. Mapped technology areas include Physics.
When was this patent published?
This patent (kind code B2) was published on Aug 10, 2021. Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).