Execution of instruction loops using an instruction buffer

US9710276B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9710276-B2
Application number  US-201213673244-A
Country             US
Kind code           B2
Filing date         Nov 9, 2012
Priority date       Nov 9, 2012
Publication date    Jul 18, 2017
Grant date          Jul 18, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In a normal, non-loop mode a uOp buffer receives and stores for dispatch the uOps generated by a decode stage based on a received instruction sequence. In response to detecting a loop in the instruction sequence, the uOp buffer is placed into a loop mode whereby, after the uOps associated with the loop have been stored at the uOp buffer, storage of further uOps at the buffer is suspended. To execute the loop, the uOp buffer repeatedly dispatches the uOps associated with the loop's instructions until the end condition of the loop is met and the uOp buffer exits the loop mode.
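The behavior in the abstract can be sketched as a toy software model. This is only an illustration under stated assumptions, not the patented hardware design: the class `UopBuffer`, its methods, the string uOps, and the choice to clear the buffer on loop exit are all hypothetical names and simplifications that do not appear in the patent.

```python
class UopBuffer:
    """Toy model of a uOp buffer with a normal mode and a loop mode.

    All names here are hypothetical; the patent describes hardware,
    not this API.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []       # uOps stored for dispatch
        self.loop_mode = False

    def store(self, uop):
        """Accept a uOp from the decode stage.

        Storage is suspended while the buffer is in loop mode, and
        refused when the buffer is full. Returns True if stored.
        """
        if self.loop_mode or len(self.entries) >= self.capacity:
            return False
        self.entries.append(uop)
        return True

    def enter_loop_mode(self):
        """Called once all uOps of the detected loop have been stored."""
        self.loop_mode = True

    def run_loop(self, end_condition):
        """Repeatedly dispatch the stored uOps until the loop ends.

        end_condition(iteration) is a stand-in for the loop's real exit
        test. Returns the flat list of dispatched uOps.
        """
        dispatched = []
        iteration = 0
        while True:
            dispatched.extend(self.entries)   # one full pass over the loop body
            iteration += 1
            if end_condition(iteration):
                break
        # Assumption: leaving loop mode empties the buffer so that normal
        # storage from the decode stage can resume.
        self.loop_mode = False
        self.entries.clear()
        return dispatched
```

In this sketch the decode stage is never consulted inside `run_loop`, which mirrors the point of the abstract: once the loop body is resident in the buffer, earlier pipeline stages are not needed to keep the execution units fed.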

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: in response to detecting an instruction loop at an instruction pipeline of a processor: for a plurality of iterations of the instruction loop: providing micro-operations of the instruction loop from a first buffer at a first stage of the instruction pipeline to execution units of the instruction pipeline; determining, for each micro-operation, whether a subset of instruction bytes associated with a corresponding micro-operation is required to be provided directly from a second buffer to the first buffer for the execution units; separately providing the subset of the instruction bytes for instructions of the instruction loop from the second buffer to the first buffer, such that the subset of the instruction bytes are re-provided from the second buffer to the first buffer for each of the plurality of iterations of the instruction loop; and placing a second stage of the instruction pipeline in a reduced power state, the second stage providing the micro-operations of the instruction loop to the first buffer.

2. The method of claim 1, wherein placing the second stage of the instruction pipeline in the reduced power state comprises: in response to detecting the instruction loop at the instruction pipeline, suspending provision of micro-operations from a decode stage to the first buffer.

3. The method of claim 1, wherein placing the second stage of the instruction pipeline in the reduced power state comprises suppressing access to the second stage.

4. The method of claim 1, wherein the second stage is a fetch stage of the instruction pipeline.

5. The method of claim 1, wherein the second stage is a decode stage of the instruction pipeline.

6. The method of claim 1, wherein detecting the instruction loop at the instruction pipeline comprises: determining a first instruction is marked as indicating a start of the instruction loop; determining a second instruction is marked as an end of the instruction loop; and determining that a storage size of a set of micro-operations decoded from instructions between the first instruction and the second instruction in a program sequence are within a storage capacity of the first buffer.

7. The method of claim 6, wherein detecting the instruction loop at the instruction pipeline further comprises: determining a first instruction is marked as indicating a start of the instruction loop; determining a second instruction is marked as an end of the instruction loop; and determining that a storage size of a set of instruction bytes provided by a fetch stage of the instruction pipeline and determined from instructions between the first instruction and the second instruction in a program sequence are within a storage capacity of the second buffer.

8. The method of claim 1, wherein providing micro-operations of the instruction loop from the first buffer at a first stage of the instruction pipeline to execution units of the instruction pipeline comprises placing the first buffer into a first loop mode prior to placing the second buffer in a second loop mode wherein the second buffer provides instruction bytes to the first buffer.

9. The method of claim 1, wherein the first buffer comprises a first bank and a second bank, and wherein: providing the micro-operations from the first buffer comprises alternating provision of the micro-operations from the first bank and the second bank based on a set of pointers stored at a third buffer; and providing instruction bytes from the second buffer to the first buffer comprises: alternating the pointers stored at the third buffer based on a number of iterations of the loop.

10. A method comprising: providing instruction bytes from a first buffer to a decode stage of an instruction pipeline of a processor, the decode stage providing micro-operations to a second buffer for execution; and in response to detecting that an instruction loop is to be executed at the instruction pipeline, suspending provision of instruction bytes from the first buffer to the decode stage; and directly providing a subset of the instruction bytes from the first buffer to the second buffer for execution with a micro-operation corresponding to the subset.

11. The method of claim 10, wherein a subset of the instruction bytes includes address displacement information.

12. The method of claim 10, wherein a subset of the instruction bytes includes immediate operand information.

13. The method of claim 10, wherein the first buffer receives the instruction bytes from a fetch stage of the instruction pipeline.

14. The method of claim 10, further comprising resuming provision of instruction bytes from the first buffer to the decode stage in response to detecting an end condition of the instruction loop.

15. A processor comprising: an instruction pipeline comprising: a loop detector that detects an instruction loop to be executed at the instruction pipeline; a fetch stage; a decode stage; an execution stage; a dispatch stage including a first buffer, the dispatch stage receiving micro-operations from the decode stage and storing the micro-operations at the buffer, and providing, for each of a plurality of iterations of the instruction loop, the stored micro-operations from the first buffer to the execution stage in response to the loop detector detecting the instruction loop; a second buffer that: receives instruction bytes from the fetch stage; provides instruction bytes to the decode stage when the loop detector indicates a loop is not being executed; in response to the loop detector detecting the instruction loop: suspends provision of a first subset of the instruction bytes to the decode stage; determines, for each micro-operation, whether a subset of the instruction bytes associated with a micro-operation is required to be provided directly from a second buffer to the first buffer for the execution stage; and provides a second subset of the instruction bytes to the first buffer, such that the second subset is re-provided from the second buffer to the first buffer for each iteration of the instruction loop; and a power control module that places a selected stage of the instruction pipeline in a reduced power state in response to the loop detector detecting the instruction loop.

16. The processor of claim 15, wherein the dispatch stage, in response to the loop detector detecting the instruction loop, suspends retrieval of micro-operations from the decode stage.

17. The processor of claim 15, wherein the power control module places the selected stage of the instruction pipeline in the reduced power state by suppressing access to the selected stage.

18. The processor of claim 15, wherein the selected stage is the fetch stage.

19. The processor of claim 15, wherein the selected stage is the decode stage.

20. The processor of claim 15, wherein the dispatch stage repeatedly provides the micro-operations from the first buffer in response to detecting that all of the micro-operations of the instruction loop are within a storage capacity of the first buffer.

21. The processor of claim 15, wherein the first buffer provides the stored micro-operations while in a first loop mode, and wherein the second buffer suspends provision of the first subset of instruction bytes to the decode stage while in a second loop mode, the second buffer to enter the second loop mode after the first buffer has entered the first loop mode.

22. The processor of claim 15, wher
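The loop-detection test recited in claims 6 and 7 (a marked start, a marked end, and a body that fits both buffers) can be illustrated with a short sketch. It is a minimal model under assumptions, not the claimed implementation: the `Instr` record, its fields, the function `detect_loop`, and the decision to count the start and end instructions as part of the loop body are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    """Hypothetical decoded-instruction record used only for this sketch."""
    uops: int                 # micro-operations this instruction decodes into
    nbytes: int               # instruction bytes it occupies in the byte buffer
    loop_start: bool = False  # marked as the start of a loop
    loop_end: bool = False    # marked as the end of a loop


def detect_loop(instrs, uop_capacity, byte_capacity):
    """Return (start, end) indices of a loop that fits both buffers, else None.

    Assumption: the marked start and end instructions are counted as part
    of the loop body when sizes are summed.
    """
    # Find the first instruction marked as a loop start.
    start = next((i for i, ins in enumerate(instrs) if ins.loop_start), None)
    if start is None:
        return None
    # Find the first instruction at or after the start marked as a loop end.
    end = next((i for i in range(start, len(instrs)) if instrs[i].loop_end), None)
    if end is None:
        return None
    body = instrs[start:end + 1]
    # Claim 6: the loop's micro-operations must fit the micro-operation buffer.
    if sum(ins.uops for ins in body) > uop_capacity:
        return None
    # Claim 7: the loop's instruction bytes must fit the instruction byte buffer.
    if sum(ins.nbytes for ins in body) > byte_capacity:
        return None
    return (start, end)
```

A loop that fails either capacity check returns `None` in this sketch, which corresponds to the pipeline staying in its normal mode with the fetch and decode stages active.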

Assignees

Advanced Micro Devices Inc

Inventors

Classifications

  • for loops, e.g. loop detection or loop counter

  • G06F9/381 (Primary): Loop buffering

  • Instruction analysis, e.g. decoding, instruction word fields

  • Instruction prefetching

  • for instruction reuse, e.g. trace cache, branch target cache

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9710276B2 cover?
In a normal, non-loop mode a uOp buffer receives and stores for dispatch the uOps generated by a decode stage based on a received instruction sequence. In response to detecting a loop in the instruction sequence, the uOp buffer is placed into a loop mode whereby, after the uOps associated with the loop have been stored at the uOp buffer, storage of further uOps at the buffer is suspended. To ex…
Who is the assignee on this patent?
Advanced Micro Devices Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/381. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 18, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).