Efficient branch predictor history recovery in pipelined computer architectures employing branch prediction and branch delay slots of variable size

US9430245B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9430245-B2
Application numberUS-201414228436-A
CountryUS
Kind codeB2
Filing dateMar 28, 2014
Priority dateMar 28, 2014
Publication dateAug 30, 2016
Grant dateAug 30, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A pipelined processor employs branch prediction based on branch source instructions situated in a last one of a variable number of branch delay slots associated with a branch instruction. As each memory unit of data (UoD) progresses through the processing stages from one cycle to a next, a set of N branch prediction histories is built that is associated with the UoD, N being the number of cycles the branch predictor requires to produce an output. A history of evaluated branch outcomes of branch instructions that reached the branch evaluation stage is also maintained. Recovery from misprediction includes using the history of evaluated branch outcomes and the set of branch prediction histories associated with the UoD (readily available to a last stage of the pipeline) as a source of recovery histories for inputting to a branch predictor when N next memory units of data are loaded into the pipeline.

First claim

Opening claim text (preview).

What is claimed is: 1. A method of controlling a pipelined processor having a plurality of serially connected processing stages including a first stage and a branch evaluation stage, wherein one of the processing stages other than the branch evaluation stage is a prediction start stage that supplies information for making branch predictions, wherein the pipelined processor is operatively coupled to a memory that comprises a plurality of addressable storage locations, each storage location being for storing one unit of data, wherein the pipelined processor executes instructions from an instruction set that includes a branch instruction and each branch instruction is associated with a set of branch delay slots whose size can be greater than or equal to zero and whose size can be different from one instance of a branch instruction to another, the method comprising: in a first processing cycle, supplying a first branch prediction history as input to a branch predictor that requires N cycles to produce an output; in an N th processing cycle after the first processing cycle, loading a memory unit of data into the first stage of the pipelined processor, wherein the memory unit of data includes a branch instruction; as the memory unit of data progresses through the processing stages from one cycle to a next, building a set of N branch prediction histories that are associated with the memory unit of data, wherein the N branch prediction histories comprise the first branch prediction history and N−1 branch prediction histories used to predict N−1 subsequently fetched memory units of data; maintaining a history of evaluated branch outcomes of branch instructions that reached the branch evaluation stage; detecting, based on a present evaluation of the branch instruction included in the memory unit of data and an earlier predicted outcome, whether a branch misprediction occurred; and in response to a detection that the branch misprediction occurred: selecting values of each one of N recovery histories from the history of evaluated branch outcomes and the set of branch prediction histories; and supplying respective ones of the N selected recovery histories as input to the branch predictor when each one of N next memory units of data are loaded into the prediction start stage of the pipeline, wherein a last one of the branch delay slots is a branch source of the associated branch instruction. 2. The method of claim 1 , wherein selecting values of each one of the N recovery histories from the history of evaluated branch outcomes and the set of branch prediction histories comprises: computing a distance that represents how many memory unit of data boundaries exist between the branch instruction and the branch source of the branch instruction; and selecting values of the N recovery histories from the history of evaluated branch outcomes and the set of branch prediction histories that are associated with the memory unit of data as a function of a set of parameters, wherein the set of parameters comprises the computed distance. 3. The method of claim 2 , wherein the set of parameters comprises a detection of whether the branch source of the branch instruction is a last instruction in the memory unit of data. 4. The method of claim 2 , wherein the set of parameters comprises a detection of whether the present evaluation of the branch instruction is a branch taken outcome. 5. The method of claim 1 , wherein N equals 2. 6. An apparatus for controlling a pipelined processor having a plurality of serially connected processing stages including a first stage and a branch evaluation stage, wherein one of the processing stages other than the branch evaluation stage is a prediction start stage that supplies information for making branch predictions, wherein the pipelined processor is operatively coupled to a memory that comprises a plurality of addressable storage locations, each storage location being for storing one unit of data, wherein the pipelined processor executes instructions from an instruction set that includes a branch instruction and each branch instruction is associated with a set of branch delay slots whose size can be greater than or equal to zero and whose size can be different from one instance of a branch instruction to another, the apparatus comprising: a controller configured to cause the pipelined processor to perform: in a first processing cycle, supplying a first branch prediction history as input to a branch predictor that requires N cycles to produce an output; in an N th processing cycle after the first processing cycle, loading a memory unit of data into the first stage of the pipelined processor, wherein the memory unit of data includes a branch instruction; as the memory unit of data progresses through the processing stages from one cycle to a next, building a set of N branch prediction histories that are associated with the memory unit of data, wherein the N branch prediction histories comprise the first branch prediction history and N−1 branch prediction histories used to predict N−1 subsequently fetched memory units of data; maintaining a history of evaluated branch outcomes of branch instructions that reached the branch evaluation stage; detecting, based on a present evaluation of the branch instruction included in the memory unit of data and an earlier predicted outcome, whether a branch misprediction occurred; and in response to a detection that the branch misprediction occurred: selecting values of each one of N recovery histories from the history of evaluated branch outcomes and the set of branch prediction histories; and supplying respective ones of the N selected recovery histories as input to the branch predictor when each one of N next memory units of data are loaded into the prediction start stage of the pipeline, wherein a last one of the branch delay slots is a branch source of the associated branch instruction. 7. The apparatus of claim 6 , wherein selecting values of each one of the N recovery histories from the history of evaluated branch outcomes and the set of branch prediction histories comprises: computing a distance that represents how many memory unit of data boundaries exist between the branch instruction and the branch source of the branch instruction; and selecting values of the N recovery histories from the history of evaluated branch outcomes and the set of branch prediction histories that are associated with the memory unit of data as a function of a set of parameters, wherein the set of parameters comprises the computed distance. 8. The apparatus of claim 7 , wherein the set of parameters comprises a detection of whether the branch source of the branch instruction is a last instruction in the memory unit of data. 9. The apparatus of claim 7 , wherein the set of parameters comprises a detection of whether the present evaluation of the branch instruction is a branch taken outcome. 10. The apparatus of claim 6 , wherein N equals 2. 11. A pipelined processor comprising: a plurality of serially connected processing stages including a first stage and a branch evaluation stage, wherein one of the processing stages other than the branch evaluation stage is a prediction start stage that supplies information for making branch predictions, wherein the pipelined processor executes instructions from an instruction set that includes a branch instruction and each branch instruction is associated with a set of branch delay slots whose size can be greater than or equal to zero and whose size can be different from one instance of a branch instruction to another, wherein a last one of the branch delay slots is a branch source of the associated branch instruction;

Assignees

Inventors

Classifications

  • G06F9/3844Primary

    using dynamic branch prediction, e.g. using branch history tables · CPC title

  • using instruction pipelines · CPC title

  • G06F9/3861Primary

    Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00) · CPC title

  • using multiple copies of the architectural state, e.g. shadow registers · CPC title

  • Conditional branch instructions · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9430245B2 cover?
A pipelined processor employs branch prediction based on branch source instructions situated in a last one of a variable number of branch delay slots associated with a branch instruction. As each memory unit of data (UoD) progresses through the processing stages from one cycle to a next, a set of N branch prediction histories is built that is associated with the UoD, N being the number of cycle…
Who is the assignee on this patent?
Ericsson Telefon Ab L M, ERICSSON TELEFON AB L M (publ)
What technology area does this patent fall under?
Primary CPC classification G06F9/3844. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Aug 30 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).