High performance zero bubble conditional branch prediction using micro branch target buffer

US10402200B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10402200-B2
Application numberUS-201615047617-A
CountryUS
Kind codeB2
Filing dateFeb 18, 2016
Priority dateJun 26, 2015
Publication dateSep 3, 2019
Grant dateSep 3, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments include a micro BTB, which can predict up to two branches per cycle, every cycle, with zero bubble insertion on either a taken or not taken prediction, thereby significantly improving performance and reducing power consumption of a microprocessor. A front end of a microprocessor can include a main front end logic section having a main BTB, a micro BTB to produce prediction information, and a decoupling queue. The micro BTB can include a graph having multiple entries, and a CAM having multiple items. Each of the entries of the graph can include a link pointer to a next branch in a taken direction, and a link pointer to a next branch in a not-taken direction. The micro BTB can insert a hot branch into the graph as a new seed.

First claim

Opening claim text (preview).

What is claimed is: 1. A front end of a microprocessor, comprising: a main front end logic section including a main branch target buffer (BTB); a micro BTB separate from the main BTB of the main front end logic section, and configured to produce prediction information; and a decoupling queue coupled to the micro BTB and to the main front end logic section, and configured to receive and queue the prediction information, and to provide the queued prediction information to the main front end logic section, wherein the micro BTB comprises: a graph including one or more entries; and a content addressable memory (CAM) including one or more items; wherein each of the one or more items of the CAM of the micro BTB includes a weight that is configured to indicate whether a branch in a given item of the CAM is to be inserted into the graph of the micro BTB as a new seed. 2. The front end of the microprocessor of claim 1 , wherein: the main front end logic section is configured to fetch a program; and each of the one or more entries of the graph of the micro BTB represents a corresponding branch inside an inner portion of the program. 3. The front end of the microprocessor of claim 2 , wherein: each of the one or more entries of the graph of the micro BTB includes a first link pointer to a first next branch in a taken direction, and a second link pointer to a second next branch in a not-taken direction. 4. The front end of the microprocessor of claim 3 , wherein: each of the one or more entries of the graph of the micro BTB includes a first valid bit associated with the first link pointer to the first next branch in the taken direction, and a second valid bit associated with the second link pointer to the second next branch in the not-taken direction; and the graph of the micro BTB is configured to set the first valid bit when the first link pointer is valid, and to set the second valid bit when the second link pointer is valid. 5. The front end of the microprocessor of claim 4 , wherein: each of the one or more entries of the graph of the micro BTB includes a next prediction bit configured to indicate whether to follow the first link pointer to the first next branch to be predicted, or the second link pointer to the second next branch to be predicted. 6. The front end of the microprocessor of claim 2 , wherein: each of the one or more entries of the graph of the micro BTB includes a pair bit configured to indicate that two branches should be predicted in parallel. 7. The front end of the microprocessor of claim 1 , wherein: each of the one or more items of the CAM of the micro BTB includes a status bit that is configured to indicate whether the branch in the given item of the CAM is already present or not in the graph of the micro BTB. 8. The front end of the microprocessor of claim 7 , wherein: the micro BTB is configured to insert the branch in the given item of the CAM into the graph as the new seed when the status bit indicates that the branch in the given item of the CAM is not already present in the graph; and the micro BTB is configured to not insert the branch in the given item of the CAM into the graph as the new seed when the status bit indicates that the branch in the given item of the CAM is already present in the graph. 9. The front end of the microprocessor of claim 1 , wherein: each of the one or more items of the CAM of the micro BTB includes a valid bit that is configured to indicate whether a given item of the CAM has been allocated into the graph of the micro BTB. 10. The front end of the microprocessor of claim 1 , wherein: the main front end logic section includes a main predictor; and the micro BTB comprises a conditional branch predictor. 11. The front end of the microprocessor of claim 10 , wherein the conditional branch predictor of the micro BTB comprises: a static prediction state in which branches that have always resolved as taken are in an always taken state and are predicted as taken until either the main predictor or an execution unit redirect a prediction of the micro BTB to a not taken state. 12. The front end of the microprocessor of claim 10 , wherein the conditional branch predictor of the micro BTB comprises: a highly biased conditional branch prediction state in which branches that exhibit dynamic behavior, but exhibit long runs of taken or not-taken branches, are classified as either mostly taken or mostly not taken branches. 13. The front end of the microprocessor of claim 10 , wherein the conditional branch predictor of the micro BTB comprises: a loop conditional branch prediction state in which branches that exhibit dynamic behavior, but have a repeating sequence of taken outcomes having a number that is less than or equal to a threshold followed by a single not taken outcome, are classified as loops and are predicted by a loop predictor. 14. The front end of the microprocessor of claim 10 , wherein the conditional branch predictor of the micro BTB comprises: an anti-loop conditional branch prediction state in which branches that exhibit dynamic behavior, but have a repeating sequence of not taken outcomes having a number that is less than or equal to a threshold followed by a single taken outcome, are classified as anti-loops and are predicted by an anti-loop predictor. 15. A computer-implemented method for performing zero bubble conditional branch prediction for a main front end logic section of a microprocessor using a micro branch target buffer (BTB), the method comprising: producing, by the micro BTB, prediction information that is separate from prediction information produced by a main BTB of the main front end logic section of the microprocessor; receiving, by a decoupling queue, the prediction information from the micro BTB; queuing, by the decoupling queue, the prediction information from the micro BTB; and providing, by the decoupling queue, the queued prediction information to the main front end logic section of the microprocessor; wherein the micro BTB includes a graph and a content addressable memory (CAM), the method further comprising: fetching, by the main front end logic section of the microprocessor, a program; representing, by one or more entries of the graph of the micro BTB, a corresponding branch inside an inner portion of the program; including, in each of the one or more entries of the graph of the micro BTB, a first link pointer to a first next branch in a taken direction, and a second link pointer to a second next branch in a not-taken direction; and including, in one or more items of the CAM of the micro BTB, a weight indicating whether a branch in a given entry of the CAM is to be inserted into the graph of the micro BTB as a new seed. 16. The computer-implemented method of claim 15 , further comprising: inserting, by the micro BTB, the branch in the given entry of the CAM into the graph as the new seed when a status bit indicates that the branch in the given entry of the CAM is not already present in the graph; and not inserting, by the micro BTB, the branch in the given entry of the CAM into the graph as the new seed when the status bit indicates that the branch in the given entry of the CAM is already present in the graph. 17. A front end of a microprocessor, comprising: a main front end logic section including a main branch target buffer (BTB); a micro BTB separate from the main BTB of the main front end logic section, configured to produce prediction information, and comprising a graph in which each graph node represents a branch, with edges that connect t

Assignees

Inventors

Classifications

  • G06F9/3806Primary

    using address prediction, e.g. return stack, branch history buffer · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10402200B2 cover?
Embodiments include a micro BTB, which can predict up to two branches per cycle, every cycle, with zero bubble insertion on either a taken or not taken prediction, thereby significantly improving performance and reducing power consumption of a microprocessor. A front end of a microprocessor can include a main front end logic section having a main BTB, a micro BTB to produce prediction informati…
Who is the assignee on this patent?
Dundas James David, Zuraski Jr Gerald David, Snyder Timothy Russell, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F9/3806. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Sep 03 2019 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).