Who is the assignee on this patent?

Shanghai Zhaoxin Semiconductor Co Ltd

What technology area does this patent fall under?

Primary CPC classification G06F9/3802. Mapped technology areas include Physics.

When was this patent published?

Publication date Aug 2, 2022 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Microprocessor with multi-step ahead branch predictor and having a fetch-target queue between the branch predictor and instruction cache

US11403103B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11403103-B2
Application number	US-202017069217-A
Country	US
Kind code	B2
Filing date	Oct 13, 2020
Priority date	Apr 14, 2020
Publication date	Aug 2, 2022
Grant date	Aug 2, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A microprocessor is shown, in which a branch predictor and an instruction cache are decoupled by a fetch-target queue (FTQ). The branch predictor performs branch prediction for N instruction addresses in parallel in the same cycle, wherein N is an integer greater than 1. In the current cycle, the branch predictor finishes branch prediction for N instruction addresses in parallel and, among the N instruction addresses with finished branch prediction, those that are not bypassed and do not overlap previously-predicted instruction addresses are pushed into the fetch-target queue, to be read out later as an instruction-fetching address for the instruction cache. The previously-predicted instruction addresses are pushed into the fetch-target queue in a previous cycle.
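The push rule in the abstract can be sketched in a few lines of Python. This is only an illustration of the filtering step, not the patented hardware: the function name `push_predicted`, the use of a `deque` as the fetch-target queue, and the representation of "bypassed" and "previously pushed" addresses as sets supplied by the caller are all assumptions made for the example.

```python
from collections import deque


def push_predicted(ftq, pc, m, n, bypassed, previously_pushed):
    """Push the addresses pc, pc+m, ..., pc+(n-1)*m that pass both filters.

    Hypothetical model: `bypassed` and `previously_pushed` are sets of
    addresses decided elsewhere; the abstract does not say how they are built.
    """
    pushed = []
    for i in range(n):
        addr = pc + i * m
        # Skip addresses that were bypassed or that overlap an earlier push.
        if addr in bypassed or addr in previously_pushed:
            continue
        ftq.append(addr)  # read out later as an instruction-fetching address
        pushed.append(addr)
    return pushed


ftq = deque()
new_entries = push_predicted(
    ftq, pc=0x100, m=16, n=3,
    bypassed={0x120},            # assumed: third address is bypassed
    previously_pushed={0x100},   # assumed: first address overlaps last cycle
)
```

With these assumed inputs only the middle address, `0x110`, enters the queue.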

First claim

Opening claim text (preview).

What is claimed is:

1. A microprocessor, comprising:
an instruction cache, operated according to an instruction-fetching address for instruction fetching;
a branch predictor, performing branch prediction for N instruction addresses in parallel in the same time, wherein N is an integer greater than 1; and
a fetch-target queue, coupled between the branch predictor and the instruction cache, wherein:
in a current cycle, the branch predictor finishes branch prediction for the N instruction addresses in parallel and, among the N instruction addresses with finished branch prediction, those that are not bypassed and which do not overlap previously-predicted instruction addresses are pushed into the fetch-target queue, to be read out later as the instruction-fetching address for the instruction cache;
the previously-predicted instruction addresses are pushed into the fetch-target queue in a previous cycle;
in the current cycle, when the branch predictor predicts that in N chunks indicated by the N instruction addresses with finished branch prediction, a branch is predicted to be taken, and the taken branch is called by a branch instruction across two adjacent chunks, an instruction address indicating a second chunk next to a first chunk corresponding to the branch instruction calling the taken branch is pushed into the fetch-target queue, to be read out later as the instruction-fetching address for the instruction cache;
N is 3;
each chunk indicated by each instruction address is M bytes, M is a number;
in the current cycle, the branch predictor finishes branch prediction for instruction addresses PC, PC+M, and PC+2*M;
in a first setting, there is one overlapping instruction address between branch prediction finished in the current cycle and branch prediction finished in the previous cycle;
in the first setting, when the instruction address PC is not bypassed and has not been pushed into the fetch-target queue in the previous cycle:
the fetch-target queue provides a first entry to store the instruction address PC;
the fetch-target queue provides a second entry to store the instruction address PC+M when no branch is predicted to be taken in a chunk indicated by the instruction address PC, or when a branch is predicted to be taken in the chunk indicated by the instruction address PC and the taken branch is called by a branch instruction across two adjacent chunks;
the fetch-target queue provides a third entry to store the instruction address PC+2*M when no branch is predicted to be taken in two chunks indicated by the instruction addresses PC and PC+M, or when no branch is predicted to be taken in the chunk indicated by the instruction address PC, a branch is predicted to be taken in the chunk indicated by the instruction address PC+M, and the taken branch is called by a branch instruction across two adjacent chunks; and
the fetch-target queue provides a fourth entry to store an instruction address PC+3*M when no branch is predicted to be taken in the two chunks indicated by the instruction addresses PC and PC+M, a branch is predicted to be taken in a chunk indicated by the instruction address PC+2*M, and the taken branch is called by a branch instruction across two adjacent chunks.

2. The microprocessor as claimed in claim 1, wherein:
the branch predictor involves multiple stages of first pipeline operations; and
the instruction cache involves multiple stages of second pipeline operations.

3. The microprocessor as claimed in claim 1, wherein in the first setting, when the instruction addresses PC, PC+M are not bypassed and the instruction address PC has been pushed into the fetch-target queue in the previous cycle:
the fetch-target queue provides a first entry to store the instruction address PC+M;
the fetch-target queue provides a second entry to store the instruction address PC+2*M when no branch is predicted to be taken in a chunk indicated by the instruction address PC+M, or when a branch is predicted to be taken in the chunk indicated by the instruction address PC+M and the taken branch is called by a branch instruction across two adjacent chunks;
the fetch-target queue provides a third entry to store an instruction address PC+3*M when no branch is predicted to be taken in the chunk indicated by the instruction address PC+M, a branch is predicted to be taken in a chunk indicated by the instruction address PC+2*M, and the taken branch is called by a branch instruction across two adjacent chunks.

4. The microprocessor as claimed in claim 1, wherein:
in a second setting, there are two overlapping instruction addresses between the branch prediction finished in the current cycle and the branch prediction finished in the previous cycle.

5. The microprocessor as claimed in claim 4, wherein in the second setting, when the instruction address PC is not bypassed and has not been pushed into the fetch-target queue in the previous cycle:
the fetch-target queue provides a first entry to store the instruction address PC;
the fetch-target queue provides a second entry to store the instruction address PC+M when no branch is predicted to be taken in a chunk indicated by the instruction address PC, or when a branch is predicted to be taken in the chunk indicated by the instruction address PC and the taken branch is called by a branch instruction across two adjacent chunks;
the fetch-target queue provides a third entry to store the instruction address PC+2*M when no branch is predicted to be taken in two chunks indicated by the instruction addresses PC and PC+M, or when no branch is predicted to be taken in the chunk indicated by the instruction address PC, a branch is predicted to be taken in the chunk indicated by the instruction address PC+M, and the taken branch is called by a branch instruction across two adjacent chunks; and
the fetch-target queue provides a fourth entry to store an instruction address PC+3*M when no branch is predicted to be taken in the two chunks indicated by the instruction addresses PC and PC+M, a branch is predicted to be taken in a chunk indicated by the instruction address PC+2*M, and the taken branch is called by a branch instruction across two adjacent chunks.

6. The microprocessor as claimed in claim 4, wherein in the second setting, when the instruction addresses PC, PC+M are not bypassed and have been pushed into the fetch-target queue in the previous cycle:
the fetch-target queue provides a first entry to store the instruction address PC+2*M;
the fetch-target queue provides a second entry to store an instruction address PC+3*M when a branch is predicted to be taken in the chunk indicated by the instruction address PC+2*M and the taken branch is called by a branch instruction across two adjacent chunks.

7. The microprocessor as claimed in claim 1, wherein:
in a third setting, there are no overlapping instruction addresses between the branch prediction finished in the current cycle and the branch prediction finished in the previous cycle.

8. The microprocessor as claimed in claim 7, wherein in the third setting, when the instruction address PC is not bypassed:
the fetch-target queue provides a first entry to store the instruction address PC;
the fetch-target queue provides a second entry to store the instruction address PC+M when no branch is predicted to be taken in a chunk indicated by the instruction address PC, or when a branch is predicted to be taken in the chunk indicated by the instruction address PC and the taken branch is called by a branch instruction across two adjacent chunks;
the fetch-target queue provides a third entry to store the instruction address PC+2*M when no branch is predicted to be taken in two chunks indicated by the instruction addresses PC and PC+M, or when no branch is pred…
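The entry rules that claim 1 gives for the case where PC is not bypassed and was not pushed in the previous cycle can be restated as a short loop. The sketch below is one reading of that claim language, not an implementation taken from the patent: the names `ChunkPrediction`, `crosses_chunks`, and `ftq_entries` are hypothetical, and the sketch assumes a taken branch ends the walk, with one extra address stored only when the branch instruction spans two adjacent chunks.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ChunkPrediction:
    """Hypothetical per-chunk prediction result."""
    taken: bool = False           # a branch in this chunk is predicted taken
    crosses_chunks: bool = False  # the branch instruction spans two adjacent chunks


def ftq_entries(pc: int, m: int, preds: Sequence[ChunkPrediction]) -> List[int]:
    """Addresses stored in successive FTQ entries for chunks PC, PC+M, PC+2*M.

    Assumes PC is not bypassed and was not pushed in the previous cycle.
    """
    entries = []
    for i, pred in enumerate(preds):
        entries.append(pc + i * m)
        if pred.taken:
            if pred.crosses_chunks:
                # The branch instruction continues into the next chunk,
                # so that chunk's address is stored as well.
                entries.append(pc + (i + 1) * m)
            break  # later chunks are not stored after a taken branch
    return entries


NOT_TAKEN = ChunkPrediction()
TAKEN_CROSSING = ChunkPrediction(taken=True, crosses_chunks=True)

# Fourth-entry case: no taken branch in PC or PC+M, crossing taken branch in PC+2*M.
four = ftq_entries(0x1000, 16, [NOT_TAKEN, NOT_TAKEN, TAKEN_CROSSING])
```

Under these assumptions `four` holds PC, PC+M, PC+2*M, and PC+3*M, which corresponds to the first through fourth entries recited in the claim.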

Assignees

Shanghai Zhaoxin Semiconductor Co Ltd

Inventors

Classifications

G06F9/3861
Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00) · CPC title
G06F9/3885
using a plurality of independent parallel functional units · CPC title
G06F9/30047
Prefetch instructions; cache control instructions · CPC title
G06F12/0875
with dedicated cache, e.g. instruction or stack · CPC title
G06F9/3808
for instruction reuse, e.g. trace cache, branch target cache · CPC title

Patent family

Related publications grouped by family.

View patent family 78006234
