Who is the assignee on this patent?

Barry, Edwin Franklin; Pechanek, Gerald George; Marchand, Patrick R.; and 1 more

What technology area does this patent fall under?

Primary CPC classification G06F9/30145. Mapped technology areas include Physics.

When was this patent published?

Published May 3, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Methods and apparatus for adapting pipeline stage latency based on instruction type

US9329866B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9329866-B2
Application number	US-201313780746-A
Country	US
Kind code	B2
Filing date	Feb 28, 2013
Priority date	Mar 22, 2004
Publication date	May 3, 2016
Grant date	May 3, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Processor pipeline controlling techniques are described which take advantage of the variation in critical path lengths of different instructions to achieve increased performance. By examining a processor's instruction set and execution unit implementation's critical timing paths, instructions are classified into speed classes. Based on these speed classes, one pipeline is presented where hold signals are used to dynamically control the pipeline based on the instruction class in execution. An alternative pipeline supporting multiple classes of instructions is presented where the pipeline clocking is dynamically changed as a result of decoded instruction class signals. A single pass synthesis methodology for multi-class execution stage logic is also described. For dynamic class variable pipeline processors, the mix of instructions can have a great effect on processor performance and power utilization since both can vary by the program mix of instruction classes. Application code can be given new degrees of optimization freedom where instruction class and the mix of instructions can be chosen based on performance and power requirements.
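As a rough illustration of the speed-class idea in the abstract, the Python sketch below assigns each instruction to the fastest class whose clock period covers its critical path, then compares execute-stage time under per-class timing against a single worst-case period. The class periods, opcode names, and delay values are hypothetical and chosen only for the example; none of them come from the patent.

```python
from bisect import bisect_left

# Hypothetical class periods in nanoseconds (assumption, not from the patent).
CLASS_PERIODS_NS = [2.0, 4.0]  # class 1 period, class 2 period

# Hypothetical critical-path delay of each opcode's execute logic.
CRITICAL_PATHS_NS = {"add": 1.6, "shift": 1.9, "mul": 3.7}


def speed_class(critical_path_ns, periods=CLASS_PERIODS_NS):
    """Return the 1-based class whose period is the first to cover the path."""
    idx = bisect_left(periods, critical_path_ns)
    if idx == len(periods):
        raise ValueError("critical path exceeds the slowest class period")
    return idx + 1


def adaptive_time_ns(program, critical_paths, periods=CLASS_PERIODS_NS):
    """Execute-stage time when each instruction runs at its own class period."""
    return sum(periods[speed_class(critical_paths[op], periods) - 1]
               for op in program)


def fixed_time_ns(program, periods=CLASS_PERIODS_NS):
    """Execute-stage time when every instruction uses the slowest period."""
    return len(program) * periods[-1]


PROGRAM = ["add", "add", "mul", "shift", "add", "mul"]
print(adaptive_time_ns(PROGRAM, CRITICAL_PATHS_NS))  # 16.0
print(fixed_time_ns(PROGRAM))                        # 24.0
```

With this made-up mix, four of the six instructions fall in class 1, so the per-class total is 16.0 ns against 24.0 ns for a fixed worst-case period. The sketch models only execute-stage time and ignores hold-signal or clock-switching overhead.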

First claim

Opening claim text (preview).

We claim:

1. A method for changing execution latency for performance and power optimization of an instruction class adaptable pipeline processor, the method comprising: creating, by use of a computer, an application program for the instruction class adaptable pipeline processor, the application program containing class one instructions each executable with a first latency in a class one pipeline and class two instructions each executable with a second latency in a class two pipeline, wherein the first latency is shorter than the second latency; changing an encoding of an original class one instruction of the class one instructions to a second encoded class one instruction that specifies a function of the original class one instruction is to be executed with the second latency to minimize power use while still meeting performance requirements and to create a modified application program; and executing the modified application program on the instruction class adaptable pipeline processor, wherein the second encoded class one instruction is executed and the original class one instruction is not executed.

2. The method of claim 1 further comprising: appropriately programming a programmable clock gating mode to cause a specifiable majority of the instructions of the instruction class adaptable pipeline processor to execute at a longer latency than the second latency associated with the class two instructions, to further minimize power use while still meeting performance requirements of the modified application program.

3. The method of claim 1, wherein the class one pipeline comprises: a first fetch stage; a first decode stage; and a first execute stage, wherein each stage is executable at the first latency or the second latency.

4. The method of claim 1, wherein the class two pipeline comprises: a second fetch stage; a second decode stage; and a second execute stage, wherein the second fetch stage and the second decode stage are executable at the first latency or the second latency and the second execute stage is executable at the second latency.

5. A method for changing execution latency for performance and power optimization of an instruction class adaptable pipeline processor, the method comprising: creating, by use of a computer, an application program for the instruction class adaptable pipeline processor, the application program containing class one instructions each executable with a first latency in a class one pipeline and class two instructions each executable with a second latency in a class two pipeline, wherein the first latency is shorter than the second latency; changing an encoding of an original class one instruction of the class one instructions to a second encoded class one instruction that specifies a function of the original class one instruction is to be executed with the second latency to minimize power use while still meeting performance requirements and to create a modified application program; and executing the modified application program on the instruction class adaptable pipeline processor, wherein the second encoded class one instruction is executed and the original class one instruction is not executed, wherein the application program is initially created with each instruction specified at its highest frequency class.

6. The method of claim 5 further comprising: issuing a class one instruction in parallel with a class two instruction; and executing both the class one instruction and the class two instruction in parallel with the second latency.

7. The method of claim 5 further comprising: issuing a class one instruction in parallel with another class one instruction; and executing both of the class one instructions in parallel with the first latency.

8. An apparatus providing performance and power optimization capabilities to an instruction class adaptable pipeline processor, the apparatus comprising: a program storage unit storing a plurality of class one instructions each executable at a first latency and a plurality of class two instructions each executable at a second latency, wherein the first latency is shorter than the second latency; a fetch stage and a decode stage for fetching an instruction from the program storage unit and decoding the fetched instruction to determine an instruction class indication, wherein the fetch stage and the decode stage operate at the first latency or the second latency and the decoded instruction being a class one instruction or a class two instruction; and an adaptable execution stage for execution of the decoded instruction at the first latency or at the second latency according to the instruction class indication, wherein under program control, a class encoding field in one or more of the plurality of class one instructions is changed from a class one encoding to a class two encoding to reduce power utilization when the program is executed.

9. The apparatus of claim 8 further comprising: an adaptable pipeline control unit responsive to the instruction class indication for adapting a duration of latency of the adaptable execution stage to the first latency for a class one instruction or to the second latency for a class two instruction.

10. The apparatus of claim 9, wherein the adaptable pipeline control unit responsive to the instruction class indication for adapting a duration of latency of the fetch stage and the decode stage to the first latency for a fetched and decoded class one instruction or to the second latency for a fetched and decoded class two instruction.

11. The apparatus of claim 8, wherein the plurality of class one instructions and the plurality of class two instructions are each encoded with a class encoding field separate from an instruction opcode that defines a function of each of the class one and class two instructions, the class encoding field encoded with a highest frequency class according to the opcode.

12. An apparatus providing performance and power optimization capabilities to an instruction class adaptable pipeline processor, the apparatus comprising: a program storage unit storing a plurality of class one instructions each executable at a first latency and a plurality of class two instructions each executable at a second latency, wherein the first latency is shorter than the second latency; a fetch stage and a decode stage for fetching an instruction from the program storage unit and decoding the fetched instruction to determine an instruction class indication, wherein the fetch stage and the decode stage operate at the first latency or the second latency and the decoded instruction being a class one instruction or a class two instruction; an adaptable execution stage for execution of the decoded instruction at the first latency or at the second latency according to the instruction class indication, wherein under program control, a class encoding field in one or more of the plurality of class one instructions is changed from a class one encoding to a class two encoding to reduce power utilization when the program is executed, wherein the adaptable execution stage comprises: class one execution logic having a worst-case signal propagation time that is less than or equal to a first class time period; and class two execution logic having a worst-case signal propagation time that is greater than the first class time period and less than or equal to a second class time period.

13. The apparatus of claim 12, wherein the first class time period sets the operating frequency for the first class instructions and the second class time period sets the operating frequency for the second class instructions.

14. The a
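The re-encoding step recited in the method claims can be pictured with a small sketch. The Python below starts from a program in which every instruction carries its fastest class, then greedily rewrites the class field of class one instructions to class two while the total latency stays within a deadline. The latency and energy figures, the `Instr` type, and the greedy policy are all assumptions made for illustration; the claims specify neither numbers nor a selection algorithm.

```python
from dataclasses import dataclass

# Hypothetical per-instruction costs (assumptions; the claims give no numbers).
LATENCY_NS = {1: 2.0, 2: 4.0}   # first latency is shorter than second latency
ENERGY_PJ = {1: 10.0, 2: 6.0}   # assumed: running at the longer latency costs less energy


@dataclass
class Instr:
    opcode: str
    cls: int  # class encoding field, kept separate from the opcode


def total_latency(program):
    """Sum of per-instruction latencies for the encoded classes."""
    return sum(LATENCY_NS[i.cls] for i in program)


def total_energy(program):
    """Sum of per-instruction energies under the assumed energy model."""
    return sum(ENERGY_PJ[i.cls] for i in program)


def reencode_for_power(program, deadline_ns):
    """Return a copy with class one instructions re-encoded as class two
    wherever the deadline still leaves enough slack."""
    modified = [Instr(i.opcode, i.cls) for i in program]
    slack = deadline_ns - total_latency(modified)
    step = LATENCY_NS[2] - LATENCY_NS[1]
    for instr in modified:
        if instr.cls == 1 and slack >= step:
            instr.cls = 2
            slack -= step
    return modified


original = [Instr("add", 1), Instr("mul", 2), Instr("add", 1), Instr("shift", 1)]
modified = reencode_for_power(original, deadline_ns=14.0)
print([i.cls for i in modified])                        # [2, 2, 2, 1]
print(total_energy(original), total_energy(modified))  # 36.0 28.0
```

In this toy model every re-encoding costs the same extra latency and saves the same energy, so a simple front-to-back pass is sufficient. A real tool would need measured per-instruction data and would have to account for pipeline interactions that the sketch leaves out.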

Classifications

G06F9/3869
Implementation aspects, e.g. pipeline latches; pipeline synchronisation and clocking · CPC title
G06F9/30145 (Primary)
Instruction analysis, e.g. decoding, instruction word fields · CPC title
G06F9/3885
using a plurality of independent parallel functional units · CPC title
G06F9/30079 (Primary)
Pipeline control instructions, e.g. multicycle NOP · CPC title
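For readers unfamiliar with how a CPC symbol such as G06F9/30145 maps to a technology area like Physics, the Python sketch below splits a symbol into its section, class, subclass, main group, and subgroup, and looks up the section title. The function name and the regular expression are illustrative assumptions, the pattern is a simplification of the CPC symbol grammar rather than an official parser, and the title for section Y is abbreviated.

```python
import re

# CPC section letters and their titles (section Y title abbreviated).
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General Tagging",
}

# Simplified pattern: section, 2-digit class, subclass letter, main group / subgroup.
_CPC_PATTERN = re.compile(r"([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})")


def parse_cpc(symbol):
    """Split a CPC symbol (hypothetical helper) into its hierarchical parts."""
    match = _CPC_PATTERN.fullmatch(symbol.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a recognised CPC symbol: {symbol!r}")
    section, cls, subclass, main_group, subgroup = match.groups()
    return {
        "section": section,
        "section_title": CPC_SECTIONS[section],
        "class": section + cls,
        "subclass": section + cls + subclass,
        "main_group": main_group,
        "subgroup": subgroup,
    }


parts = parse_cpc("G06F9/30145")
print(parts["section_title"], parts["subclass"], parts["main_group"])
```

All four symbols listed in this section share subclass G06F and main group 9, which is why they resolve to the same section.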

Patent family

Related publications grouped by family.

View patent family 42797881
