What technology area does this patent fall under?

Primary CPC classification G06F9/3851. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jul 14, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Reconfigurable parallel execution and load-store slice processor

US2016202989A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016202989-A1
Application number	US-201514594716-A
Country	US
Kind code	A1
Filing date	Jan 12, 2015
Priority date	Jan 12, 2015
Publication date	Jul 14, 2016
Grant date	—
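
The publication number in the table above packs the country, a serial number, and the kind code into one string. A minimal sketch of splitting it apart follows; the function name `parse_publication_number` and the regular expression are illustrative assumptions, not part of any patent office tooling, and the pattern only covers identifiers shaped like the ones shown on this page.

```python
import re

# Assumed shape: two-letter country, digits, then a kind code such as "A1" or "A".
# Hyphens are optional so both "US-2016202989-A1" and "US2016202989A1" match.
PUB_RE = re.compile(r"^([A-Z]{2})-?(\d+)-?([A-Z]\d?)$")


def parse_publication_number(pub: str) -> dict:
    """Split a publication number into country, serial, and kind code."""
    match = PUB_RE.match(pub.strip())
    if match is None:
        raise ValueError(f"unrecognized publication number: {pub!r}")
    country, serial, kind = match.groups()
    return {"country": country, "serial": serial, "kind": kind}
```

Called on `"US-2016202989-A1"`, this yields the same country (`US`) and kind code (`A1`) that the table lists as separate fields.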

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The configuration of the execution slices is selectable so that capabilities of the processor core can be adjusted according to execution requirements for the instruction streams. Two or more execution slices can be combined as super-slices to handle wider data, wider operands and/or vector operations, according to one or more mode control signal that also serves as a configuration control signal. The mode control signal is also used to partition clusters of the execution slices within the processor core according to whether single-threaded or multi-threaded operation is selected, and additionally according to a number of hardware threads that are active.
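
The reconfiguration the abstract describes can be illustrated with a small software model. This is only a sketch under stated assumptions: the names `Mode`, `ExecutionUnit`, `configure`, and `dispatch` are hypothetical, the 64-bit slice width is an example value, and pairing adjacent slices with round-robin dispatch is one possible policy rather than the mechanism the patent discloses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Tuple


class Mode(Enum):
    INDEPENDENT = 0  # first state: each slice runs on its own
    LINKED = 1       # second state: slices are combined into super-slices


@dataclass(frozen=True)
class ExecutionUnit:
    slices: Tuple[int, ...]  # indices of the physical slices in this unit
    width_bits: int          # operand width the unit handles


def configure(num_slices: int, slice_width: int, mode: Mode) -> List[ExecutionUnit]:
    """Return the logical execution units seen under a given mode signal."""
    if mode is Mode.INDEPENDENT:
        return [ExecutionUnit((i,), slice_width) for i in range(num_slices)]
    if num_slices % 2 != 0:
        raise ValueError("this sketch pairs slices, so it needs an even count")
    # Assumption: adjacent slices are paired, doubling the operand width.
    return [
        ExecutionUnit((i, i + 1), 2 * slice_width)
        for i in range(0, num_slices, 2)
    ]


def dispatch(streams: List[str], units: List[ExecutionUnit]) -> Dict[str, ExecutionUnit]:
    """Assign instruction streams to units round-robin (assumed policy)."""
    return {stream: units[i % len(units)] for i, stream in enumerate(streams)}
```

With four 64-bit slices, `configure(4, 64, Mode.INDEPENDENT)` models four separate units, while `configure(4, 64, Mode.LINKED)` models two 128-bit super-slices built from slices (0, 1) and (2, 3).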

First claim

Opening claim text (preview).

1. A processor core, comprising: a plurality of dispatch queues for receiving instructions of a corresponding plurality of instruction streams; a plurality of parallel instruction execution slices for executing the plurality of instruction streams in parallel; a dispatch routing network for routing the output of the dispatch queues to the instruction execution slices; a dispatch control logic that dispatches the instructions of the plurality of instruction streams via the dispatch routing network to issue queues of the plurality of parallel instruction execution slices; and a mode control logic, responsive to a mode control signal for reconfiguring a relationship between the plurality of parallel instruction execution slices such that in a first configuration corresponding to a first state of the mode control signal, at least two of the plurality of parallel instruction execution slices are independently operable for executing at least two of the plurality of instruction streams, and wherein in a second configuration corresponding to a second state of the mode control signal the at least two parallel instruction execution slices are linked for executing a single one of the plurality of instruction streams.

2. The processor core of claim 1, further comprising: a plurality of cache slices containing mutually-exclusive segments of a lowest-order level of cache memory; and a plurality of load-store slices coupling the plurality of cache slices to the plurality of parallel execution slices for controlling access by the plurality of parallel instruction execution slices to the cache slices, wherein individual ones of the load-store slices are coupled to the at least two parallel execution slices to exchange data with the at least two parallel execution slices, independent of whether the at least two parallel execution slices are in the first configuration or in the second configuration.

3. The processor core of claim 2, wherein the load-store units and the cache slices are responsive to the mode control signal such that in the first configuration corresponding to the first state of the mode control signal, at least two of the cache slices are separately partitioned between the at least two parallel instruction execution slices to appear as multiple smaller cache memories with contiguous cache lines, and wherein in the second configuration corresponding to the second state of the mode control signal, the cache slices are combined to appear as larger cache memory that are shared by the at least two parallel instruction execution slices.

4. The processor core of claim 1, wherein in the first configuration corresponding to the first state of the mode control signal, the at least two parallel instruction execution slices separately execute first instructions of a first operand width and first operator width of the at least two instruction streams, and wherein in the second configuration corresponding to the second state of the mode control signal the at least two parallel instruction execution slices are linked for executing second instructions of a second operand width that is a multiple of the first operand width or second operator width that is a multiple of the second operator width, the second instructions being instructions of the single instruction stream.

5. The processor core of claim 4, wherein the dispatch control logic is responsive to the mode control signal to, in the first configuration corresponding to the first state of the mode control signal, dispatch the first instructions of a first one of the at least two instruction streams to a first one of the at least two parallel instruction execution slices and dispatch the first instructions of a second one of the at least two instruction streams to a second one of the at least two parallel instruction execution slices, and wherein in the second configuration corresponding to the second state of the mode control signal, dispatch the second instructions to one or both of the at least two parallel instruction execution slices as a combined super-slice.

6. The processor core of claim 1, further comprising: one or more networks coupling the plurality of execution slices for exchanging results of execution of the plurality of instruction streams; and one or more switches for isolating individual ones of the one or more networks to partition the one or more networks into segments corresponding to sub-groups of the plurality of execution slices.

7. The processor core of claim 2, wherein the plurality of parallel instruction execution slices are organized into two or more clusters, and wherein the cache slices are interleave mapped to corresponding different ones of the two or more clusters.

8-14. (canceled)

15. A computer system, comprising: at least one processor core for executing program instructions of a corresponding plurality of instruction streams; and a memory coupled to the processor core for storing the program instructions, wherein the at least one processor core comprises a plurality of parallel instruction execution slices for executing the plurality of instruction streams in parallel, a dispatch routing network for routing the output of the dispatch queues to the instruction execution slices, a dispatch control logic that dispatches the instructions of the plurality of instruction streams via the dispatch routing network to issue queues of the plurality of parallel instruction execution slices, and a mode control logic, responsive to a mode control signal for reconfiguring a relationship between the plurality of parallel instruction execution slices such that in a first configuration corresponding to a first state of the mode control signal, at least two of the plurality of parallel instruction execution slices are independently operable for executing at least two of the plurality of instruction streams, and wherein in a second configuration corresponding to a second state of the mode control signal the at least two parallel instruction execution slices are linked for executing a single one of the plurality of instruction streams.

16. The computer system of claim 15, wherein the processor core further comprises: a plurality of cache slices containing mutually-exclusive segments of a lowest-order level of cache memory; and a plurality of load-store slices coupling the plurality of cache slices to the plurality of parallel execution slices for controlling access by the plurality of parallel instruction execution slices to the cache slices, wherein individual ones of the load-store slices are coupled to the at least two parallel execution slices to exchange data with the at least two parallel execution slices, independent of whether the at least two parallel execution slices are in the first configuration or in the second configuration.

17. The computer system of claim 15, wherein in the first configuration corresponding to the first state of the mode control signal, the at least two parallel instruction execution slices separately execute first instructions of a first operand width of the at least two instruction streams, and wherein in the second configuration corresponding to the second state of the mode control signal the at least two parallel instruction execution slices are linked for executing second instructions of a second operand width that is a multiple of the first operand width or second operator width that is a multiple of the second operator width, the second instructions being instructions of the single instruction stream.

18. The computer system of claim 17, wherein the dispatch control logic is responsive to the mode control signal to, in the first configuration corresponding
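
The cache behavior recited in claim 3, where cache slices appear as separate smaller caches in the first configuration and as one larger shared cache in the second, can be sketched as an address-to-slice mapping. Everything specific here is an assumption made for illustration: the function name `cache_slice_for`, the 64-byte line size, and the choice to interleave by cache line within each group are not taken from the patent text.

```python
LINE_BYTES = 64  # assumed cache line size, for illustration only


def cache_slice_for(
    address: int,
    exec_slice: int,
    num_cache_slices: int,
    num_exec_slices: int,
    linked: bool,
) -> int:
    """Pick the cache slice that serves an access in this toy model."""
    if num_cache_slices % num_exec_slices != 0:
        raise ValueError("cache slices must divide evenly among execution slices")
    line = address // LINE_BYTES
    if linked:
        # Second configuration: all cache slices behave as one larger cache
        # shared by the linked execution slices.
        return line % num_cache_slices
    # First configuration: each execution slice gets its own private group
    # of cache slices, which looks like a smaller separate cache.
    per_slice = num_cache_slices // num_exec_slices
    return exec_slice * per_slice + line % per_slice
```

In this model, with four cache slices and two execution slices, an unlinked execution slice 0 only ever reaches cache slices 0 and 1, while in linked mode either execution slice can reach all four.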

Assignees

Inventors

Classifications

Code	CPC title
G06F2212/452	Instruction code
G06F9/30145	Instruction analysis, e.g. decoding, instruction word fields
G06F9/3851 (primary)	from multiple instruction streams, e.g. multistreaming
G06F12/0875	with dedicated cache, e.g. instruction or stack
G06F9/3888	controlled by a single instruction for multiple threads [SIMT] in parallel

Patent family

Related publications grouped by family.

View patent family 56367639

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016202989A1 cover?: A processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The configuration of the execution slices is selectable so that capabilities of the processor core can be adjusted according to execution requirements for the instruction streams. Two or more execu…
Who is the assignee on this patent?: IBM