Handling unaligned load operations in a multi-slice computer processor
US-2018300136-A1 · Oct 18, 2018 · US
US10496406B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10496406-B2 |
| Application number | US-201816014291-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 21, 2018 |
| Priority date | Dec 11, 2015 |
| Publication date | Dec 3, 2019 |
| Grant date | Dec 3, 2019 |
Official abstract text for this publication.
Handling unaligned load operations, including: receiving a request to load data stored within a range of addresses; determining that the range of addresses includes addresses associated with a plurality of caches, wherein each of the plurality of caches is associated with a distinct processor slice; issuing, to each distinct processor slice, a request to load data stored within a cache associated with the distinct processor slice, wherein the request to load data stored within the cache associated with the distinct processor slice includes a portion of the range of addresses; executing, by each distinct processor slice, the request to load data stored within the cache associated with the distinct processor slice; and receiving, over a plurality of data communications busses, execution results from each distinct processor slice, wherein each of the data communications busses is associated with one of the distinct processor slices.
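The first steps of the abstract (detecting that a requested range touches more than one slice's cache, then issuing one request per slice with its portion of the range) can be sketched in software. This is only an illustrative model, not the patented hardware: the block width `BLOCK_BYTES`, the slice count `NUM_SLICES`, the interleaved address-to-slice mapping in `slice_for`, and the `SliceRequest` type are all assumptions that do not appear in the publication text shown here.

```python
from dataclasses import dataclass
from typing import List

BLOCK_BYTES = 8   # assumed width of one aligned access per slice (hypothetical)
NUM_SLICES = 2    # assumed number of processor slices (hypothetical)


@dataclass(frozen=True)
class SliceRequest:
    slice_id: int  # slice whose cache is assumed to hold this portion
    start: int     # first byte address of the portion
    length: int    # number of bytes in the portion


def slice_for(address: int) -> int:
    """Assumed mapping: aligned blocks are interleaved across slices."""
    return (address // BLOCK_BYTES) % NUM_SLICES


def split_load(start: int, length: int) -> List[SliceRequest]:
    """Split [start, start + length) into one request per aligned block touched."""
    requests = []
    addr = start
    end = start + length
    while addr < end:
        # End of the aligned block that contains addr.
        block_end = (addr // BLOCK_BYTES + 1) * BLOCK_BYTES
        portion_end = min(end, block_end)
        requests.append(SliceRequest(slice_for(addr), addr, portion_end - addr))
        addr = portion_end
    return requests


def spans_multiple_slices(start: int, length: int) -> bool:
    """True when the range includes addresses mapped to more than one slice."""
    return len({r.slice_id for r in split_load(start, length)}) > 1
```

Under these assumptions, an 8-byte load starting at address 5 is split into a 3-byte portion for slice 0 and a 5-byte portion for slice 1, while an 8-byte load starting at address 0 stays within a single slice.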
Opening claim text (preview).
What is claimed is:
1. A multi-slice computer processor, the multi-slice computer processor configured for: receiving a load operation request to retrieve data stored within a range of addresses spanning a plurality of distinct processor slices, wherein the range of addresses comprises memory locations associated with a plurality of caches each corresponding to a distinct processor slice, and wherein the range of addresses comprises a first portion that references a memory location in a first cache of the plurality of caches and a second portion that references a memory location in a second cache of the plurality of caches; receiving, over a plurality of data communications busses, execution results of the load operation from each distinct processor slice, wherein each of the plurality of data communications busses is associated with one of the distinct processor slices; and assembling, from the execution results from each distinct processor slice, the data stored within the range of addresses, including: identifying a portion of each execution result that includes data stored within the range of addresses; and combining the portion of each execution result that includes data stored within the range into a single result.
2. The multi-slice computer processor of claim 1 further configured for formatting, by each processor slice, the execution results.
3. The multi-slice computer processor of claim 2 wherein formatting, by each processor slice, the execution results further comprises: identifying a portion of the execution results that includes data contained in the range of addresses; determining whether the portion of the execution results that includes data contained in the range of addresses represents a beginning portion of the range of addresses or an ending portion of the range of addresses; and shifting, in dependence upon whether the portion of the execution results that includes data contained in the range of addresses represents a beginning portion of the range of addresses or an ending portion of the range of addresses, the portion of the execution results that includes data contained in the range of addresses.
4. The multi-slice computer processor of claim 1 further configured for predicting when the data stored within the range of addresses will be loaded into a target memory location.
5. The multi-slice computer processor of claim 4 further configured for: identifying one or more operations that are dependent upon completion of the request to load data stored within the range of addresses; and issuing, in dependence upon when the data stored within the range of addresses is predicted to be loaded into the target memory location, the one or more operations that are dependent upon completion of the request to load data stored within the range of addresses.
6. A computing system, the computing system including a multi-slice computer processor, the multi-slice computer processor configured for: receiving a load operation request to retrieve data stored within a range of addresses spanning a plurality of distinct processor slices, wherein the range of addresses comprises memory locations associated with a plurality of caches each corresponding to a distinct processor slice, and wherein the range of addresses comprises a first portion that references a memory location in a first cache of the plurality of caches and a second portion that references a memory location in a second cache of the plurality of caches; receiving, over a plurality of data communications busses, execution results of the load operation from each distinct processor slice, wherein each of the plurality of data communications busses is associated with one of the distinct processor slices; and assembling, from the execution results from each distinct processor slice, the data stored within the range of addresses, including: identifying a portion of each execution result that includes data stored within the range of addresses; and combining the portion of each execution result that includes data stored within the range into a single result.
7. The computing system of claim 6, wherein the multi-slice computer processor is further configured for formatting, by each processor slice, the execution results.
8. The computing system of claim 7 wherein formatting, by each processor slice, the execution results further comprises: identifying a portion of the execution results that includes data contained in the range of addresses; determining whether the portion of the execution results that includes data contained in the range of addresses represents a beginning portion of the range of addresses or an ending portion of the range of addresses; and shifting, in dependence upon whether the portion of the execution results that includes data contained in the range of addresses represents a beginning portion of the range of addresses or an ending portion of the range of addresses, the portion of the execution results that includes data contained in the range of addresses.
9. The computing system of claim 6, wherein the multi-slice computer processor is further configured for: predicting when the data stored within the range of addresses will be loaded into a target memory location; identifying one or more operations that are dependent upon completion of the request to load data stored within the range of addresses; and issuing, in dependence upon when the data stored within the range of addresses is predicted to be loaded into the target memory location, the one or more operations that are dependent upon completion of the request to load data stored within the range of addresses.
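The formatting and assembling steps recited in the claims (identify the in-range portion of each slice's execution result, decide whether it is the beginning or ending portion, shift it accordingly, and combine the portions into a single result) might be modeled as follows. This is a rough software sketch under stated assumptions, not the claimed hardware: the function names, the zero-filled result buffer, the direction of the shift, and the byte-wise OR merge are all hypothetical choices that the claim text does not specify.

```python
from typing import List


def format_result(block: bytes, block_base: int, start: int, length: int) -> bytes:
    """Per-slice formatting: keep the bytes of `block` that lie inside
    [start, start + length) and shift them to their final position in a
    zero-filled buffer of `length` bytes (assumed representation)."""
    end = start + length
    lo = max(start, block_base)             # first requested byte in this block
    hi = min(end, block_base + len(block))  # one past the last requested byte
    if hi <= lo:
        return bytes(length)                # block holds no requested data
    portion = block[lo - block_base:hi - block_base]
    # A beginning portion starts at the first requested address; any other
    # portion is placed further along the result (assumed shift direction).
    is_beginning = lo == start
    offset = 0 if is_beginning else lo - start
    return bytes(offset) + portion + bytes(length - offset - len(portion))


def combine(formatted: List[bytes]) -> bytes:
    """Assemble a single result by OR-ing the formatted per-slice results
    (assumed merge; each byte position is non-zero in at most one input)."""
    out = bytearray(len(formatted[0]))
    for result in formatted:
        for i, value in enumerate(result):
            out[i] |= value
    return bytes(out)


# Usage: an 8-byte load at address 5 over two 8-byte blocks, one per slice.
memory = bytes(range(16))
first = format_result(memory[0:8], 0, 5, 8)    # beginning portion, from slice 0
second = format_result(memory[8:16], 8, 5, 8)  # ending portion, from slice 1
loaded = combine([first, second])
```

In this model each slice does its own shifting before sending its result over its bus, so the final combine step needs no knowledge of alignment; it only merges buffers of equal length.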
LOAD or STORE instructions; Clear instruction · CPC title
organised in groups of units sharing resources, e.g. clusters · CPC title
Prefetch instructions; cache control instructions · CPC title
for multiprocessing or multitasking · CPC title
Details relating to cache prefetching · CPC title