Parallel slice processor with dynamic instruction stream mapping
US-2015324206-A1 · Nov 12, 2015 · US
US10037211B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10037211-B2 |
| Application number | US-201615077015-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 22, 2016 |
| Priority date | Mar 22, 2016 |
| Publication date | Jul 31, 2018 |
| Grant date | Jul 31, 2018 |
Official abstract text for this publication.
Operation of a multi-slice processor that includes a plurality of execution slices and a plurality of load/store slices, where each load/store slice includes a load miss queue and a load reorder queue, includes: receiving, at a load reorder queue, a load instruction requesting data; responsive to the data not being stored in a data cache, determining whether a previous load instruction is pending a fetch of a cache line comprising the data; if the cache line does not comprise the data, allocating an entry for the load instruction in the load miss queue; and if the cache line does comprise the data: merging, in the load reorder queue, the load instruction with an entry for the previous load instruction.
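The miss-handling decision described in the abstract can be sketched roughly as follows. This is an illustrative model only, not the patented hardware: the class and field names (`LoadStoreSlice`, `LrqEntry`, `merged_with`), the 128-byte line size, and the use of dictionaries for the queues are all assumptions made for the example. It also reads "the cache line does not comprise the data" as "no pending fetch covers the requested data".

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

LINE_BYTES = 128  # assumed cache line size; the source does not state one


@dataclass
class LrqEntry:
    """Hypothetical load reorder queue entry."""
    tag: int
    address: int
    merged_with: Optional[int] = None  # tag of the earlier load this one merged with


class LoadStoreSlice:
    """Toy model of one load/store slice (all names are illustrative)."""

    def __init__(self) -> None:
        self.data_cache: Set[int] = set()   # line numbers currently cached
        self.lrq: Dict[int, LrqEntry] = {}  # load reorder queue, keyed by tag
        self.lmq: Dict[int, int] = {}       # load miss queue: line number -> owning load tag

    def receive_load(self, tag: int, address: int) -> str:
        line = address // LINE_BYTES
        entry = LrqEntry(tag, address)
        self.lrq[tag] = entry

        if line in self.data_cache:
            return "hit"

        owner = self.lmq.get(line)
        if owner is None:
            # No earlier load is fetching this line: allocate a miss queue entry.
            self.lmq[line] = tag
            return "lmq_allocated"

        # An earlier load is already fetching the line: merge in the reorder
        # queue instead of spending another miss queue entry.
        entry.merged_with = owner
        return "merged"
```

In this sketch, two loads to the same line consume one miss queue entry between them, which is the resource saving the merge is aimed at.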
Opening claim text (preview).
What is claimed is:

1. A method of operation of a multi-slice processor, the multi-slice processor including a plurality of execution slices and a plurality of load/store slices, each load/store slice comprising a load miss queue and a load reorder queue, the method comprising: receiving, at the load reorder queue, a load instruction requesting data; responsive to the data not being stored in a data cache, determining whether a previous load instruction is pending a fetch of a cache line comprising the data; if the cache line does not comprise the data, allocating an entry for the load instruction in the load miss queue; if the cache line does comprise the data: merging, in the load reorder queue, the load instruction with an entry for the previous load instruction; determining that the cache line is available; and in a first cycle: propagating data for the previous load instruction to a results bus without accessing the data cache for the cache line; and storing the cache line in the data cache; and in a cycle after the first cycle: propagating the data for the load instruction to a results bus, wherein the data is retrieved from the data cache.

2. The method of claim 1, wherein the load reorder queue comprises a plurality of entries, and wherein each of the plurality of entries of the load reorder queue comprises a field for data indicating a merge with another entry in the plurality of entries of the load reorder queue.

3. The method of claim 2, wherein multiple entries of the plurality of entries are merged, and wherein the method further comprises: determining that the previous load instruction is the oldest load instruction among multiple merge entries.

4. The method of claim 3, wherein propagating the data for the previous load instruction to the results bus without accessing the data cache for the cache line is in dependence upon determining that the previous load instruction is the oldest load instruction among the multiple entries.

5. The method of claim 1, wherein if the cache line does not comprise the data, an entry for the load instruction is not allocated within the load miss queue.

6. The method of claim 1, wherein merging the entry comprises modifying an entry in the load reorder queue for the previous load instruction indicating a merge with the load instruction.

7. A multi-slice processor comprising: a plurality of execution slices and a plurality of load/store slices, each load/store slice comprising a load miss queue and a load reorder queue, wherein the multi-slice processor is configured to carry out: receiving, at the load reorder queue, a load instruction requesting data; responsive to the data not being stored in a data cache, determining whether a previous load instruction is pending a fetch of a cache line comprising the data; if the cache line does not comprise the data, allocating an entry for the load instruction in the load miss queue; if the cache line does comprise the data: merging, in the load reorder queue, the load instruction with an entry for the previous load instruction; determining that the cache line is available; and in a first cycle: propagating data for the previous load instruction to a results bus without accessing the data cache for the cache line; and storing the cache line in the data cache; and in a cycle after the first cycle: propagating the data for the load instruction to a results bus, wherein the data is retrieved from the data cache.

8. The multi-slice processor of claim 7, wherein the load reorder queue comprises a plurality of entries, and wherein each of the plurality of entries of the load reorder queue comprises a field for data indicating a merge with another entry in the plurality of entries of the load reorder queue.

9. The multi-slice processor of claim 8, wherein multiple entries of the plurality of entries are merged, and wherein the multi-slice processor is further configured to carry out: determining that the previous load instruction is the oldest load instruction among multiple merge entries.

10. The multi-slice processor of claim 9, wherein propagating the data for the previous load instruction to the results bus without accessing the data cache for the cache line is in dependence upon determining that the previous load instruction is the oldest load instruction among the multiple entries.

11. The multi-slice processor of claim 7, wherein if the cache line does not comprise the data, an entry for the load instruction is not allocated within the load miss queue.

12. The multi-slice processor of claim 7 wherein merging the entry comprises modifying an entry in the load reorder queue for the previous load instruction indicating a merge with the load instruction.

13. An apparatus comprising: a multi-slice processor, the multi-slice processor comprising: a plurality of execution slices and a plurality of load/store slices, each load/store slice comprising a load miss queue and a load reorder queue, wherein the multi-slice processor is configured to carry out: receiving, at the load reorder queue, a load instruction requesting data; responsive to the data not being stored in a data cache, determining whether a previous load instruction is pending a fetch of a cache line comprising the data; if the cache line does not comprise the data, allocating an entry for the load instruction in the load miss queue; if the cache line does comprise the data: merging, in the load reorder queue, the load instruction with an entry for the previous load instructions; determining that the cache line is available; and in a first cycle: propagating data for the previous load instruction to a results bus without accessing the data cache for the cache line; and storing the cache line in the data cache; and in a cycle after the first cycle: propagating the data for the load instruction to a results bus, wherein the data is retrieved from the data cache.

14. The apparatus of claim 13, wherein the load reorder queue comprises a plurality of entries, and wherein each of the plurality of entries of the load reorder queue comprises a field for data indicating a merge with another entry in the plurality of entries of the load reorder queue.

15. The apparatus of claim 14, wherein multiple entries of the plurality of entries are merged, and wherein the multi-slice processor is further configured to carry out: determining that the previous load instruction is the oldest load instruction among multiple merge entries.

16. The apparatus of claim 15, wherein propagating the data for the previous load instruction to the results bus without accessing the data cache for the cache line is in dependence upon determining that the previous load instruction is the oldest load instruction among the multiple entries.

17. The apparatus of claim 13, wherein if the cache line does not comprise the data, an entry for the load instruction is not allocated within the load miss queue.
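The data-return sequence recited in the independent claims, together with the oldest-entry condition of claims 3 and 4, might be modelled along these lines. This is a hedged sketch rather than the claimed implementation: `MergedLoad`, `return_line`, the `age` and `offset` fields, and the choice to serve one later load per cycle are assumptions introduced for illustration. The claims only require that the later load be served "in a cycle after the first cycle".

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MergedLoad:
    """Hypothetical merged reorder queue entry."""
    tag: int
    age: int     # assumed convention: smaller value means older load
    offset: int  # byte offset of the requested data within the cache line


def return_line(
    line_data: bytes,
    merged: List[MergedLoad],
    data_cache: Dict[int, bytes],
    line_addr: int,
) -> List[Tuple[int, int, int, str]]:
    """Return (cycle, tag, value, source) events for a newly arrived cache line."""
    oldest = min(merged, key=lambda m: m.age)
    events = []

    # First cycle: forward the arriving data straight to the results bus for
    # the oldest merged load, and write the line into the data cache.
    events.append((0, oldest.tag, line_data[oldest.offset], "bypass"))
    data_cache[line_addr] = line_data

    # Later cycles: the remaining merged loads read from the data cache.
    cycle = 1
    for load in sorted(merged, key=lambda m: m.age):
        if load is oldest:
            continue
        value = data_cache[line_addr][load.offset]
        events.append((cycle, load.tag, value, "data_cache"))
        cycle += 1
    return events
```

Under these assumptions the oldest load avoids a cache read entirely, while younger merged loads pay an ordinary cache access once the line has been written.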
organised in groups of units sharing resources, e.g. clusters · CPC title
Operand accessing · CPC title
Electrical coupling · CPC title
LOAD or STORE instructions; Clear instruction · CPC title
Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title