Processor with programmable prefetcher
US-2017161195-A1 · Jun 8, 2017 · US
US10452551B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10452551-B2 |
| Application number | US-201615376242-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 12, 2016 |
| Priority date | Dec 12, 2016 |
| Publication date | Oct 22, 2019 |
| Grant date | Oct 22, 2019 |
Official abstract text for this publication.
A processor may include a programmable memory prefetcher that includes a programmable hardware prefetch engine and a prefetch engine control register. The programmable memory prefetcher may include circuitry and may be configured to receive, during execution of an application, a first instruction for configuring the prefetch engine for prefetching multiple cache lines to be accessed in the future, at predictable locations, by the application; to store, in the prefetch engine control register, dependent on information in the first instruction, data representing an amount of prefetching to be performed, and data representing a stride distance between consecutive cache lines to be prefetched; to receive a second instruction for prefetching a single cache line whose location is identified in the second instruction; and to initiate, in response to receiving the second instruction, prefetching of multiple cache lines by the prefetch engine, to be performed in parallel with execution of the application and in accordance with the data stored in the prefetch engine control register. The prefetch engine control register may store multiple entries, each including an identifier of a given operation to prefetch multiple cache lines. An instruction may also be received to disable prefetching of multiple cache lines. The multiple cache lines may be prefetched from a last-level cache (LLC) to a mid-level cache.
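The two-instruction flow in the abstract can be illustrated with a small software model. This is only a sketch, not the patented hardware: the class and method names, the 64-byte line size, the choice to count the amount in cache lines (including the line named by the second instruction), and the single-line fallback when nothing has been configured are all assumptions made for illustration.

```python
from dataclasses import dataclass

CACHE_LINE_BYTES = 64  # assumed line size; the source does not state one


@dataclass
class PrefetchControlRegister:
    """Hypothetical model of the prefetch engine control register."""
    amount_lines: int = 0   # amount of prefetching, counted here in cache lines
    stride_lines: int = 0   # distance between consecutive prefetched lines
    enabled: bool = False   # False until a configuring instruction arrives


class ProgrammablePrefetcher:
    """Toy model: addresses are integers, the mid-level cache is a set."""

    def __init__(self):
        self.control = PrefetchControlRegister()
        self.mid_level_cache = set()  # line addresses pulled in from the LLC

    def configure(self, amount_lines, stride_lines):
        """First instruction: store amount and stride in the control register."""
        self.control = PrefetchControlRegister(amount_lines, stride_lines, True)

    def prefetch(self, address):
        """Second instruction: names one line but triggers the whole run."""
        line = address - address % CACHE_LINE_BYTES  # align to a line boundary
        if not self.control.enabled:
            # Assumption: without configuration, only the named line is fetched.
            fetched = [line]
        else:
            step = self.control.stride_lines * CACHE_LINE_BYTES
            fetched = [line + i * step for i in range(self.control.amount_lines)]
        self.mid_level_cache.update(fetched)
        return fetched
```

In this model, calling `configure(4, 2)` and then `prefetch(0x1010)` brings in four lines starting at `0x1000`, each two lines (128 bytes) apart. In the described hardware the run would proceed in parallel with the application; the model simply returns the list of lines at once.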
Opening claim text (preview).
What is claimed is:
1. A processor, comprising: a programmable memory prefetcher comprising: a programmable hardware prefetch engine; and a prefetch engine control register; wherein the programmable memory prefetcher comprises circuitry and is configured to: receive, during execution of an application on the processor, a first instruction executable to configure the programmable hardware prefetch engine for prefetching multiple cache lines to be accessed in the future, at locations addressable in a predictable pattern, by the application; store, in the prefetch engine control register, dependent on information included in the first instruction, data representing an amount of prefetching to be performed and data representing a stride distance between consecutive cache lines to be prefetched; receive, during execution of the application, a second instruction executable to prefetch a single cache line whose location is identified by a parameter of the second instruction; and initiate, in response to receiving the second instruction, prefetching of multiple cache lines by the programmable hardware prefetch engine from a last-level cache to a mid-level cache in the processor, the prefetching to be performed in parallel with execution of the application and in accordance with the data stored in the prefetch engine control register.
2. The processor of claim 1, wherein the first instruction comprises a write request that targets the prefetch engine control register.
3. The processor of claim 1, wherein: the prefetch engine control register includes multiple entries, each of which includes an identifier of a respective operation to prefetch, by the programmable hardware prefetch engine, multiple cache lines; the first instruction specifies an identifier of a given operation to prefetch, by the programmable hardware prefetch engine, multiple cache lines; and the programmable memory prefetcher is further configured to: store the data representing an amount of prefetching to be performed, the data representing a stride distance between consecutive ones of the multiple cache lines to be prefetched, and the identifier of the given operation in one of the entries in the prefetch engine control register.
4. The processor of claim 1, wherein the programmable memory prefetcher is further configured to: store, in the prefetch engine control register, dependent on information included in the first instruction, data representing a number of execution cycles for which to wait between prefetching consecutive ones of the multiple cache lines.
5. The processor of claim 1, wherein the programmable hardware prefetch engine comprises a hardware state machine to prefetch the multiple cache lines asynchronously and in parallel with the execution of the application.
6. The processor of claim 1, wherein the programmable memory prefetcher is further configured to: receive a third instruction executable to disable the prefetching of the multiple cache lines by the programmable hardware prefetch engine; and store, in the prefetch engine control register in response to receiving the third instruction, data indicating that the programmable hardware prefetch engine is no longer enabled to perform the prefetching of the multiple cache lines.
7. The processor of claim 1, wherein the prefetching is to be performed by iteratively prefetching a next cache line at a distance from a previously prefetched cache line specified by the stride distance until the amount of data prefetched matches the amount of prefetching to be performed.
8. A method comprising, in a processor: receiving, during execution of an application on the processor, a first instruction for configuring a programmable hardware prefetch engine for prefetching multiple cache lines to be accessed in the future, at locations addressable in a predictable pattern, by the application; storing, in a prefetch engine control register, dependent on information included in the first instruction, data representing an amount of prefetching to be performed and data representing a stride distance between consecutive cache lines to be prefetched; receiving, during execution of the application, a second instruction for prefetching a single cache line whose location is identified by a parameter of the second instruction; and initiating, in response to receiving the second instruction, prefetching of multiple cache lines by the programmable hardware prefetch engine from a last-level cache to a mid-level cache in the processor, the prefetching to be performed in parallel with execution of the application and in accordance with the data stored in the prefetch engine control register.
9. The method of claim 8, wherein: the prefetch engine control register includes multiple entries, each of which includes an identifier of a respective operation to prefetch, by the programmable hardware prefetch engine, multiple cache lines; the first instruction specifies an identifier of a given operation to prefetch, by the programmable hardware prefetch engine, multiple cache lines; and the method further includes: storing the data representing an amount of prefetching to be performed, the data representing a stride distance between consecutive ones of the multiple cache lines to be prefetched, and the identifier of the given operation in one of the entries, in the prefetch engine control register.
10. The method of claim 8, further comprising: storing, in the prefetch engine control register, dependent on information included in the first instruction, data representing a number of execution cycles for which to wait between prefetching consecutive ones of the multiple cache lines.
11. The method of claim 8, wherein: the programmable hardware prefetch engine comprises a hardware state machine; and the method further comprises prefetching, by the hardware state machine, the multiple cache lines asynchronously and in parallel with the execution of the application.
12. The method of claim 8, further comprising: receiving a third instruction for disabling the prefetching of the multiple cache lines by the programmable hardware prefetch engine; storing, in the prefetch engine control register in response to receiving the third instruction, data indicating that the programmable hardware prefetch engine is no longer enabled to perform the prefetching of the multiple cache lines.
13. A system, comprising: a processor, comprising a programmable memory prefetcher that includes: a programmable hardware prefetch engine; and a prefetch engine control register; and a memory storing program instructions that, when executed by the processor, implement an application, the program instructions comprising: a first instruction executable to configure the programmable hardware prefetch engine for prefetching multiple cache lines to be accessed in the future, at locations addressable in a predictable pattern, by the application; and a second instruction executable to prefetch a single cache line whose location is identified by a parameter of the first instruction; wherein the programmable memory prefetcher comprises circuitry and is configured to: receive, during execution of the application, the first instruction; store, in the prefetch engine control register, dependent on information included in the first instruction, data representing an amount of prefetching to be performed and data representing a stride distance between consecutive cache lines to be prefetched; receive, during execution of the application, the second instruction; and initiate, in response to receiving the second instruction, prefetching of multiple cac
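The dependent claims add per-operation entries, a wait between consecutive prefetches, a disable instruction, and an iterative stride loop. The sketch below models those features together under stated assumptions: all names are hypothetical, the line size is assumed to be 64 bytes, the amount is counted in cache lines, the wait is modeled as the spacing between issue cycles, and the trigger is assumed to name the operation identifier (the claim text shown here does not say how an entry is selected).

```python
from dataclasses import dataclass

CACHE_LINE_BYTES = 64  # assumed line size; not stated in the source


@dataclass
class PrefetchEntry:
    """One hypothetical control-register entry for one prefetch operation."""
    amount_lines: int
    stride_lines: int
    wait_cycles: int
    enabled: bool = True


class PrefetchControlTable:
    """Toy model of a control register holding several entries by identifier."""

    def __init__(self):
        self.entries = {}

    def configure(self, op_id, amount_lines, stride_lines, wait_cycles=0):
        """First instruction: fill the entry for the given operation identifier."""
        self.entries[op_id] = PrefetchEntry(amount_lines, stride_lines, wait_cycles)

    def disable(self, op_id):
        """Third instruction: mark multi-line prefetching as no longer enabled."""
        self.entries[op_id].enabled = False

    def schedule(self, op_id, address, start_cycle=0):
        """Second instruction: return the (cycle, line_address) pairs to issue."""
        line = address - address % CACHE_LINE_BYTES
        entry = self.entries.get(op_id)
        if entry is None or not entry.enabled:
            # Assumption: fall back to prefetching only the named line.
            return [(start_cycle, line)]
        issued = []
        cycle = start_cycle
        # Iterate stride by stride until the configured amount has been issued.
        while len(issued) < entry.amount_lines:
            issued.append((cycle, line))
            line += entry.stride_lines * CACHE_LINE_BYTES
            cycle += entry.wait_cycles
        return issued
```

With `configure(1, 3, 1, 10)`, a call to `schedule(1, 0x2000)` yields three consecutive lines issued ten cycles apart; after `disable(1)` the same call yields only the named line. A real state machine would issue these asynchronously rather than return a list.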
Prefetch instructions; cache control instructions · CPC title
Latency reduction · CPC title
with prefetch · CPC title
Prefetching based on access pattern detection, e.g. stride based prefetch · CPC title
with two or more cache hierarchy levels (with multilevel cache hierarchies G06F12/0811) · CPC title