Prefetch stream allocation for multithreading systems

US10671394B2 · US · B2

Patent metadata
Publication number: US-10671394-B2
Application number: US-201816177273-A
Country: US
Kind code: B2
Filing date: Oct 31, 2018
Priority date: Oct 31, 2018
Publication date: Jun 2, 2020
Grant date: Jun 2, 2020

Abstract

Official abstract text for this publication.

A computer system for prefetching data in a multithreading environment includes a processor having a prefetching engine and a stride detector. The processor is configured to perform requesting data associated with a first thread of a plurality of threads, and prefetching requested data by the prefetching engine, where prefetching includes allocating a prefetch stream in response to an occurrence of a cache miss. The processor performs detecting each cache miss, and based on detecting the cache miss, monitoring the prefetching engine to detect subsequent cache misses and to detect one or more events related to allocations performed by the prefetching engine. The processor further performs, based on the stride detector detecting a selected number of events, directing the stride detector to switch from the first thread to a second thread by ignoring stride-1 allocations for the first thread and evaluating stride-1 allocations for potential strided accesses on the second thread.
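The thread-switching behavior described in the abstract can be illustrated with a minimal sketch. It assumes the "selected number of events" is simply a count of stride-1 allocations on the watched thread and that threads are visited in round-robin order; the class name, method name, and threshold value are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class StrideDetectorFocus:
    """Tracks which thread's stride-1 allocations the stride detector evaluates.

    Hypothetical model: the patent does not define this structure.
    """
    num_threads: int
    switch_threshold: int  # assumed stand-in for the "selected number of events"
    current_thread: int = 0
    events_seen: int = 0

    def on_stride1_allocation(self, thread_id: int) -> bool:
        """Return True if this allocation is evaluated for strided access."""
        if thread_id != self.current_thread:
            # Allocations from threads that are not being watched are ignored.
            return False
        self.events_seen += 1
        if self.events_seen >= self.switch_threshold:
            # Assumed policy: move to the next thread in round-robin order.
            self.current_thread = (self.current_thread + 1) % self.num_threads
            self.events_seen = 0
        return True


focus = StrideDetectorFocus(num_threads=2, switch_threshold=3)
evaluated = [focus.on_stride1_allocation(t) for t in [0, 1, 0, 0, 1, 0]]
print(evaluated)             # [True, False, True, True, True, False]
print(focus.current_thread)  # 1
```

In this sketch the detector watches thread 0 until it has seen three allocations, then starts ignoring thread 0 and evaluating thread 1. A real design would likely reset or adjust the count when a stride-N stream is successfully allocated, which the sketch leaves out.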

First claim

Opening claim text (preview).

What is claimed is:

1. A computer system for prefetching data in a multithreading environment, the system comprising: a memory including a local cache; a processor operatively connected to the local cache, the processor including a prefetching engine and a stride detector, the processor configured to perform: requesting, by the processor, data associated with a first thread of a plurality of threads; prefetching requested data from a cache structure into the local cache by the prefetching engine, wherein prefetching includes allocating a prefetch stream in response to an occurrence of a cache miss; detecting, by the stride detector, each cache miss; based on detecting the cache miss, monitoring the prefetching engine to detect subsequent cache misses and to detect one or more events related to allocations performed by the prefetching engine; and based on the stride detector detecting a selected number of events, directing the stride detector to switch from the first thread to a second thread by ignoring stride-1 allocations for the first thread and evaluating stride-1 allocations for potential strided accesses on the second thread.

2. The computer system of claim 1, wherein the prefetching engine is configured to direct the stride detector to switch between each of the plurality of threads in order according to a round-robin schedule.

3. The computer system of claim 1, wherein the selected number of events is a selected number of prefetch stream allocations.

4. The computer system of claim 1, wherein the prefetching engine is configured to support both stride-1 prefetching and stride-N prefetching.

5. The computer system of claim 4, wherein the selected number of events is a number of prefetch stream allocations without a successful stride-N prefetch stream allocation.

6. The computer system of claim 1, wherein the stride detector is connected to a stride detection table, the stride detection table including an entry for each cache miss detected by the stride detector during processing of the first thread, wherein a respective entry includes a subset of a cache structure address associated with a respective cache miss and a stride value or values relative to an address of an existing entry or entries, the stride value or values indicating a number of cachelines in the address structure between the address associated with the respective cache miss and the address in the existing entry or entries.

7. The computer system of claim 6, wherein the stride detector is configured to, in response to detecting a cache miss, add a new entry to the stride detection table, and calculate a stride value or values for the new entry in relation to the existing entry or entries in the stride detection table.

8. The computer system of claim 1, wherein the processor is configured to perform simultaneous multithreading (SMT).

9. A method of prefetching data in a multithreading environment, the method comprising: requesting, by a processor including a local cache, data associated with a first thread of a plurality of threads; prefetching requested data from a cache structure into the local cache by a prefetching engine, wherein prefetching includes allocating a prefetch stream in response to an occurrence of a cache miss; detecting, by a stride detector, the cache miss; based on detecting the cache miss, monitoring the prefetching engine to detect subsequent cache misses and to detect one or more events related to allocations performed by the prefetching engine; and based on the stride detector detecting a selected number of events, directing the stride detector to switch from the first thread to a second thread by ignoring stride-1 allocations for the first thread and evaluating stride-1 allocations for potential strided accesses on the second thread.

10. The method of claim 9, wherein the prefetching engine is configured to direct the stride detector to switch between each of the plurality of threads in order according to a round-robin schedule.

11. The method of claim 9, wherein the selected number of events is a selected number of prefetch stream allocations.

12. The method of claim 9, wherein the prefetching engine is configured to support both stride-1 prefetching and stride-N prefetching, and the selected number of events is a number of prefetch stream allocations without a successful stride-N prefetch stream allocation.

13. The method of claim 9, wherein the stride detector is connected to a stride detection table, the stride detection table including an entry for each cache miss detected by the stride detector during processing of the first thread, wherein a respective entry includes a subset of a cache structure address associated with a respective cache miss and a stride value or values relative to an address of an existing entry or entries, the stride value or values indicating a number of cachelines in the address structure between the address associated with the respective cache miss and the address in the existing entry or entries.

14. The method of claim 13, wherein the stride detector is configured to, in response to detecting a cache miss, add a new entry to the stride detection table, and calculate a stride value or values for the new entry in relation to the existing entry or entries in the stride detection table.

15. The method of claim 9, wherein the processor is configured to perform simultaneous multithreading (SMT).

16. A computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processing system to perform: requesting, by a processor including a local cache, data associated with a first thread of a plurality of threads; prefetching requested data from a cache structure into the local cache by a prefetching engine, wherein prefetching includes allocating a prefetch stream in response to an occurrence of a cache miss; detecting, by a stride detector, the cache miss; based on detecting the cache miss, monitoring the prefetching engine to detect subsequent cache misses and to detect one or more events related to allocations performed by the prefetching engine; and based on the stride detector detecting a selected number of events, directing the stride detector to switch from the first thread to a second thread by ignoring stride-1 allocations for the first thread and evaluating stride-1 allocations for potential strided accesses on the second thread.

17. The computer program product of claim 16, wherein the stride detector is configured to direct the prefetching engine to switch between each of the plurality of threads in order according to a round-robin schedule.

18. The computer program product of claim 16, wherein the selected number of events is a selected number of prefetch stream allocations.

19. The computer program product of claim 16, wherein the prefetching engine is configured to support both stride-1 prefetching and stride-N prefetching, and the selected number of events is a number of prefetch stream allocations without a successful stride-N prefetch stream allocation.

20. The computer program product of claim 16, wherein the stride detector is connected to a stride detection table, the stride detection table including a respective entry for each cache miss detected by the stride detector during processing of a thread, the respective entry including a subset of an address in the cache structure associated with a respective cache miss and a stride value or values re
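The stride detection table of claims 6 and 7 can be pictured with a small sketch. It is only an illustration under stated assumptions: the cacheline size, the table depth, and all names are hypothetical, and the stored "subset of the address" is simplified to the cacheline number.

```python
from collections import deque

CACHELINE_BYTES = 128  # assumed line size; the claims do not give one
TABLE_ENTRIES = 4      # assumed table depth; the claims do not give one


class StrideDetectionTable:
    """Hypothetical model of a table with one entry per detected cache miss."""

    def __init__(self, max_entries: int = TABLE_ENTRIES):
        # Each entry is (cacheline number, strides to the entries that
        # already existed when this entry was added). Oldest entries are
        # dropped once the table is full.
        self.entries = deque(maxlen=max_entries)

    def record_miss(self, address: int) -> list:
        """Add an entry for a miss and return its strides in cachelines."""
        line = address // CACHELINE_BYTES
        strides = [line - prev_line for prev_line, _ in self.entries]
        self.entries.append((line, strides))
        return strides


table = StrideDetectionTable()
print(table.record_miss(0 * CACHELINE_BYTES))  # []
print(table.record_miss(3 * CACHELINE_BYTES))  # [3]
print(table.record_miss(6 * CACHELINE_BYTES))  # [6, 3]
```

Here the third miss is 3 cachelines past the second, which in turn is 3 past the first; a repeated distance like this is the kind of pattern a detector might treat as a candidate stride-N stream. How the hardware actually confirms a stride is not covered by the claim text shown on this page.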

Assignees

Inventors

Classifications

  • Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

  • the resource being the memory

  • Operand prefetching (cache prefetching G06F12/0862)

  • Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00)

  • G06F9/3455 (Primary): using stride

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10671394B2 cover?
A computer system for prefetching data in a multithreading environment includes a processor having a prefetching engine and a stride detector. The processor is configured to perform requesting data associated with a first thread of a plurality of threads, and prefetching requested data by the prefetching engine, where prefetching includes allocating a prefetch stream in response to an occurrenc…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F9/3455. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 2, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).