Last-level collective hardware prefetching

US11599470B2 · US · B2

Patent metadata
Field | Value
Publication number | US-11599470-B2
Application number | US-201917295797-A
Country | US
Kind code | B2
Filing date | Nov 6, 2019
Priority date | Nov 29, 2018
Publication date | Mar 7, 2023
Grant date | Mar 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A last-level collective hardware prefetcher (LLCHP) is described. The LLCHP is to detect a first off-chip memory access request by a first processor core of a plurality of processor cores. The LLCHP is further to determine, based on the first off-chip memory access request, that first data associated with the first off-chip memory access request is associated with second data of a second processor core of the plurality of processor cores. The LLCHP is further to prefetch the first data and the second data based on the determination.
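The flow in the abstract (detect an off-chip access, find associated data belonging to another core, prefetch both) can be illustrated with a small software model. This is a minimal sketch and not the patented hardware: the class `CollectivePrefetcher`, the `associations` table, the `associate` helper, and the use of a Python set to stand in for the last-level cache are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CollectivePrefetcher:
    """Toy software model of a collective prefetcher (hypothetical names)."""

    # Hypothetical table: address -> list of (other core id, associated address).
    associations: dict = field(default_factory=dict)
    # Stand-in for the last-level cache: the set of addresses already fetched.
    llc: set = field(default_factory=set)

    def associate(self, addr, other_core, other_addr):
        """Record that `addr` is associated with data of another core."""
        self.associations.setdefault(addr, []).append((other_core, other_addr))

    def on_off_chip_access(self, core, addr):
        """Return the addresses fetched for one off-chip access by `core`."""
        fetched = [addr]  # the requested ("first") data is always fetched
        for other_core, other_addr in self.associations.get(addr, []):
            if other_core != core:
                fetched.append(other_addr)  # associated ("second") data
        self.llc.update(fetched)
        return fetched


pf = CollectivePrefetcher()
pf.associate(0x1000, other_core=1, other_addr=0x2000)
print(pf.on_off_chip_access(core=0, addr=0x1000))  # [4096, 8192]
print(pf.on_off_chip_access(core=0, addr=0x3000))  # [12288]
```

In this model an access with a known association brings in both addresses at once, while an access with no association fetches only the requested address.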

First claim

Opening claim text (preview).

What is claimed is:

1. A multi-core computer processor comprising: a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture; and a hardware prefetcher operatively coupled to the plurality of processor cores and to a cache, wherein the hardware prefetcher is to: detect a first off-chip memory access request by a first processor core of the plurality of processor cores; determine, based on the first off-chip memory access request, that first data associated with the first off-chip memory access request is associated with second data of a second processor core of the plurality of processor cores; and prefetch the first data and the second data based on the determination, wherein to determine that the first data associated with the first off-chip memory access request is associated with the second data of the second processor core of the plurality of processor cores, the hardware prefetcher is to: determine that a stride entry exists for the first off-chip memory access request; and determine that a group exists for the stride entry.

2. The multi-core computer processor of claim 1, further comprising a last-level cache operatively coupled to the plurality of processor cores and to the hardware prefetcher, wherein to prefetch the first data and the second data the hardware prefetcher is to store the first data and the second data in the last-level cache.

3. The multi-core computer processor of claim 1, wherein to determine that the first data associated with the first off-chip memory access request is associated with the second data of the second processor core of the plurality of processor cores, the hardware prefetcher is further to determine that a confidence threshold, corresponding to a confidence level that the first data is associated with the second data, is greater than or equal to a threshold level.

4. The multi-core computer processor of claim 1, wherein to prefetch the first data and the second data based on the determination, the hardware prefetcher is further to prefetch the entire group associated with the stride entry.

5. A multi-core computer processor comprising: a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture; and a hardware prefetcher operatively coupled to the plurality of processor cores and to a cache, wherein the hardware prefetcher is to: detect a first off-chip memory access request by a first processor core of the plurality of processor cores; determine, based on the first off-chip memory access request, that first data associated with the first off-chip memory access request is associated with second data of a second processor core of the plurality of processor cores; prefetch the first data and the second data based on the determination; detect a second off-chip memory access request by the first processor core; determine, based on the second off-chip memory access request, that third data associated with the second off-chip memory access request is not associated with any additional data of the second processor core; and prefetch only the third data based on the determination.

6. The multi-core computer processor of claim 5, wherein to determine that the third data associated with the second memory access request is not associated with the any additional data of the second processor core, the hardware prefetcher is to: determine that a stride entry does not exist for the memory access request; and generate the stride entry.

7. The multi-core computer processor of claim 6, wherein to determine that the third data associated with the second memory access request is not associated with the any additional data of the second processor core, the hardware prefetcher is to determine that a group does not exist for the stride entry.

8. The multi-core computer processor of claim 5, wherein to determine that the third data associated with the second off-chip memory access request is not associated with the any additional data of the second processor core, the hardware prefetcher is further to determine that a confidence threshold, corresponding to a confidence level that the third data is associated with the any additional data, is less than a threshold level.

9. The multi-core computer processor of claim 8, wherein the hardware prefetcher is further to update the confidence threshold based on the determining that the confidence threshold is less than the threshold level.

10. A method of prefetching data in a multi-core computer processor, the method comprising: detecting, by a hardware prefetcher of the multi-core processor, a first off-chip memory access request by a first processor core of a plurality of processor cores of the multi-core processor; determining, based on the first off-chip memory access request, that first data associated with the first off-chip memory access request is associated with second data of a second processor core of the plurality of processor cores; and prefetching the first data and the second data based on the determination, wherein determining that the first data associated with the first off-chip memory access request is associated with the second data of the second processor core of the plurality of processor cores comprises: determining that a stride entry exists for the first off-chip memory access request; and determining that a group exists for the stride entry.

11. The method of claim 10, wherein prefetching the first data and the second data comprises storing the first data and the second data in a last-level cache of the multi-core processor.

12. The method of claim 10, wherein determining that the first data associated with the first off-chip memory access request is associated with the second data of the second processor core of the plurality of processor cores comprises determining that a confidence threshold, corresponding to a confidence level that the first data is associated with the second data, is greater than or equal to a threshold level.

13. The method of claim 10, wherein prefetching the first data and the second data based on the determination comprises prefetching the entire group associated with the stride entry.

14. A method of prefetching data in a multi-core computer processor, the method comprising: detecting, by a hardware prefetcher of the multi-core processor, a first off-chip memory access request by a first processor core of a plurality of processor cores of the multi-core processor; determining, based on the first off-chip memory access request, that first data associated with the first off-chip memory access request is associated with second data of a second processor core of the plurality of processor cores; prefetching the first data and the second data based on the determination; detecting a second off-chip memory access request by the first processor core; determining, based on the second off-chip memory access request, that third data associated with the second off-chip memory access request is not associated with any additional data of the second processor core; and prefetching only the third data based on the determination.

15. The method of claim 14, wherein determining that the third data associated with the second memory access request is not associated with the any additional data of the second processor core comprises: determining that a stride entry does not exist for the memory access request; and generating the stride entry.

16. The method of claim 15, wherein determining that the third data associated with the second memory access request is not associated with the any additional dat
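The decision sequence recited in the claims (does a stride entry exist, does a group exist for it, does the confidence meet the threshold) can be sketched as follows. This is an illustrative model under stated assumptions, not the claimed implementation: keying stride entries by core id, representing a group as a list of byte offsets supplied by the caller through `set_group`, the counter-based confidence update, and the default threshold of 2 are all hypothetical choices that the claim text does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class StrideEntry:
    last_addr: int
    stride: int = 0
    confidence: int = 0
    group: list = field(default_factory=list)  # hypothetical: byte offsets


class GroupStridePrefetcher:
    """Toy model of the stride-entry / group / confidence checks."""

    def __init__(self, threshold=2):
        self.threshold = threshold  # assumed value, not taken from the patent
        self.table = {}             # hypothetical: core id -> StrideEntry

    def set_group(self, core, offsets):
        """Attach a group to an existing stride entry (assumed mechanism)."""
        self.table[core].group = list(offsets)

    def access(self, core, addr):
        """Return the list of addresses to fetch for this off-chip access."""
        entry = self.table.get(core)
        if entry is None:
            # No stride entry exists: generate one, fetch only the request.
            self.table[core] = StrideEntry(last_addr=addr)
            return [addr]

        stride = addr - entry.last_addr
        matched = stride == entry.stride
        entry.last_addr = addr
        entry.stride = stride

        if not entry.group:
            # A stride entry exists but has no group: fetch only the request.
            return [addr]

        if matched and entry.confidence >= self.threshold:
            # Entry, group, and sufficient confidence: fetch the entire group.
            return [addr] + [addr + off for off in entry.group]

        # Confidence below threshold (or stride changed): update confidence
        # and fetch only the requested data.
        entry.confidence = entry.confidence + 1 if matched else 0
        return [addr]
```

Under these assumptions, a core that repeats the same stride accumulates confidence one access at a time, and the group is fetched only once the counter reaches the threshold; any change of stride resets the counter.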

Assignees

Univ California

Inventors

Classifications

  • with prefetch

  • Globally asynchronous, locally synchronous, e.g. network on chip

  • Energy efficient computing, e.g. low power processors, power management or thermal management

  • for main memory peripheral accesses (e.g. I/O or DMA)

  • Prefetching based on access pattern detection, e.g. stride based prefetch

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11599470B2 cover?
A last-level collective hardware prefetcher (LLCHP) is described. The LLCHP is to detect a first off-chip memory access request by a first processor core of a plurality of processor cores. The LLCHP is further to determine, based on the first off-chip memory access request, that first data associated with the first off-chip memory access request is associated with second data of a second proces…
Who is the assignee on this patent?
Univ California
What technology area does this patent fall under?
Primary CPC classification G06F12/0862. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 7, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).