Assistance for hardware prefetch in cache access

US11934342B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11934342-B2
Application number  US-202017429277-A
Country             US
Kind code           B2
Filing date         Mar 14, 2020
Priority date       Mar 15, 2019
Publication date    Mar 19, 2024
Grant date          Mar 19, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments are generally directed to graphics processor data access and sharing. An embodiment of an apparatus includes a circuit element to produce a result in processing of an application; a load-store unit to receive the result and generate pre-fetch information for a cache utilizing the result; and a prefetch generator to produce prefetch addresses based at least in part on the pre-fetch information; wherein the load-store unit is to receive software assistance for prefetching, and wherein generation of the pre-fetch information is based at least in part on the software assistance.
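
The data flow in the abstract can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the patented hardware: the names `PrefetchInfo`, `load_store_unit`, and `prefetch_generator`, the 64-byte cache line, and the treatment of the circuit-element result as a byte address are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

CACHE_LINE_BYTES = 64  # assumed line size; the abstract does not state one


@dataclass
class PrefetchInfo:
    """Hypothetical record passed from the load-store unit to the prefetch generator."""
    base_address: int
    stride_bytes: int
    count: int


def load_store_unit(alu_result: int, software_stride: Optional[int], count: int = 4) -> PrefetchInfo:
    """Combine a circuit-element result with an optional software hint.

    alu_result is treated as a computed byte address (an assumption).
    Without a hint, the sketch falls back to a one-cache-line stride.
    """
    stride = software_stride if software_stride is not None else CACHE_LINE_BYTES
    return PrefetchInfo(base_address=alu_result, stride_bytes=stride, count=count)


def prefetch_generator(info: PrefetchInfo) -> List[int]:
    """Produce cache-line-aligned prefetch addresses from the prefetch information."""
    addresses: List[int] = []
    for i in range(1, info.count + 1):
        addr = info.base_address + i * info.stride_bytes
        aligned = addr - (addr % CACHE_LINE_BYTES)  # round down to the line start
        if aligned not in addresses:  # skip duplicates when the stride is under one line
            addresses.append(aligned)
    return addresses
```

With a software-suggested stride of 256 bytes and a result of 0x1000, the sketch yields the addresses 0x1100, 0x1200, 0x1300, and 0x1400; with no hint it yields the next four sequential cache lines.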

First claim

Opening claim text (preview).

What is claimed is:
1. An apparatus comprising: a hardware apparatus for data processing, the hardware apparatus including: a circuit element to produce one or more results in processing of one or more applications; a load-store unit to receive the one or more results and generate prefetch information for a cache utilizing the one or more results; and a prefetch generator to produce prefetch addresses based at least in part on the generated prefetch information; wherein the load-store unit of the hardware apparatus is to receive software assistance in defining addresses for prefetching, and wherein generation of the prefetch information by the load-store unit is based at least in part on the received software assistance and the one or more results from the circuit element; and wherein the software assistance includes a suggested prefetch stride for a current application, the suggested prefetch stride being generated by the software based on data request patterns that are detected by the software for a current application.
2. The apparatus of claim 1, wherein the prefetch stride defines a range of addresses for a prefetch operation.
3. The apparatus of claim 1, wherein the suggested prefetch stride is modified dynamically by the software in response to changes in the detected data request patterns for the current application.
4. The apparatus of claim 1, wherein the apparatus includes a graphics processing unit (GPU), the hardware apparatus being a portion of the GPU.
5. The apparatus of claim 1, wherein the circuit element includes an arithmetic logic unit (ALU).
6. A method comprising: receiving, at a load-store unit of a hardware apparatus, a result produced by a circuit element of the hardware apparatus in processing of an application; receiving, at the load-store unit of the hardware apparatus, software assistance for defining addresses for prefetching; generating prefetch information by the load-store unit; and producing, by a prefetch generator of the hardware apparatus, prefetch addresses based at least in part on the generated prefetch information; wherein the prefetch information is generated by the load-store unit based at least in part on the result from the circuit element and the received software assistance; and wherein the software assistance includes a suggested prefetch stride, the suggested prefetch stride being generated by the software based on data request patterns that are detected by the software for a current application.
7. The method of claim 6, wherein the prefetch stride defines a range of addresses for a prefetch operation.
8. The method of claim 6, wherein the suggested prefetch stride is modified dynamically by the software in response to changes in the detected data request patterns for the current application.
9. The method of claim 6, wherein the apparatus includes a graphics processing unit (GPU), the hardware apparatus being a portion of the GPU.
10. The method of claim 6, wherein the circuit element includes an arithmetic logic unit (ALU).
11. One or more non-transitory computer-readable storage mediums having stored thereon executable computer program instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: receiving, at a load-store unit of a hardware apparatus, a result produced by a circuit element of the hardware apparatus in processing of an application; receiving, at the load-store unit of the hardware apparatus, software assistance for defining addresses for prefetching; generating prefetch information by the load-store unit; and producing, by a prefetch generator of the hardware apparatus, prefetch addresses based at least in part on the generated prefetch information; wherein the prefetch information is generated by the load-store unit based at least in part on the result from the circuit element and the received software assistance; and wherein the software assistance includes a suggested prefetch stride, the suggested prefetch stride being generated by the software based on data request patterns that are detected by the software for a current application.
12. The one or more computer-readable storage mediums of claim 11, wherein the prefetch stride defines a range of addresses for a prefetch operation.
13. The one or more computer-readable storage mediums of claim 11, wherein the suggested prefetch stride is modified dynamically by the software in response to changes in the detected data request patterns for the current application.
14. The one or more computer-readable storage mediums of claim 11, wherein the apparatus includes a graphics processing unit (GPU), the hardware apparatus being a portion of the GPU.
15. The one or more computer-readable storage mediums of claim 11, wherein the circuit element includes an arithmetic logic unit (ALU).
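
The claims describe software that detects data request patterns and suggests a prefetch stride, which may change as the patterns change. One way such software could work is sketched below, purely as an assumption: the claims do not specify a detection algorithm, so the `StrideAdvisor` class, its sliding window, and the majority-vote rule are hypothetical illustrations.

```python
from collections import Counter, deque
from typing import Deque, Optional


class StrideAdvisor:
    """Hypothetical software-side helper that suggests a prefetch stride."""

    def __init__(self, window: int = 8) -> None:
        # Sliding window of the most recent request addresses.
        self._recent: Deque[int] = deque(maxlen=window)

    def observe(self, address: int) -> None:
        """Record one data request made by the current application."""
        self._recent.append(address)

    def suggested_stride(self) -> Optional[int]:
        """Return the dominant nonzero address delta in the window, or None.

        A delta is "dominant" here if it accounts for more than half of the
        nonzero deltas; this threshold is an arbitrary choice for the sketch.
        """
        addrs = list(self._recent)
        deltas = [b - a for a, b in zip(addrs, addrs[1:]) if b != a]
        if not deltas:
            return None
        stride, hits = Counter(deltas).most_common(1)[0]
        return stride if hits * 2 > len(deltas) else None
```

Because the window slides, the suggestion follows the application: after requests at 0, 128, 256, and 384 a four-entry advisor suggests 128, and once later requests arrive 64 bytes apart the suggestion changes to 64.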

Assignees

Intel Corp

Inventors

Classifications

  • with memory

  • Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00)

  • Basic arithmetic logic units, i.e. devices selectable to perform either addition, subtraction or one of several logical operations, using, at least partially, the same circuitry

  • Random number generators, i.e. based on natural stochastic processes

  • Arithmetic instructions

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11934342B2 cover?
Embodiments are generally directed to graphics processor data access and sharing. An embodiment of an apparatus includes a circuit element to produce a result in processing of an application; a load-store unit to receive the result and generate pre-fetch information for a cache utilizing the result; and a prefetch generator to produce prefetch addresses based at least in part on the pre-fetch i…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F15/7839. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 19, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).