Memory prefetching in multiple GPU environment

US11861759B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11861759-B2
Application number	US-202217580352-A
Country	US
Kind code	B2
Filing date	Jan 20, 2022
Priority date	Mar 15, 2019
Publication date	Jan 2, 2024
Grant date	Jan 2, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments are generally directed to memory prefetching in multiple GPU environment. An embodiment of an apparatus includes multiple processors including a host processor and multiple graphics processing units (GPUs) to process data, each of the GPUs including a prefetcher and a cache; and a memory for storage of data, the memory including a plurality of memory elements, wherein the prefetcher of each of the GPUs is to prefetch data from the memory to the cache of the GPU; and wherein the prefetcher of a GPU is prohibited from prefetching from a page that is not owned by the GPU or by the host processor.
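The ownership restriction in the abstract can be illustrated with a small sketch. This is not the patented implementation. The `Owner` enum, the `PageTable` class, the 4 KiB page size, and the choice to deny prefetches to unmapped pages are all assumptions made for illustration. The sketch only shows the rule that a GPU's prefetcher may touch a page when the page is owned by that GPU or by the host processor.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, Optional

PAGE_SIZE = 4096  # assumed page size; the abstract does not specify one


class Owner(Enum):
    """Hypothetical set of processors that can own a page."""
    HOST = auto()
    GPU0 = auto()
    GPU1 = auto()


@dataclass
class PageTable:
    """Hypothetical ownership map: page number -> owning processor."""
    owners: Dict[int, Owner] = field(default_factory=dict)

    def owner_of(self, address: int) -> Optional[Owner]:
        # Returns None when the page has no recorded owner.
        return self.owners.get(address // PAGE_SIZE)


def may_prefetch(gpu: Owner, address: int, table: PageTable) -> bool:
    """Allow a prefetch only if the page is owned by this GPU or by the host.

    Unmapped pages are denied; that is an assumption of this sketch.
    """
    owner = table.owner_of(address)
    return owner is gpu or owner is Owner.HOST


table = PageTable({0: Owner.GPU0, 1: Owner.GPU1, 2: Owner.HOST})
own_page = may_prefetch(Owner.GPU0, 0x0000, table)    # page 0, owned by GPU0
peer_page = may_prefetch(Owner.GPU0, 0x1000, table)   # page 1, owned by GPU1
host_page = may_prefetch(Owner.GPU0, 0x2000, table)   # page 2, owned by host
```

In this sketch, GPU0 is allowed to prefetch from its own page and from the host-owned page, and is refused the page owned by GPU1.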

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: a plurality of processors including a host processor and a plurality of graphics processing units (GPUs) to process data including at least a first graphics processing unit (GPU), each of the plurality of GPUs including a prefetcher and one or more caches; and a memory for storage of data; wherein the prefetcher of each of the plurality of GPUs is to prefetch data from the memory to a cache of the respective GPU; wherein a prefetch operation by the first GPU includes the prefetcher of the first GPU issuing a gather/scatter prefetch message including a plurality of prefetch addresses; and wherein the plurality of processors are to parse the gather/scatter prefetch message and issue a prefetch message for each of the plurality of prefetch addresses.

2. The apparatus of claim 1, wherein the gather/scatter prefetch message includes an entry for each of the plurality of prefetch addresses, the entry to indicate a cache level for prefetching.

3. The apparatus of claim 2, wherein the gather/scatter prefetch message includes a plurality of different cache levels within the gather/scatter prefetch message.

4. The apparatus of claim 1, wherein the plurality of prefetch addresses includes noncontiguous addresses.

5. The apparatus of claim 1, wherein the prefetcher of the first GPU is to send a notification to a thread in a core of the first GPU when a prefetch for the thread is complete.

6. The apparatus of claim 1, wherein the prefetchers of the plurality of GPUs are to prefetch data from the memory to a cache of each respective GPU in execution of a multi-GPU workload.

7. One or more non-transitory computer-readable storage mediums having stored thereon executable computer program instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: processing a workload in a computing system including a plurality of processors, the plurality of processors including a host processor and a plurality of graphics processing units (GPUs), each of the plurality of GPUs including a prefetcher and one or more caches; prefetching data by a first graphics processing unit (GPU) of the plurality of GPUs from a memory of the computing system to a cache of the first GPU wherein prefetching data by the first GPU includes the prefetcher of the first GPU issuing a gather/scatter prefetch message including a plurality of prefetch addresses; and parsing the gather/scatter prefetch message and issuing a prefetch message for each of the plurality of prefetch addresses.

8. The one or more computer-readable storage mediums of claim 7, wherein the gather/scatter prefetch message includes an entry for each of the plurality of prefetch addresses, the entry to indicate a cache level for prefetching.

9. The one or more computer-readable storage mediums of claim 8, wherein the gather/scatter prefetch message includes a plurality of different cache levels within the gather/scatter prefetch message.

10. The one or more computer-readable storage mediums of claim 7, wherein the plurality of prefetch addresses includes noncontiguous addresses.

11. The one or more computer-readable storage mediums of claim 7, wherein the instructions further include instructions for: sending a notification to a thread in a core of the first GPU when a prefetch for the thread is complete.

12. The one or more computer-readable storage mediums of claim 7, wherein the workload is a multi-GPU workload, and wherein the prefetchers of the plurality of GPUs are to prefetch data from the memory to a cache of each respective GPU in execution of the multi-GPU workload.

13. A method comprising: processing a workload in a computing system including a plurality of processors, the plurality of processors including a host processor and a plurality of graphics processing units (GPUs), each of the plurality of GPUs including a prefetcher and one or more caches; prefetching data by a first graphics processing unit (GPU) of the plurality of GPUs from a memory of the computing system to a cache of the first GPU wherein prefetching data by the first GPU includes the prefetcher of the first GPU issuing a gather/scatter prefetch message including a plurality of prefetch addresses; and parsing the gather/scatter prefetch message and issuing a prefetch message for each of the plurality of prefetch addresses.

14. The method of claim 13, wherein the gather/scatter prefetch message includes an entry for each of the plurality of prefetch addresses, the entry to indicate a cache level for prefetching.

15. The method of claim 14, wherein the gather/scatter prefetch message includes a plurality of different cache levels within the gather/scatter prefetch message.

16. The method of claim 13, wherein the plurality of prefetch addresses includes noncontiguous addresses.

17. The method of claim 13, further comprising: sending a notification to a thread in a core of the first GPU when a prefetch for the thread is complete.

18. The method of claim 13, wherein the workload is a multi-GPU workload, and wherein the prefetchers of the plurality of GPUs are to prefetch data from the memory to a cache of each respective GPU in execution of the multi-GPU workload.
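The gather/scatter mechanism recited in the independent claims can be pictured with a short sketch. It is an illustration under stated assumptions, not the claimed hardware: the class names `PrefetchEntry`, `GatherScatterPrefetch`, and `PrefetchMessage`, the integer encoding of cache levels, and the function `parse_gather_scatter` are all hypothetical. The sketch shows one batched message carrying several noncontiguous addresses, each entry tagged with its own cache level, being expanded into one prefetch message per address.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PrefetchEntry:
    """One entry of the batched message: an address plus a target cache level."""
    address: int
    cache_level: int  # assumed encoding: 1 = L1, 2 = L2, 3 = L3


@dataclass(frozen=True)
class GatherScatterPrefetch:
    """Hypothetical batched message holding a plurality of prefetch addresses."""
    entries: Tuple[PrefetchEntry, ...]


@dataclass(frozen=True)
class PrefetchMessage:
    """Hypothetical single-address prefetch message produced by parsing."""
    address: int
    cache_level: int


def parse_gather_scatter(msg: GatherScatterPrefetch) -> List[PrefetchMessage]:
    """Expand the batched message into one prefetch message per address."""
    return [PrefetchMessage(e.address, e.cache_level) for e in msg.entries]


# Noncontiguous addresses with different cache levels in a single message.
batch = GatherScatterPrefetch((
    PrefetchEntry(0x1000, 1),
    PrefetchEntry(0x8040, 2),
    PrefetchEntry(0x20000, 3),
))
messages = parse_gather_scatter(batch)
```

Here `messages` holds three single-address prefetch messages in the same order as the entries of `batch`, each keeping the cache level of its entry.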

Assignees

Intel Corp

Inventors

Classifications

G06T1/20 · Primary
Processor architectures; Processor configuration, e.g. pipelining · CPC title
G06F9/3802
Instruction prefetching · CPC title
G06F9/3877
using a secondary processor, e.g. coprocessor (peripheral processor G06F13/12) · CPC title
G06T1/60
Memory management · CPC title
G06T15/005
General purpose rendering architectures · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 69845533

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11861759B2 cover?: Embodiments are generally directed to memory prefetching in multiple GPU environment. An embodiment of an apparatus includes multiple processors including a host processor and multiple graphics processing units (GPUs) to process data, each of the GPUs including a prefetcher and a cache; and a memory for storage of data, the memory including a plurality of memory elements, wherein the prefetcher…
Who is the assignee on this patent?: Intel Corp
What technology area does this patent fall under?: Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?: Publication date Jan 2, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).