Load store bank aware thread scheduling techniques

US12504989B2 · US · B2

Patent metadata
Publication number: US-12504989-B2
Application number: US-202217699992-A
Country: US
Kind code: B2
Filing date: Mar 21, 2022
Priority date: Mar 21, 2022
Publication date: Dec 23, 2025
Grant date: Dec 23, 2025

Abstract

Official abstract text for this publication.

Bank aware thread scheduling and early dependency clearing techniques are described herein. In one example, bank aware thread scheduling involves arbitrating and scheduling threads based on the cache bank that is to be accessed by the instructions to avoid bank conflicts. Early dependency clearing involves clearing dependencies for cache loads in a scoreboard before the data is loaded. In early dependency clearing for loads, delays in operation can be reduced by clearing dependencies before data is required from the cache.
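The bank aware selection described in the abstract can be illustrated with a minimal software sketch. This is not the patented hardware logic: the names `ReadyThread`, `bank_of`, and `select_threads`, the bank count, the line size, and the address-to-bank mapping are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Set

NUM_BANKS = 4      # assumed number of cache banks (not taken from the patent)
LINE_BYTES = 64    # assumed cache-line size in bytes


@dataclass
class ReadyThread:
    tid: int
    address: int   # address the thread's next load or store will touch


def bank_of(address: int) -> int:
    """Assumed mapping: consecutive cache lines are interleaved across banks."""
    return (address // LINE_BYTES) % NUM_BANKS


def select_threads(ready: List[ReadyThread], max_issue: int) -> List[ReadyThread]:
    """Walk the ready list in arbitration order and skip any thread whose
    target bank has already been claimed by an earlier pick."""
    claimed: Set[int] = set()
    picked: List[ReadyThread] = []
    for thread in ready:
        if len(picked) == max_issue:
            break
        bank = bank_of(thread.address)
        if bank in claimed:
            continue           # bank conflict: leave this thread for a later cycle
        claimed.add(bank)
        picked.append(thread)
    return picked
```

For example, if threads 0 and 1 both target bank 0 while threads 2 and 3 target banks 1 and 2, a three-wide selection takes threads 0, 2, and 3 and defers thread 1.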

First claim

Opening claim text (preview).

What is claimed is:

1. A graphics processing unit (GPU) comprising: a cache including multiple banks; and hardware logic to schedule threads to access the cache, including to: determine which banks of the cache are to be accessed by threads available for scheduling, select a plurality of the threads available for scheduling based on the banks to be accessed by the threads, including to select threads that are to access different banks of the cache, and schedule the selected threads for execution.

2. The GPU of claim 1, wherein the hardware logic is to: determine if two or more of the selected threads have a bank conflict with instructions to access a same bank of the cache; and in response to the bank conflict, select another thread that is to access a different bank.

3. The GPU of claim 1, wherein the hardware logic to select the plurality of the threads is to: select the plurality of the threads based on which pipeline is to be used by the threads, including to select the threads that are to access different pipelines.

4. The GPU of claim 3, wherein the hardware logic is to: determine if two or more of the selected threads have a pipeline conflict with instructions to be sent to a same pipeline; and in response to the pipeline conflict, select another thread to replace the thread with the pipeline conflict.

5. The GPU of claim 3, wherein: the pipelines include: an integer pipeline, a floating point pipeline, and an extended math pipeline.

6. The GPU of claim 3, wherein: a number of threads to be selected for scheduling is equal to a number of pipelines.

7. The GPU of claim 1, wherein the hardware logic to select a plurality of the threads is to: select instructions from the threads for scheduling only from instructions that do not have dependencies within threads or across threads.

8. The GPU of claim 1, wherein the hardware logic is to: schedule a first instruction for execution; and clear a dependency for a second instruction that is dependent on the first instruction in response to scheduling the first instruction before data is loaded for the first instruction.

9. The GPU of claim 8, wherein the hardware logic to clear the dependency is to: clear the dependency in a scoreboard in response to scheduling the first instruction before the data is loaded.

10. A system comprising: a memory device; and graphics processing unit (GPU) coupled with the memory device, the GPU including: a cache including multiple banks; and hardware logic to schedule threads to access the cache, including to: determine which banks of the cache are to be accessed by threads available for scheduling, select a plurality of the threads available for scheduling based on the banks to be accessed by the threads, including to select threads that are to access different banks of the cache, and schedule the selected threads for execution.

11. The system of claim 10, wherein the hardware logic is to: determine if two or more of the selected threads have a bank conflict with instructions to access a same bank of the cache; and in response to the bank conflict, select another thread that is to access a different bank.

12. The system of claim 10, wherein the hardware logic to select the plurality of the threads is to: select the plurality of the threads based on which pipeline is to be used by the threads, including to select the threads that are to access different pipelines.

13. The system of claim 12, wherein the hardware logic is to: determine if two or more of the selected threads have a pipeline conflict with instructions to be sent to a same pipeline; and in response to the pipeline conflict, select another thread to replace the thread with the pipeline conflict.

14. The system of claim 12, wherein: the pipelines include: an integer pipeline, a floating point pipeline, and an extended math pipeline.

15. The system of claim 12, wherein: a number of threads to be selected for scheduling is equal to a number of pipelines.

16. The system of claim 10, wherein the hardware logic to select a plurality of the threads is to: select instructions from the threads for scheduling only from instructions that do not have dependencies within threads or across threads.

17. The system of claim 10, wherein the hardware logic is to: schedule a first instruction for execution; and clear a dependency for a second instruction that is dependent on the first instruction in response to scheduling the first instruction before data is loaded for the first instruction.

18. The system of claim 17, wherein the hardware logic to clear the dependency is to: clear the dependency in a scoreboard in response to scheduling the first instruction before the data is loaded.

19. A method comprising: determining which banks of a cache are to be accessed by threads available for scheduling; selecting a plurality of the threads available for scheduling based on the banks to be accessed by the threads, including selecting threads that are to access different banks of the cache; and scheduling the selected threads for execution.

20. The method of claim 19, further comprising: determining if two or more of the selected threads have a bank conflict with instructions to access a same bank of the cache; and in response to the bank conflict, selecting another thread that is to access a different bank.
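As a rough illustration of how the dependent claims fit together, the sketch below combines the pipeline and bank constraints with scoreboard clearing at schedule time. It is a software model under stated assumptions, not the claimed hardware: `Instr`, `Scoreboard`, `select`, and `schedule` are hypothetical names, dependencies are tracked only within a thread, attaching a cache bank to an instruction that also carries one of the claimed pipeline labels is purely illustrative, and the model does not show what makes early clearing safe in a real pipeline.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Instr:
    tid: int
    pipeline: str                 # e.g. "int", "fp", "em" (labels follow the claims)
    bank: Optional[int] = None    # cache bank touched, None if no cache access
    dst: Optional[str] = None     # register written, if any
    srcs: Tuple[str, ...] = ()    # registers read


class Scoreboard:
    """Hypothetical scoreboard: the set of (thread, register) pairs still pending."""

    def __init__(self) -> None:
        self.pending: Set[Tuple[int, str]] = set()

    def mark(self, ins: Instr) -> None:
        if ins.dst is not None:
            self.pending.add((ins.tid, ins.dst))

    def clear(self, ins: Instr) -> None:
        if ins.dst is not None:
            self.pending.discard((ins.tid, ins.dst))

    def is_ready(self, ins: Instr) -> bool:
        return all((ins.tid, reg) not in self.pending for reg in ins.srcs)


def select(candidates: List[Instr], sb: Scoreboard) -> List[Instr]:
    """Pick at most one instruction per pipeline, with no two picks sharing a
    cache bank, considering only instructions with no pending dependency."""
    pipes: Set[str] = set()
    banks: Set[int] = set()
    picked: List[Instr] = []
    for ins in candidates:
        if not sb.is_ready(ins):
            continue                  # unresolved dependency
        if ins.pipeline in pipes:
            continue                  # pipeline conflict: try another thread
        if ins.bank is not None and ins.bank in banks:
            continue                  # bank conflict: try another thread
        pipes.add(ins.pipeline)
        if ins.bank is not None:
            banks.add(ins.bank)
        picked.append(ins)
    return picked


def schedule(picked: List[Instr], sb: Scoreboard, early_clear: bool) -> None:
    """Issue the picks. With early_clear, a cache access releases its dependents
    when it is scheduled, before its data has been loaded."""
    for ins in picked:
        if early_clear and ins.bank is not None:
            sb.clear(ins)
```

In this model, a load that writes `r1` is marked pending when it enters the candidate list. Once it is scheduled with `early_clear=True`, a later instruction in the same thread that reads `r1` becomes selectable in the next round; with `early_clear=False` it stays blocked until some other event (not modeled here) clears the entry.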

Assignees

Intel Corp

Classifications

  • to service a request · CPC title

  • Task transfer initiation or dispatching · CPC title

  • Program initiating; Program switching, e.g. by interrupt · CPC title

  • Partitioning or combining of resources · CPC title

  • Allocation of resources, e.g. of the central processing unit [CPU] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12504989B2 cover?
Bank aware thread scheduling and early dependency clearing techniques are described herein. In one example, bank aware thread scheduling involves arbitrating and scheduling threads based on the cache bank that is to be accessed by the instructions to avoid bank conflicts. Early dependency clearing involves clearing dependencies for cache loads in a scoreboard before the data is loaded. In ea…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/4881. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 23, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).