Synthetic Grouping of Processing Tasks
US-2024403111-A1 · Dec 5, 2024 · US
US10169091B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10169091-B2 |
| Application number | US-201213660799-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 25, 2012 |
| Priority date | Oct 25, 2012 |
| Publication date | Jan 1, 2019 |
| Grant date | Jan 1, 2019 |
Official abstract text for this publication.
A technique for simultaneously executing multiple tasks, each having an independent virtual address space, involves assigning an address space identifier (ASID) to each task and constructing each virtual memory access request to include both a virtual address and the ASID. During virtual to physical address translation, the ASID selects a corresponding page table, which includes virtual to physical address mappings for the ASID and associated task. Entries for a translation look-aside buffer (TLB) include both the virtual address and ASID to complete each mapping to a physical address. Deep scheduling of tasks sharing a virtual address space may be implemented to improve cache affinity for both TLB and data caches.
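As an illustration of the ASID-tagged translation described in the abstract, the following is a minimal software sketch, not the patented hardware. The class name `TaggedTLB`, the 4 KiB `PAGE_SIZE`, and the dictionary-based page tables are all assumptions introduced for the example. It shows how keying TLB entries on both the ASID and the virtual page number lets tasks with different address spaces share one TLB without aliasing.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size; the abstract does not specify one


@dataclass
class TaggedTLB:
    """Hypothetical model of a TLB whose entries are tagged with an ASID."""
    page_tables: dict  # asid -> {virtual page number: physical frame number}
    entries: dict = field(default_factory=dict)  # (asid, vpn) -> pfn
    hits: int = 0
    misses: int = 0

    def translate(self, asid, vaddr):
        """Translate a virtual address issued by the task that owns `asid`."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        key = (asid, vpn)
        if key in self.entries:
            self.hits += 1
        else:
            # On a miss, the ASID selects which page table to walk.
            # A missing mapping raises KeyError, standing in for a page fault.
            self.misses += 1
            self.entries[key] = self.page_tables[asid][vpn]
        return self.entries[key] * PAGE_SIZE + offset

    def invalidate(self, asid):
        """Drop every cached entry that belongs to one address space."""
        self.entries = {k: v for k, v in self.entries.items() if k[0] != asid}


tlb = TaggedTLB(page_tables={1: {0: 7}, 2: {0: 9}})
print(hex(tlb.translate(1, 0x10)))  # same virtual address, ASID 1
print(hex(tlb.translate(2, 0x10)))  # same virtual address, ASID 2
```

Both calls use the same virtual address but resolve to different physical frames because the ASID is part of the lookup key. The `invalidate` method loosely mirrors the per-context TLB invalidation mentioned in claim 8.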
Opening claim text (preview).
What is claimed is:
1. A method for scheduling tasks for execution in a parallel processor comprising two or more streaming multiprocessors, the method comprising: receiving a set of tasks associated with a first processing context related to a first page table included in a plurality of page tables; selecting a first task that is associated with a first address space identifier (ASID) from the set of tasks and associated with the first processing context; determining a minimum number of streaming multiprocessors included in the two or more streaming multiprocessors able to execute the tasks included in the set of tasks based on a number of tasks each streaming multiprocessor is able to execute concurrently, wherein the minimum number of streaming multiprocessors includes at least a first streaming multiprocessor; assigning the tasks included in the set of tasks to the minimum number of streaming multiprocessors; selecting the first streaming multiprocessor from the two or more streaming multiprocessors to execute the first task; scheduling the first task to execute on the first streaming multiprocessor; selecting a second task that is associated with a second ASID from the set of tasks and associated with the first processing context; and scheduling the second task to execute on the first streaming multiprocessor, wherein scheduling the second task occurs prior to scheduling any other task from the set of tasks to execute on a second streaming multiprocessor included in the two or more streaming multiprocessors.
2. The method of claim 1, wherein selecting the first streaming multiprocessor comprises identifying that the first streaming multiprocessor has previously been assigned a task included in the set of tasks associated with the first processing context, which establishes that the first streaming multiprocessor has an affinity to the first processing context.
3. The method of claim 2, wherein selecting the first streaming multiprocessor minimizes a maximum prevailing workload for all streaming multiprocessors executing tasks associated with the first processing context.
4. The method of claim 1, wherein the first task comprises a thread grid.
5. The method of claim 1, wherein the first page table includes virtual address to physical address mappings associated with a first virtual address space corresponding to the first processing context, and a second page table includes virtual address to physical address mappings associated with a second virtual address space corresponding to the first processing context.
6. The method of claim 1, wherein the first page table and a second page table are included in the plurality of page tables, and each page table included in the plurality of page tables includes virtual address to physical address mappings associated with a different virtual address space.
7. The method of claim 1, further comprising: receiving a bind command from a front end context switch; and in response, associating the first page table with the first ASID.
8. The method of claim 1, further comprising: receiving a bind command from a front end context switch; and in response, invalidating one or more entries in a first translation lookaside buffer (TLB) that are associated with the first context.
9. The method of claim 1, wherein the first task is associated with a first thread program executing on the first streaming multiprocessor and the second task is associated with a second thread program executing on the first streaming multiprocessor.
10. The method of claim 9, wherein the first streaming multiprocessor simultaneously executes the first thread program and the second thread program within the first processing context.
11. A non-transitory computer-readable storage medium including instructions that, when executed by a processing unit, cause the processing unit to schedule tasks for execution on a first streaming multiprocessor unit, the method comprising: receiving a set of tasks associated with a first processing context related to a first page table included in a plurality of page tables; selecting a first task that is associated with a first address space identifier (ASID) from the set of tasks and associated with the first processing context; determining a minimum number of streaming multiprocessors included in the two or more streaming multiprocessors able to execute the tasks included in the set of tasks based on a number of tasks each streaming multiprocessor is able to execute concurrently, wherein the minimum number of streaming multiprocessors includes at least a first streaming multiprocessor; assigning the tasks included in the set of tasks to the minimum number of streaming multiprocessors; selecting the first streaming multiprocessor from the two or more streaming multiprocessors to execute the first task; scheduling the first task to execute on the first streaming multiprocessor; selecting a second task that is associated with a second ASID from the set of tasks and associated with the first processing context; and scheduling the second task to execute on the first streaming multiprocessor, wherein scheduling the second task occurs prior to scheduling any other task from the set of tasks to execute on a second streaming multiprocessor included in the two or more streaming multiprocessors.
12. The computer-readable storage medium of claim 11, wherein selecting the first streaming multiprocessor comprises identifying that the first streaming multiprocessor has previously been assigned a task associated with the first processing context, which establishes that the first streaming multiprocessor has an affinity to the first processing context.
13. The computer-readable storage medium of claim 12, wherein selecting the first streaming multiprocessor minimizes a maximum prevailing workload for all streaming multiprocessors executing tasks associated with the first processing context.
14. The computer-readable storage medium of claim 11, wherein the first task comprises a thread grid.
15. The computer-readable storage medium of claim 11, wherein selecting the first streaming multiprocessor maximizes at least one of a translation lookaside buffer (TLB) cache affinity and a data cache affinity relative to the tasks included in the set of tasks.
16. The computer-readable storage medium of claim 11, further comprising determining that the tasks included in the set of tasks are associated with a first number of different ASIDs, and in response, determining that the tasks included in the set of tasks should be assigned to the minimum number of streaming multiprocessors included in the two or more streaming multiprocessors.
17. A computing device, comprising: a central processing unit that executes a process having a first processing context; and a parallel processing subunit coupled to the central processing unit, comprising: a subsystem that includes a streaming multiprocessor that: receives a set of tasks associated with a first processing context related to a first page table included in a plurality of page tables; selects a first task that is associated with a first address space identifier (ASID) from the set of tasks and associated with the first processing context; determines a minimum number of streaming multiprocessors included in the two or more streaming multiprocessors able to execute the tasks included in the set of tasks based on a number of tasks each streaming multiprocessor is able to execute concurrently, wherein the minimum number of streaming multiprocessors includes at least a first streaming multiprocessor
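The scheduling steps recited in claims 1 and 11 can be pictured with a small sketch. This is one possible reading, not the claimed implementation: the function name `deep_schedule` is hypothetical, and computing the minimum number of streaming multiprocessors by ceiling division is an assumption, since the claims say only that the minimum is based on how many tasks each streaming multiprocessor can execute concurrently. The workload balancing of claims 3 and 13 is not modeled.

```python
import math


def deep_schedule(tasks, num_sms, tasks_per_sm):
    """Fill each streaming multiprocessor (SM) before using the next one.

    tasks        -- tasks that share one processing context
    num_sms      -- number of SMs available in the parallel processor
    tasks_per_sm -- tasks each SM can execute concurrently
    Returns a dict mapping SM index -> list of tasks assigned to that SM.
    """
    # Assumed rule: the fewest SMs able to hold every task at once.
    needed = math.ceil(len(tasks) / tasks_per_sm)
    if needed > num_sms:
        raise ValueError("not enough streaming multiprocessors for this task set")

    assignment = {sm: [] for sm in range(needed)}
    for i, task in enumerate(tasks):
        # The second task lands on the same SM as the first before any
        # task is placed on a second SM (depth-first filling).
        assignment[i // tasks_per_sm].append(task)
    return assignment


print(deep_schedule(["t0", "t1", "t2"], num_sms=4, tasks_per_sm=2))
```

With three tasks and two concurrent tasks per SM, the sketch uses two of the four SMs and places the first two tasks together, which is the grouping the abstract ties to improved TLB and data cache affinity.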
the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title
Program initiating; Program switching, e.g. by interrupt · CPC title
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title
Hypervisor-specific management and integration aspects · CPC title
Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines · CPC title