Scheduling method and processing device using the same
US-2017139751-A1 · May 18, 2017 · US
US10452397B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10452397-B2 |
| Application number | US-201715477022-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 1, 2017 |
| Priority date | Apr 1, 2017 |
| Publication date | Oct 22, 2019 |
| Grant date | Oct 22, 2019 |
Official abstract text for this publication.
Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to determine a first number of threads to be scheduled for each context of a plurality of contexts in a multi-context processing system, allocate a second number of streaming multiprocessors (SMs) to the respective plurality of contexts, and dispatch threads from the plurality of contexts only to the streaming multiprocessor(s) allocated to the respective plurality of contexts. Other embodiments are also disclosed and claimed.
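The three steps in the abstract (count threads per context, allocate SMs by the thread ratio, dispatch only to a context's own SMs) can be illustrated with a minimal sketch. This is not the patented implementation: the function names, the largest-remainder rounding rule, the contiguous SM numbering, and the round-robin dispatch are all assumptions made for illustration, since the abstract does not specify how fractional shares are rounded or how threads are ordered within an SM group.

```python
def allocate_sms(threads_per_context, total_sms):
    """Split total_sms across contexts in proportion to their thread counts.

    Assumption: fractional shares are rounded with the largest-remainder
    rule; the source only says the split is based on the thread ratio.
    """
    total_threads = sum(threads_per_context.values())
    if total_threads == 0 or total_sms <= 0:
        return {ctx: 0 for ctx in threads_per_context}

    # Integer arithmetic: floor share and remainder for each context.
    alloc = {}
    remainder = {}
    for ctx, n in threads_per_context.items():
        alloc[ctx] = (n * total_sms) // total_threads
        remainder[ctx] = (n * total_sms) % total_threads

    # Hand out SMs left over after flooring, largest remainder first.
    leftover = total_sms - sum(alloc.values())
    for ctx in sorted(remainder, key=remainder.get, reverse=True)[:leftover]:
        alloc[ctx] += 1
    return alloc


def assign_sm_ids(alloc):
    """Give each context a contiguous, disjoint range of SM ids (hypothetical layout)."""
    sm_ids = {}
    next_id = 0
    for ctx, count in alloc.items():
        sm_ids[ctx] = list(range(next_id, next_id + count))
        next_id += count
    return sm_ids


def dispatch(ctx, thread_index, sm_ids):
    """Pick an SM for a thread, using only the SMs allocated to its context."""
    own = sm_ids[ctx]
    if not own:
        # A very small context can round down to zero SMs in this sketch.
        raise ValueError(f"context {ctx!r} has no SMs allocated")
    return own[thread_index % len(own)]


if __name__ == "__main__":
    threads = {"A": 600, "B": 300, "C": 100}
    alloc = allocate_sms(threads, total_sms=10)
    sm_ids = assign_sm_ids(alloc)
    print(alloc)
    print(dispatch("B", 7, sm_ids))
```

Keeping each context on its own SM group means threads from different contexts do not compete for the same per-SM cache in this sketch, which is one plausible reading of why the claims restrict dispatch to the allocated SMs.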
Opening claim text (preview).
The invention claimed is:

1. A graphics multiprocessor comprising: an instruction cache to receive a stream of instructions from a pipeline manager; an instruction unit to execute the stream of instructions; a general-purpose graphics processing compute block comprising a plurality of streaming multiprocessors (SMs), each streaming multiprocessor comprising a plurality of graphics processing cores; a shared memory communicatively coupled to the plurality of graphics processing cores; and a processing unit to: determine a first number of threads to be scheduled for each context of a plurality of contexts in a multi-context processing system; allocate a second number of the streaming multiprocessors (SMs) to the respective plurality of contexts based on a ratio of the threads between the plurality of contexts; and dispatch threads from the plurality of contexts only to the streaming multiprocessor(s) allocated to the respective plurality of contexts.

2. The graphics multiprocessor of claim 1, the graphics processing unit to: determine whether one or more of the plurality of contexts have one or more extra threads that do not fit within the second number of streaming multiprocessors allocated to plurality of contexts.

3. The graphics multiprocessor of claim 2, the graphics processing unit, in response to a determination that one or more of the plurality of contexts have one or more extra threads that do not fit within the second number of streaming multiprocessors allocated to plurality of contexts, is to: implement a process to assign the one or more extra threads to one or more streaming multiprocessors (SMs) which are assigned to a different context.

4. The graphics multiprocessor of claim 3, the graphics processing unit to: obtain a cache footprint usage parameter for each of the threads to be scheduled for each context of a plurality of contexts in a multi-context processing system; and store the cache footprint usage parameter in a command buffer as a kernel thread meta-data.

5. The graphics multiprocessor of claim 4, the graphics processing unit to: forward the cache footprint usage parameter to a thread dispatcher.

6. The graphics multiprocessor of claim 5, the graphics processing unit to: use the cache footprint usage parameter to allocate one or more extra threads to one or more streaming multiprocessors (SMs) which are assigned to a different context.

7. The graphics multiprocessor of claim 1, wherein the second number of streaming multiprocessors (SMs) are allocated to the respective plurality of contexts based on a ratio of the number of contexts per thread.

8. An electronic device, comprising: a display; an instruction cache to receive a stream of instructions from a pipeline manager; an instruction unit to execute the stream of instructions; a general-purpose graphics processing compute block comprising a plurality of streaming multiprocessors (SMs), each streaming multiprocessor comprising a plurality of graphics processing cores; a shared memory communicatively coupled to the plurality of graphics processing cores; and a processing unit to: determine a first number of threads to be scheduled for each context of a plurality of contexts in a multi-context processing system; allocate a second number of the streaming multiprocessors (SMs) to the respective plurality of contexts based on a ratio of the threads between the plurality of contexts; and dispatch threads from the plurality of contexts only to the streaming multiprocessor(s) allocated to the respective plurality of contexts.

9. The electronic device of claim 8, the graphics processing unit to: determine whether one or more of the plurality of contexts have one or more extra threads that do not fit within the second number of streaming multiprocessors allocated to plurality of contexts.

10. The electronic device of claim 9, the graphics processing unit, in response to a determination that one or more of the plurality of contexts have one or more extra threads that do not fit within the second number of streaming multiprocessors allocated to plurality of contexts, is to: implement a process to assign the one or more extra threads to one or more streaming multiprocessors (SMs) which are assigned to a different context.

11. The electronic device of claim 10, the graphics processing unit to: obtain a cache footprint usage parameter for each of the threads to be scheduled for each context of a plurality of contexts in a multi-context processing system; and store the cache footprint usage parameter in a command buffer as a kernel thread meta-data.

12. The electronic device of claim 11, the graphics processing unit to: forward the cache footprint usage parameter to a thread dispatcher.

13. The electronic device of claim 12, the graphics processing unit to: use the cache footprint usage parameter to allocate one or more extra threads to one or more streaming multiprocessors (SMs) which are assigned to a different context.

14. The electronic device of claim 8, wherein the second number of streaming multiprocessors (SMs) are allocated to the respective plurality of contexts based on a ratio of the number of contexts per thread.

15. A method comprising: receiving, in an instruction cache, a stream of instructions from a pipeline manager; executing, in an instruction unit, the stream of instructions; determining, in a general purpose graphics processing unit comprising a plurality of streaming multiprocessors (SMs), each streaming multiprocessor comprising a plurality of graphics processing cores, a first number of threads to be scheduled for each context of a plurality of contexts in a multi-context processing system; allocating a second number of the streaming multiprocessors (SMs) to the respective plurality of contexts based on a ratio of the threads between the plurality of contexts; and dispatching threads from the plurality of contexts only to the streaming multiprocessor(s) allocated to the respective plurality of contexts.

16. The method of claim 15, further comprising: determining whether one or more of the plurality of contexts have one or more extra threads that do not fit within the second number of streaming multiprocessors allocated to plurality of contexts.

17. The method of claim 16, further comprising implementing a process to assign the one or more extra threads to one or more streaming multiprocessors (SMs) which are assigned to a different context.

18. The method of claim 17, further comprising: obtaining a cache footprint usage parameter for each of the threads to be scheduled for each context of a plurality of contexts in a multi-context processing system; and storing the cache footprint usage parameter in a command buffer as a kernel thread meta-data.

19. The method of claim 18, further comprising: forwarding the cache footprint usage parameter to a thread dispatcher.

20. The method of claim 19, further comprising: using the cache footprint usage parameter to allocate one or more extra threads to one or more streaming multiprocessors (SMs) which are assigned to a different context.

21. The method of claim 15, wherein the second number of streaming multiprocessors (SMs) are allocated to the respective plurality of contexts based on a ratio of the number of contexts per thread.

22. One or more non-transitory computer-readable medium comprising one or more instructions that when executed on a general purpose graphics p
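
The dependent claims on extra threads (claims 2 to 6 and their device and method counterparts) say only that a cache footprint usage parameter is forwarded to the thread dispatcher and used to place overflow threads on SMs assigned to a different context. They do not state the placement rule. The sketch below is therefore a guess at one possible policy, not the claimed process: it assumes each SM has a fixed thread capacity and that an overflow thread goes to the foreign SM with the smallest accumulated cache footprint. The `SM` and `Thread` classes, the `capacity` and `cache_load` fields, and the `place` function are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SM:
    sm_id: int
    owner: str           # context this SM was allocated to
    capacity: int        # hypothetical: maximum resident threads
    resident: int = 0    # threads currently placed on this SM
    cache_load: int = 0  # sum of cache footprints of resident threads


@dataclass
class Thread:
    ctx: str
    cache_footprint: int  # stands in for the "cache footprint usage parameter"


def place(thread, sms):
    """Place a thread and return the chosen SM id, or None if every SM is full.

    Assumed policy: prefer an SM owned by the thread's own context; if none
    has room, treat the thread as an extra thread and put it on the SM of a
    different context that currently carries the lowest cache load.
    """
    own = [s for s in sms if s.owner == thread.ctx and s.resident < s.capacity]
    if own:
        target = min(own, key=lambda s: s.resident)
    else:
        foreign = [s for s in sms if s.owner != thread.ctx and s.resident < s.capacity]
        if not foreign:
            return None
        target = min(foreign, key=lambda s: s.cache_load)

    target.resident += 1
    target.cache_load += thread.cache_footprint
    return target.sm_id


if __name__ == "__main__":
    sms = [
        SM(0, "A", capacity=1),
        SM(1, "B", capacity=2, resident=1, cache_load=64),
        SM(2, "C", capacity=2, resident=1, cache_load=8),
    ]
    for _ in range(4):
        print(place(Thread("A", 16), sms))
```

Choosing the least-loaded foreign SM is one way to limit how much an overflow thread disturbs the cache contents of the context that owns that SM; other rules, such as matching footprint to free cache capacity, would fit the claim language equally well.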
Graphics controllers · CPC title
the resource being the memory · CPC title
the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title
Processor architectures; Processor configuration, e.g. pipelining · CPC title
Power processing, i.e. workload management for processors involved in display operations, such as CPUs or GPUs · CPC title