US11748841B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11748841-B2 |
| Application number | US-202217871781-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 22, 2022 |
| Priority date | Apr 24, 2017 |
| Publication date | Sep 5, 2023 |
| Grant date | Sep 5, 2023 |
Official abstract text for this publication.
A mechanism is described for facilitating inference coordination and processing utilization for machine learning. A method of embodiments, as described herein, includes limiting execution of workloads for the respective contexts of a plurality of contexts to a specified subset of a plurality of processing resources of a processing system according to physical resource slices of the processing system that are associated with the respective contexts of the plurality of contexts.
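As a rough illustration of the idea in the abstract, the following sketch models contexts that are confined to an assigned subset of physical resource slices. It is a toy model only, not the patented implementation: the class `SliceScheduler`, its methods, the least-loaded placement rule, and the integer slice identifiers are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SliceScheduler:
    """Toy model: each context may run only on its assigned physical slices.

    All names here are hypothetical and chosen for illustration.
    """
    num_slices: int
    limits: dict = field(default_factory=dict)      # context id -> frozenset of slice ids
    placements: list = field(default_factory=list)  # (context, workload, slice) records

    def set_limit(self, context, slices):
        # Record the specified usage limitation for one context.
        slices = frozenset(slices)
        if not slices or not slices <= set(range(self.num_slices)):
            raise ValueError("limit must name existing slices")
        self.limits[context] = slices

    def allowed(self, context):
        # Assumption: a context with no specified limit may use every slice.
        return self.limits.get(context, frozenset(range(self.num_slices)))

    def schedule(self, context, workload):
        # Count current placements on the slices this context may use,
        # then pick the least-loaded one (lowest id wins ties).
        load = {s: 0 for s in self.allowed(context)}
        for _, _, s in self.placements:
            if s in load:
                load[s] += 1
        target = min(sorted(load), key=load.get)
        self.placements.append((context, workload, target))
        return target
```

For example, after `set_limit("ctxA", {0, 1})` and `set_limit("ctxB", {2})` on a four-slice scheduler, every workload submitted for `ctxB` lands on slice 2, while workloads for `ctxA` alternate between slices 0 and 1.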
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: a processing system including a graphics processor, the graphics processor including a plurality of processing resources, the plurality of processing resources configured to be partitioned into a plurality of physical resource slices, wherein the processing system has a capability to limit usage of the plurality of processing resources by a plurality of contexts, the graphics processor includes a plurality of memory partitions, the plurality of physical resource slices includes a plurality of compute resource partitions, and the plurality of compute resource partitions is associated with a plurality of memory access paths to the plurality of memory partitions; and circuitry configured to: receive specification of a limitation on usage of the plurality of processing resources by respective contexts of the plurality of contexts; schedule workloads associated with the plurality of contexts to the plurality of physical resource slices according to the limitation on usage specified for the respective contexts of the plurality of contexts; limit execution of workloads for respective contexts of the plurality of contexts to a specified subset of the plurality of processing resources according to the physical resource slices associated with the respective contexts of the plurality of contexts; and monitor, during execution of a workload associated with a context of the plurality of contexts, a utilization percentage of the specified subset of the plurality of processing resources to which the context is limited.
2. The apparatus of claim 1, wherein the processing system is configured to limit execution of a context of the plurality of contexts to a compute resource partition of the plurality of compute resource partitions.
3. The apparatus of claim 2, wherein the compute resource partition is associated with a memory partition of the plurality of memory partitions and is configured to access the memory partition via one or more of the plurality of memory access paths to the plurality of memory partitions.
4. The apparatus of claim 1, wherein the plurality of memory access paths includes a plurality of cache memory partitions.
5. The apparatus of claim 1, wherein limitation on usage of the graphics processor is to limit respective contexts of the plurality of contexts to a specified portion of available threads of the plurality of processing resources.
6. The apparatus of claim 1, wherein the graphics processor includes a single instruction multiple thread (SIMT) architecture.
7. The apparatus of claim 6, wherein the SIMT architecture includes hardware multithreading.
8. A method comprising: scheduling execution of workloads for a plurality of contexts to a plurality of processing resources in a processing system that includes a graphics processor, the plurality of processing resources configured to be partitioned into a plurality of physical resource slices that include a plurality of compute resource partitions; receiving specification of a limitation on usage of the plurality of processing resources by respective contexts of the plurality of contexts; upon determining that the limitation on usage of the plurality of processing resources is specified for the respective contexts of the plurality of contexts, limiting execution of workloads for the respective contexts of the plurality of contexts to a specified subset of the plurality of processing resources according to the physical resource slices associated with the respective contexts of the plurality of contexts, wherein the graphics processor includes a plurality of memory partitions and the plurality of compute resource partitions is associated with a plurality of memory access paths to the plurality of memory partitions; and monitoring, by the graphics processor, during execution of a workload associated with a context of the plurality of contexts, a utilization percentage of the specified subset of the plurality of processing resources to which the context is limited.
9. The method of claim 8, wherein usage of the plurality of processing resources by the plurality of contexts is limited in part to increase utilization of the plurality of processing resources.
10. The method of claim 8, wherein the plurality of processing resources includes a single instruction multiple thread (SIMT) architecture.
11. The method of claim 10, wherein the SIMT architecture includes hardware multithreading.
12. A data processing system comprising: a memory device to store instructions; a graphics processor coupled with the memory device and configured to execute the instructions, the graphics processor including a plurality of processing resources configured to be partitioned into a plurality of physical resource slices, wherein the processing system has a capability to limit usage of the plurality of processing resources by a plurality of contexts, the graphics processor includes a plurality of memory partitions, the plurality of physical resource slices includes a plurality of compute resource partitions, and the plurality of compute resource partitions is associated with a plurality of memory access paths to the plurality of memory partitions; and circuitry configured to: receive specification of a limitation on usage of the plurality of processing resources by respective contexts of the plurality of contexts; schedule workloads associated with the plurality of contexts to the plurality of physical resource slices according to the limitation on usage specified for the respective contexts of the plurality of contexts; limit execution of workloads for respective contexts of the plurality of contexts to a specified subset of the plurality of processing resources according to the physical resource slices associated with the respective contexts of the plurality of contexts; and monitor, during execution of a workload associated with a context of the plurality of contexts, a utilization percentage of the specified subset of the plurality of processing resources to which the context is limited.
13. The data processing system of claim 12, wherein the processing system is configured to limit execution of a context of the plurality of contexts to a compute resource partition of the plurality of compute resource partitions.
14. The data processing system of claim 13, wherein the compute resource partition is associated with a memory partition of the plurality of memory partitions and is configured to access the memory partition via one or more of the plurality of memory access paths to the plurality of memory partitions.
15. The data processing system of claim 12, wherein the plurality of memory access paths includes a plurality of cache memory partitions.
16. The data processing system of claim 12, wherein limitation on usage of the graphics processor is to limit respective contexts of the plurality of contexts to a specified portion of available threads of the plurality of processing resources.
17. The data processing system of claim 12, wherein the graphics processor includes a single instruction multiple thread (SIMT) architecture.
18. The data processing system of claim 17, wherein the SIMT architecture includes hardware multithreading.
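The monitoring step recited in the independent claims (a utilization percentage over the subset of processing resources to which a context is limited) can be pictured with a small calculation. The sketch below is illustrative only and is not a reading of the claims: the function name `utilization_percent`, the tick-based sampling model, and the use of integer resource ids are assumptions introduced for this example.

```python
def utilization_percent(busy_samples, subset):
    """Return the percentage of (resource, tick) slots in `subset` that were busy.

    busy_samples: list of sets, one per sampling tick, each holding the ids of
                  resources observed busy at that tick (hypothetical model).
    subset:       ids of the resources the context is limited to.
    """
    subset = set(subset)
    if not subset or not busy_samples:
        return 0.0
    # Only busy resources inside the context's subset count toward utilization.
    busy = sum(len(sample & subset) for sample in busy_samples)
    return 100.0 * busy / (len(subset) * len(busy_samples))
```

Activity on resources outside the subset is ignored, so the figure reflects only how fully the context uses the resources it was actually given.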
Abduction · CPC title
using instruction pipelines · CPC title
Processor architectures; Processor configuration, e.g. pipelining · CPC title
Distributed learning, e.g. federated learning · CPC title
Supervised learning · CPC title