Coordination and increased utilization of graphics processors during inference

US11748841B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11748841-B2
Application number: US-202217871781-A
Country: US
Kind code: B2
Filing date: Jul 22, 2022
Priority date: Apr 24, 2017
Publication date: Sep 5, 2023
Grant date: Sep 5, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A mechanism is described for facilitating inference coordination and processing utilization for machine learning. A method of embodiments, as described herein, includes limiting execution of workloads for the respective contexts of a plurality of contexts to a specified subset of a plurality of processing resources of a processing system according to physical resource slices of the processing system that are associated with the respective contexts of the plurality of contexts.

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: a processing system including a graphics processor, the graphics processor including a plurality of processing resources, the plurality of processing resources configured to be partitioned into a plurality of physical resource slices, wherein the processing system has a capability to limit usage of the plurality of processing resources by a plurality of contexts, the graphics processor includes a plurality of memory partitions, the plurality of physical resource slices includes a plurality of compute resource partitions, and the plurality of compute resource partitions is associated with a plurality of memory access paths to the plurality of memory partitions; and circuitry configured to: receive specification of a limitation on usage of the plurality of processing resources by respective contexts of the plurality of contexts; schedule workloads associated with the plurality of contexts to the plurality of physical resource slices according to the limitation on usage specified for the respective contexts of the plurality of contexts; limit execution of workloads for respective contexts of the plurality of contexts to a specified subset of the plurality of processing resources according to the physical resource slices associated with the respective contexts of the plurality of contexts; and monitor, during execution of a workload associated with a context of the plurality of contexts, a utilization percentage of the specified subset of the plurality of processing resources to which the context is limited.

2. The apparatus of claim 1, wherein the processing system is configured to limit execution of a context of the plurality of contexts to a compute resource partition of the plurality of compute resource partitions.

3. The apparatus of claim 2, wherein the compute resource partition is associated with a memory partition of the plurality of memory partitions and is configured to access the memory partition via one or more of the plurality of memory access paths to the plurality of memory partitions.

4. The apparatus of claim 1, wherein the plurality of memory access paths includes a plurality of cache memory partitions.

5. The apparatus of claim 1, wherein limitation on usage of the graphics processor is to limit respective contexts of the plurality of contexts to a specified portion of available threads of the plurality of processing resources.

6. The apparatus of claim 1, wherein the graphics processor includes a single instruction multiple thread (SIMT) architecture.

7. The apparatus of claim 6, wherein the SIMT architecture includes hardware multithreading.

8. A method comprising: scheduling execution of workloads for a plurality of contexts to a plurality of processing resources in a processing system that includes a graphics processor, the plurality of processing resources configured to be partitioned into a plurality of physical resource slices that include a plurality of compute resource partitions; receiving specification of a limitation on usage of the plurality of processing resources by respective contexts of the plurality of contexts; upon determining that the limitation on usage of the plurality of processing resources is specified for the respective contexts of the plurality of contexts, limiting execution of workloads for the respective contexts of the plurality of contexts to a specified subset of the plurality of processing resources according to the physical resource slices associated with the respective contexts of the plurality of contexts, wherein the graphics processor includes a plurality of memory partitions and the plurality of compute resource partitions is associated with a plurality of memory access paths to the plurality of memory partitions; and monitoring, by the graphics processor, during execution of a workload associated with a context of the plurality of contexts, a utilization percentage of the specified subset of the plurality of processing resources to which the context is limited.

9. The method of claim 8, wherein usage of the plurality of processing resources by the plurality of contexts is limited in part to increase utilization of the plurality of processing resources.

10. The method of claim 8, wherein the plurality of processing resources includes a single instruction multiple thread (SIMT) architecture.

11. The method of claim 10, wherein the SIMT architecture includes hardware multithreading.

12. A data processing system comprising: a memory device to store instructions; a graphics processor coupled with the memory device and configured to execute the instructions, the graphics processor including a plurality of processing resources configured to be partitioned into a plurality of physical resource slices, wherein the processing system has a capability to limit usage of the plurality of processing resources by a plurality of contexts, the graphics processor includes a plurality of memory partitions, the plurality of physical resource slices includes a plurality of compute resource partitions, and the plurality of compute resource partitions is associated with a plurality of memory access paths to the plurality of memory partitions; and circuitry configured to: receive specification of a limitation on usage of the plurality of processing resources by respective contexts of the plurality of contexts; schedule workloads associated with the plurality of contexts to the plurality of physical resource slices according to the limitation on usage specified for the respective contexts of the plurality of contexts; limit execution of workloads for respective contexts of the plurality of contexts to a specified subset of the plurality of processing resources according to the physical resource slices associated with the respective contexts of the plurality of contexts; and monitor, during execution of a workload associated with a context of the plurality of contexts, a utilization percentage of the specified subset of the plurality of processing resources to which the context is limited.

13. The data processing system of claim 12, wherein the processing system is configured to limit execution of a context of the plurality of contexts to a compute resource partition of the plurality of compute resource partitions.

14. The data processing system of claim 13, wherein the compute resource partition is associated with a memory partition of the plurality of memory partitions and is configured to access the memory partition via one or more of the plurality of memory access paths to the plurality of memory partitions.

15. The data processing system of claim 12, wherein the plurality of memory access paths includes a plurality of cache memory partitions.

16. The data processing system of claim 12, wherein limitation on usage of the graphics processor is to limit respective contexts of the plurality of contexts to a specified portion of available threads of the plurality of processing resources.

17. The data processing system of claim 12, wherein the graphics processor includes a single instruction multiple thread (SIMT) architecture.

18. The data processing system of claim 17, wherein the SIMT architecture includes hardware multithreading.
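
As an illustration only, the four operations recited in claim 1 (receive a limitation, schedule, limit execution, monitor utilization) can be mimicked in a small software model. The sketch below is not the patented hardware mechanism or any vendor API. The names `Slice`, `SlicedProcessor`, `set_limit`, `run_tick`, and `utilization_percent` are hypothetical, and the model assumes one work item per slice per tick and a one-to-one pairing of compute partitions with memory partitions.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    """Hypothetical physical resource slice: a compute partition plus the
    memory partition it reaches through its own access path (assumed 1:1)."""
    slice_id: int
    memory_partition: int
    busy_ticks: int = 0


class SlicedProcessor:
    """Toy model of a processor whose resources are split into slices."""

    def __init__(self, num_slices):
        self.slices = [Slice(i, memory_partition=i) for i in range(num_slices)]
        self.limits = {}  # context name -> tuple of allowed slice ids
        self.ticks = 0    # number of scheduling rounds executed so far

    def set_limit(self, context, slice_ids):
        """Receive the specification of a usage limitation for one context."""
        wanted = set(slice_ids)
        unknown = wanted - {s.slice_id for s in self.slices}
        if unknown:
            raise ValueError(f"unknown slice ids: {sorted(unknown)}")
        self.limits[context] = tuple(sorted(wanted))

    def run_tick(self, pending):
        """Schedule one round of work.

        pending maps a context name to the number of work items it has ready.
        Each slice runs at most one item per tick, and a context is only ever
        placed on the slices named in its limit (all slices if it has none).
        Returns a mapping of context name -> list of slice ids it ran on.
        """
        self.ticks += 1
        taken = set()
        placed = {}
        for context, items in pending.items():
            allowed = self.limits.get(
                context, tuple(s.slice_id for s in self.slices)
            )
            chosen = [sid for sid in allowed if sid not in taken][:items]
            for sid in chosen:
                self.slices[sid].busy_ticks += 1
            taken.update(chosen)
            placed[context] = chosen
        return placed

    def utilization_percent(self, context):
        """Utilization of the slice subset to which the context is limited."""
        allowed = self.limits[context]
        if self.ticks == 0 or not allowed:
            return 0.0
        busy = sum(self.slices[sid].busy_ticks for sid in allowed)
        return 100.0 * busy / (len(allowed) * self.ticks)
```

In this model, giving context "A" slices 0 and 1 and context "B" slices 2 and 3 means a backlog in "A" can never spill onto the slices reserved for "B", and `utilization_percent` reports how busy each reserved subset has been. Real hardware would enforce the limit in circuitry rather than in a Python loop.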

Assignees

  • Intel Corp

Inventors

Classifications

  • Abduction · CPC title

  • using instruction pipelines · CPC title

  • G06T1/20 · Primary

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

  • Distributed learning, e.g. federated learning · CPC title

  • Supervised learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11748841B2 cover?
A mechanism is described for facilitating inference coordination and processing utilization for machine learning. A method of embodiments, as described herein, includes limiting execution of workloads for the respective contexts of a plurality of contexts to a specified subset of a plurality of processing resources of a processing system according to physical resource slices of the processing s…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 5, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).