Deep neural network workload scheduling
US-2019266015-A1 · Aug 29, 2019 · US
US11586474B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11586474-B2 |
| Application number | US-201916456551-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 28, 2019 |
| Priority date | Jun 28, 2019 |
| Publication date | Feb 21, 2023 |
| Grant date | Feb 21, 2023 |
Official abstract text for this publication.
Techniques are provided for adaptive resource allocation for multiple workloads. One method comprises obtaining a dynamic system model based on a relation between an amount of a resource for multiple iterative workloads and a predefined service metric; obtaining an instantaneous value of the predefined service metric; applying to a given controller associated with a given iterative workload of the multiple iterative workloads: (i) the dynamic system model, (ii) an interference effect of one or more additional iterative workloads on the given iterative workload, and (iii) a difference between the instantaneous value and a target value for the predefined service metric. The given controller applies an adjustment to the amount of the resource for the given iterative workload based at least in part on the difference. The resource allocation for the multiple iterative workloads can be performed in a sequence substantially in parallel with an execution of the iterative workloads.
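The feedback step described in the abstract can be illustrated with a minimal sketch. It assumes a simple linear model in which the service metric changes proportionally to the workload's own allocation and to the aggregate allocation change of the other workloads. The class `WorkloadController`, the gains `b_self` and `b_interf`, the damping factor `gain`, and the allocation limits are all hypothetical names and values chosen for illustration; the abstract does not specify the model form or the control law.

```python
from dataclasses import dataclass


@dataclass
class WorkloadController:
    """Hypothetical per-workload controller (illustrative, not the patented control law)."""
    target: float             # target value of the service metric, e.g. seconds per epoch
    b_self: float             # assumed gain: metric change per unit of this workload's resource
    b_interf: float           # assumed gain: metric change per unit allocated to other workloads
    allocation: float         # current amount of the resource, e.g. cores
    gain: float = 0.5         # assumed damping factor in (0, 1]
    min_alloc: float = 1.0    # assumed lower bound on the allocation
    max_alloc: float = 64.0   # assumed upper bound on the allocation

    def step(self, measured: float, others_delta: float) -> float:
        """Return the adjustment applied for one control step.

        measured     -- instantaneous value of the service metric
        others_delta -- aggregate change in resources allocated to the other workloads
        """
        if self.b_self == 0:
            raise ValueError("b_self must be non-zero")
        error = self.target - measured
        # Metric drift the assumed model predicts from the other workloads.
        predicted_interference = self.b_interf * others_delta
        # Choose the change in own allocation that closes the remaining gap.
        delta = self.gain * (error - predicted_interference) / self.b_self
        new_alloc = min(self.max_alloc, max(self.min_alloc, self.allocation + delta))
        applied = new_alloc - self.allocation
        self.allocation = new_alloc
        return applied


# Example: epochs take 70 s against a 60 s target; each extra core is assumed
# to save 2 s, and each core given to other workloads is assumed to cost 0.5 s.
ctrl = WorkloadController(target=60.0, b_self=-2.0, b_interf=0.5, allocation=8.0)
adjustment = ctrl.step(measured=70.0, others_delta=2.0)  # adjustment == 2.75
```

Keeping the two gains separate mirrors the distinction the claims draw between the self-allocation effect and the interference effect; in a fuller implementation both gains would presumably be estimated and updated over time rather than fixed.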
Opening claim text (preview).
What is claimed is:
1. A method, comprising: obtaining a dynamic system model based on a relation between an amount of at least one resource for a plurality of iterative workloads and at least one predefined service metric; obtaining an instantaneous value of the at least one predefined service metric; applying to a given controller associated with a given one of the plurality of iterative workloads: (i) the dynamic system model, (ii) an interference effect that aggregates an amount of an allocation of resources to one or more additional iterative workloads of the plurality of iterative workloads on a performance of the given one of the plurality of iterative workloads, (iii) a self-allocation effect of the given one of the plurality of iterative workloads on the given one of the plurality of iterative workloads with respect to the at least one predefined service metric, wherein the self-allocation effect is determined separately from the interference effect of the one or more additional iterative workloads of the plurality of iterative workloads on the given one of the plurality of iterative workloads with respect to the at least one predefined service metric, and (iv) a difference between the instantaneous value of the at least one predefined service metric and a target value for the at least one predefined service metric, wherein the given controller determines an adjustment to the amount of the at least one resource for the given one of the plurality of iterative workloads based at least in part on the difference and the interference effect; and initiating, by the given controller, an application of the determined adjustment to the amount of the at least one resource to the given one of the plurality of iterative workloads; wherein the method is performed by at least one processing device of the given controller, wherein the at least one processing device comprises a processor coupled to a memory.
2. The method of claim 1, wherein the adjustment to the amount of the at least one resource for the given one of the plurality of iterative workloads is determined by substantially minimizing the difference.
3. The method of claim 1, wherein the obtained system model is one or more of: derived from a relation between an amount of at least one resource added and the predefined service level metric and predefined based on the relation between the amount of the at least one resource added.
4. The method of claim 1, wherein the obtained system model is updated over time based on an amount of at least one resource added and the one or more predefined service metrics.
5. The method of claim 1, wherein the given one of the plurality of iterative workloads comprises a training of a Deep Neural Network.
6. The method of claim 1, wherein the at least one resource comprises one or more of a number of processing cores in a computer processor, a number of processing cores in a graphics processing unit, an amount of memory and an amount of network bandwidth.
7. The method of claim 1, wherein the determination of the adjustment to the amount of the at least one resource for the given one of the plurality of iterative workloads is performed substantially in parallel with an execution of the plurality of iterative workloads.
8. The method of claim 7, wherein the interference effect of the one or more of the plurality of iterative workloads on the given one of the plurality of iterative workloads is determined in a sequence.
9. The method of claim 8, wherein one of the plurality of iterative workloads that one or more of finished processing and failed processing is removed from the sequence.
10. The method of claim 8, wherein a newly deployed workload is added to the sequence.
11. The method of claim 1, wherein a larger number of the processing steps of the given controller are employed to adapt the self-allocation effect than a number of the processing steps of the given controller employed to adapt the interference effect.
12. A computer program product, comprising a non-transitory machine-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed by at least one processing device perform the following steps: obtaining a dynamic system model based on a relation between an amount of at least one resource for a plurality of iterative workloads and at least one predefined service metric; obtaining an instantaneous value of the at least one predefined service metric; applying to a given controller associated with a given one of the plurality of iterative workloads: (i) the dynamic system model, (ii) an interference effect that aggregates an amount of an allocation of resources to one or more additional iterative workloads of the plurality of iterative workloads on a performance of the given one of the plurality of iterative workloads, (iii) a self-allocation effect of the given one of the plurality of iterative workloads on the given one of the plurality of iterative workloads with respect to the at least one predefined service metric, wherein the self-allocation effect is determined separately from the interference effect of the one or more additional iterative workloads of the plurality of iterative workloads on the given one of the plurality of iterative workloads with respect to the at least one predefined service metric, and (iv) a difference between the instantaneous value of the at least one predefined service metric and a target value for the at least one predefined service metric, wherein the given controller determines an adjustment to the amount of the at least one resource for the given one of the plurality of iterative workloads based at least in part on the difference and the interference effect; and initiating, by the given controller, an application of the determined adjustment to the amount of the at least one resource to the given one of the plurality of iterative workloads.
13. The computer program product of claim 12, wherein the determination of the adjustment to the amount of the at least one resource for the given one of the plurality of iterative workloads is performed substantially in parallel with an execution of the plurality of iterative workloads.
14. The computer program product of claim 13, wherein the interference effect of the one or more of the plurality of iterative workloads on the given one of the plurality of iterative workloads is determined in a sequence.
15. The computer program product of claim 14, wherein one of the plurality of iterative workloads that one or more of finished processing and failed processing is removed from the sequence.
16. The computer program product of claim 14, wherein a newly deployed workload is added to the sequence.
17. The computer program product of claim 12, wherein a larger number of the processing steps of the given controller are employed to adapt the self-allocation effect than a number of the processing steps of the given controller employed to adapt the interference effect.
18. An apparatus, comprising: a memory; and at least one processing device, coupled to the memory, operative to implement the following steps: obtaining a dynamic system model based on a relation between an amount of at least one resource for a plurality of iterative workloads and at least one predefined service metric; obtaining an instantaneous value of the at least one predefined service metric; applying to a given controller associated with a given one of the plurality of iterative workloads: (i) the dynamic system model,
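Claims 8 to 10 (and 14 to 16) describe handling the workloads in a sequence from which finished or failed workloads are removed and to which newly deployed workloads are added. One way to picture this is a round-robin queue, sketched below. The class `ControlSequence` and its methods are hypothetical, and round-robin ordering is an assumption; the claims require only that a sequence exists and that its membership changes as described.

```python
from collections import deque


class ControlSequence:
    """Hypothetical round-robin sequence of workload identifiers."""

    def __init__(self):
        self._queue = deque()

    def add(self, workload_id):
        """Append a newly deployed workload to the end of the sequence."""
        if workload_id not in self._queue:
            self._queue.append(workload_id)

    def remove(self, workload_id):
        """Drop a workload that finished or failed; unknown ids are ignored."""
        try:
            self._queue.remove(workload_id)
        except ValueError:
            pass

    def next_turn(self):
        """Return the workload whose controller acts next, or None if empty."""
        if not self._queue:
            return None
        workload_id = self._queue[0]
        self._queue.rotate(-1)  # move the current head to the back
        return workload_id


seq = ControlSequence()
for name in ("a", "b", "c"):
    seq.add(name)
turns = [seq.next_turn() for _ in range(4)]       # ['a', 'b', 'c', 'a']
seq.remove("b")                                   # "b" finished or failed
seq.add("d")                                      # "d" newly deployed
later = [seq.next_turn() for _ in range(4)]       # ['c', 'a', 'd', 'c']
```

In this sketch each call to `next_turn` would trigger one control step for the returned workload while the workloads themselves continue to run, which is one reading of "substantially in parallel" in claims 7 and 13.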