US9875141B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9875141-B2 |
| Application number | US-24385908-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 1, 2008 |
| Priority date | Oct 1, 2008 |
| Publication date | Jan 23, 2018 |
| Grant date | Jan 23, 2018 |
Official abstract text for this publication.
Computer systems attempt to manage resource pools of a dynamic number of similar resources and work tasks in order to optimize system performance. Work requests are received into the resource pool having a dynamic number of resource instances. An instance-throughput curve is determined that relates a number of resource instances in the resource pool to throughput of the work requests. A slope of a point on the instance-throughput curve is estimated with stochastic gradient approximation. The number of resource instances for the resource pool is selected when the estimated slope of the instance-throughput curve is zero.
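The slope estimate and pool-size update described in the abstract can be sketched as follows. This is a minimal illustration, not the patented method: it assumes the slope is approximated by a finite difference between the mean throughputs observed at two pool sizes, and the names `estimate_slope`, `next_setting`, `gain`, `lo`, and `hi` are hypothetical.

```python
from statistics import mean


def estimate_slope(prev_setting, prev_history, cur_setting, cur_history):
    """Finite-difference estimate of the instance-throughput slope.

    Assumption: the slope is the change in mean throughput divided by the
    change in pool size between the previous and current settings.
    """
    if cur_setting == prev_setting:
        raise ValueError("settings must differ to estimate a slope")
    return (mean(cur_history) - mean(prev_history)) / (cur_setting - prev_setting)


def next_setting(cur_setting, slope, gain=1.0, lo=1, hi=64):
    """Move the pool size uphill along the estimated slope.

    `gain`, `lo`, and `hi` are hypothetical tuning parameters.
    A zero slope leaves the setting unchanged.
    """
    if slope == 0:
        return cur_setting
    step = round(gain * slope)
    if step == 0:
        # Always move by at least one instance in the direction of the slope.
        step = 1 if slope > 0 else -1
    return max(lo, min(hi, cur_setting + step))


# Usage: throughput rose from about 100 to about 120 when going from 4 to 6 instances.
slope = estimate_slope(4, [100, 102, 98], 6, [120, 118, 122])
new_size = next_setting(6, slope, gain=0.2)
```

A positive slope grows the pool and a negative slope shrinks it, so under the unimodal assumption mentioned in claim 5 the setting drifts toward the point where the slope is zero.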
Opening claim text (preview).
What is claimed is:

1. A method of managing a resource pool comprising a dynamic number of resource instances executing work requests from a work queue, the number of thread instances in the resource pool based on a control setting which specifies a desired number of resources to be in the pool, the method comprising:
retaining measurement information of the resource pool including a previous history of throughput measurements for a previous control setting and a current history of throughput measurements for a current control setting such that when a new control setting is established the current history becomes the previous history and the new control setting becomes the current control setting;
entering an initializing state, the initializing state comprising:
collecting throughput measurements of the thread pool operating at the current control setting until there is a plurality of throughput measurements in the current history and there is less than a selected threshold of variance for a mean throughput of the current history; and
when there is less than the selected threshold of variance, establishing a first new control setting as the current control setting and exiting the initializing state; and
in response to exiting the initializing state, entering a climbing state, the climbing state comprising:
collecting throughput measurements of the resource pool operating at the current control setting until there is a plurality of throughput measurements in the current history and there is less than the selected threshold of variance for the mean throughput of the current history;
when there is less than the selected threshold of variance, estimating, via stochastic gradient approximation, the slope of a point on an instance-throughput curve using the previous history, the previous control setting, the current history, and the current control setting, the instance-throughput curve relating the number of thread instances in the thread pool to throughput of the resource pool and the point corresponding to the desired number of resources specified in the current control setting;
when the estimated slope is zero, reentering the climbing state; and
when the estimated slope is not zero, calculating a next new control setting using the estimated slope, establishing the calculated next new control setting as the current control setting, and reentering the climbing state;
the collecting throughput measurements in the initializing state and in the climbing state comprising receiving, from a measurement interface of the resource pool, a plurality of throughput measurements and for each received throughput measurement:
receiving, from the measurement interface, an actual number of resources representing the number of resource instances executing work requests in the resource pool when the received throughput measurement was taken;
comparing the desired number of resources specified in the current control setting to the actual number of resources;
discarding the received throughput measurement if either the desired number of resources is less than the actual number of resources or the desired number of threads is greater than the actual number of resources and the work queue is not empty;
adding the received throughput measurement to the current history if the received throughput measurement was not discarded;
determining, by applying a change-point detection technique to the throughput measurements in the current history, whether the shape of the instance-throughput curve has changed; and
in response to determining the shape of the instance-throughput curve has changed, deleting the previous and current history and reentering the initializing state.

2. The method of claim 1 wherein control settings are established via a control interface of the resource pool.

3. The method of claim 1 wherein the measurement information further includes work queue counts.

4. The method of claim 1 wherein the dynamic number of resources in the resource pool are added and taken away as a result of creating resources and destroying resources.

5. The method of claim 1 wherein the point where the estimated slope of the instance-throughput curve is zero indicates the number of resource instances which optimizes throughput based on an assumption that the instance-throughput curve is unimodal.

6. The method of claim 1 further comprising taking an action that minimizes the number of threads in response to determining the throughput measurement in the previous history are substantially similar to the throughput measurements in the current history.

7. The method of claim 1 wherein the resource instances include thread instances.

8. A system for managing a thread pool comprising a dynamic number of thread instances to execute work requests from a work queue, the number of thread instances in the thread pool based on a control setting that specifies a desired number of threads to be in the pool, the system comprising:
a memory to store a set of executable instructions; and
a processor configured to execute the set of instructions to cause the system to:
retain measurement information of the thread pool including a previous history of throughput measurements for a previous control setting and a current history of throughput measurements for a current control setting such that when a new control setting is established the current history becomes the previous history and the new control setting becomes the current control setting;
enter an initializing state, the initializing state comprising the system being caused to:
collect throughput measurements of the thread pool operating at the current control setting until there is a plurality of throughput measurements in the current history and there is less than a selected threshold of variance for a mean throughput of the current history; and
when there is less than the selected threshold of variance, establish a first new control setting as the current control setting and exit the initializing state; and
in response to exiting the initializing state, enter a climbing state, the climbing state comprising the system being caused to:
collect throughput measurements of the thread pool operating at the current control setting until there is a plurality of throughput measurements in the current history and there is less than the selected threshold of variance for the mean throughput of the current history;
when there is less than the selected threshold of variance, estimate, via stochastic gradient approximation, the slope of a point on an instance-throughput curve using the previous history, the previous control setting, the current history, and the current control setting, the instance-throughput curve relating the number of thread instances in the thread pool to throughput of the thread pool and the point corresponding to the desired number of threads specified in the current control setting;
when the estimated slope is zero, reenter the climbing state; and
when the estimated slope is not zero, calculate a next new control setting using the estimated slope, establish the calculated next new control setting as the current control setting, and reenter the climbing state;
the system being caused to collect throughput measurements in the initializing state and in the climbing state comprising the system being caused to receive, from a measurement interface of the thread pool, a plurality of throughput measurements and for each received throughput measurement:
receive, from the measurement interface, an actual number of threads representing the number of thread instances executing work requests in the thread pool when the received throughput measurement was taken;
compare the desired number of threads specified in the current c
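The two-state control loop recited in claim 1 can be sketched roughly as below. This is an illustrative simplification under stated assumptions, not the claimed implementation: "variance for a mean throughput" is taken to be the sample variance divided by the sample count, the slope is a plain finite difference, the change-point detection step is omitted, and the names `PoolController`, `observe`, `threshold`, `gain`, and `probe` are hypothetical.

```python
from enum import Enum
from statistics import mean, variance


class State(Enum):
    INITIALIZING = "initializing"
    CLIMBING = "climbing"


def should_discard(desired, actual, queue_empty):
    """Discard rule from claim 1: the pool was not running at the control setting."""
    return desired < actual or (desired > actual and not queue_empty)


def is_stable(history, threshold):
    """At least two samples and a low variance of the mean (assumed: s^2 / n)."""
    return len(history) >= 2 and variance(history) / len(history) < threshold


class PoolController:
    def __init__(self, setting, threshold=1.0, gain=0.5, probe=1):
        self.state = State.INITIALIZING
        self.setting, self.prev_setting = setting, None
        self.current, self.previous = [], []
        # threshold, gain, and probe are hypothetical tuning parameters.
        self.threshold, self.gain, self.probe = threshold, gain, probe

    def _establish(self, new_setting):
        # The current history becomes the previous history, and the new
        # control setting becomes the current control setting.
        self.previous, self.current = self.current, []
        self.prev_setting, self.setting = self.setting, new_setting

    def observe(self, throughput, actual, queue_empty):
        """Feed one measurement; return the (possibly updated) control setting."""
        if should_discard(self.setting, actual, queue_empty):
            return self.setting
        self.current.append(throughput)
        if not is_stable(self.current, self.threshold):
            return self.setting
        if self.state is State.INITIALIZING:
            # First new control setting: probe one step away (an assumption).
            self._establish(self.setting + self.probe)
            self.state = State.CLIMBING
        else:
            slope = (mean(self.current) - mean(self.previous)) / (
                self.setting - self.prev_setting
            )
            if slope != 0:
                step = round(self.gain * slope) or (1 if slope > 0 else -1)
                new_setting = max(1, self.setting + step)
                if new_setting != self.setting:
                    self._establish(new_setting)
        return self.setting


# Usage: two steady samples at 4 threads, then two at 5 threads.
c = PoolController(setting=4)
c.observe(100, actual=4, queue_empty=False)
c.observe(100, actual=4, queue_empty=False)   # stable: probe to 5, start climbing
c.observe(110, actual=4, queue_empty=False)   # discarded: pool has not reached 5 yet
c.observe(110, actual=5, queue_empty=False)
c.observe(110, actual=5, queue_empty=False)   # stable: slope is positive, so grow
```

Keeping the discard rule separate from the state machine means that measurements taken while the pool is still converging on the requested size never enter either history, which is the purpose of the comparison between the desired and actual resource counts in the claim.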
Pool · CPC title
using steepest descent or ascent method · CPC title
Partitioning or combining of resources · CPC title