Data pipeline prioritization and prediction
US-2020279173-A1 · Sep 3, 2020 · US
US10853082B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10853082-B1 |
| Application number | US-201916399773-A |
| Country | US |
| Kind code | B1 |
| Filing date | Apr 30, 2019 |
| Priority date | Apr 30, 2019 |
| Publication date | Dec 1, 2020 |
| Grant date | Dec 1, 2020 |
Official abstract text for this publication.
A computer implemented system is described for assigning executable jobs to pipeline sets, whereby the jobs may be network based computer jobs. The assigning includes generating a weight for each pipeline set of multiple pipeline sets to obtain multiple weights. Generating a weight includes obtaining duty cycle metrics for pipeline software threads in the pipeline set. The duty cycle metrics include a measure of an amount of time that a corresponding pipeline thread is executing and actively processing data. Generating the weight further includes determining the weight for the pipeline set based at least in part on the duty cycle metrics. The method further includes assigning a job request to a target pipeline set selected from the pipeline sets according to a weighted random algorithm, wherein the weighted random algorithm uses the weights.
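The flow in the abstract (duty cycle metrics, then a weight per pipeline set, then a weighted random pick) can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation. It assumes the pipeline set metric is the maximum of its thread duty cycles (one option named in claim 6) and uses the idle-traffic rule from claim 7, where a set's assigned value is the sum of all set metrics minus its own. The function names, the set names, and the uniform fallback are hypothetical.

```python
import random
from typing import Dict, List, Optional


def set_duty_cycle(thread_duty_cycles: List[float]) -> float:
    """Summarize a pipeline set by its busiest thread (assumption: the maximum)."""
    return max(thread_duty_cycles)


def idle_case_weights(set_metrics: Dict[str, float]) -> Dict[str, float]:
    """Assigned value = (sum of all set metrics) - (own metric), then normalize."""
    total = sum(set_metrics.values())
    assigned = {name: total - metric for name, metric in set_metrics.items()}
    norm = sum(assigned.values())
    if norm == 0:
        # Hypothetical fallback: a single set, or every set fully idle.
        return {name: 1.0 / len(assigned) for name in assigned}
    return {name: value / norm for name, value in assigned.items()}


def pick_pipeline_set(weights: Dict[str, float],
                      rng: Optional[random.Random] = None) -> str:
    """Weighted random selection of the target pipeline set."""
    chooser = rng if rng is not None else random
    names = list(weights)
    return chooser.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    metrics = {
        "set-a": set_duty_cycle([0.2, 0.5]),
        "set-b": set_duty_cycle([0.9, 0.4]),
        "set-c": set_duty_cycle([0.1, 0.1]),
    }
    weights = idle_case_weights(metrics)
    print(weights)                      # busier sets get smaller weights
    print(pick_pipeline_set(weights))   # one randomly chosen set name
```

Because the pick is random rather than a strict "least busy wins" rule, a lightly loaded set is favored without every new job request landing on the same set.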
Opening claim text (preview).
What is claimed is:
1. A computer implemented method for assigning executable jobs to pipeline sets comprising: generating a weight for each pipeline set of a plurality of pipeline sets to obtain a plurality of weights, wherein generating the weight for each pipeline set comprises: obtaining a plurality of duty cycle metrics for a plurality of pipeline threads in the pipeline set, wherein the plurality of duty cycle metrics comprises a measure of an amount of time that a corresponding pipeline thread is executing and actively processing data, and determining the weight for the pipeline set based at least in part on the plurality of duty cycle metrics; and assigning a job request to a target pipeline set selected from the plurality of pipeline sets according to a weighted random algorithm, wherein the weighted random algorithm uses the plurality of weights.
2. The computer implemented method of claim 1, wherein generating the weight further comprises: determining an average number of data ingestion tasks assigned to the plurality of pipeline sets, wherein determining the weight is based on the average number of data ingestion tasks.
3. The computer implemented method of claim 1, wherein obtaining a duty cycle metric comprises: assigning a duty cycle ownership object to a thread, when a set of conditions is satisfied, setting the duty cycle ownership object to an on state, and processing a data object in a consumer queue while the duty cycle ownership object is in the on state, wherein the duty cycle ownership object is switched to an off state after processing the data object, wherein the set of conditions comprising the data object existing in the consumer queue of the thread, and keeping the duty cycle ownership object in the off state when the consumer queue is empty.
4. The computer implemented method of claim 1, wherein obtaining a duty cycle metric comprises: assigning a duty cycle ownership object to a thread, and when a set of conditions is satisfied, setting the duty cycle ownership object to an on state within a scope defined by determining that the set of conditions are satisfied, processing a data object in a consumer queue while within the scope, and exiting the scope to switch the duty cycle ownership object to an off state, wherein the set of conditions comprising the data object existing in the consumer queue of the thread.
5. The computer implemented method of claim 1, wherein obtaining a duty cycle metric comprises: reading a plurality of timestamps and state information from a duty cycle ownership object, and generating, as a duty cycle metric, a weighted moving average using the plurality of timestamps and the state information.
6. The computer implemented method of claim 1, wherein generating the weight comprises: set, for at least a subset of the plurality of pipeline sets, a pipeline set duty cycle metric as a maximal duty cycle metric of the plurality of pipeline threads in the pipeline set, generating a sliding window average of incoming jobs to the plurality of pipeline sets, and using the sliding window average and the thread set duty cycle metric to determine the weight for the pipeline set.
7. The computer implemented method of claim 1, wherein generating the weight comprises: determine, for the pipeline set, a pipeline set duty cycle metric from the plurality of duty cycle metrics, generating a sliding window average of incoming jobs to the plurality of pipeline sets, and when the sliding window average is equal to zero, calculating a sum of the pipeline set duty cycle metric across the plurality of pipeline sets, and determining an assigned value for the pipeline set as a difference between the sum and the pipeline set duty cycle metric, normalizing the assigned value across the plurality of pipeline sets to obtain the plurality of weights.
8. The computer implemented method of claim 1, wherein determining the weight for the pipeline set comprises: determine, for the pipeline set, a pipeline set duty cycle metric from the plurality of duty cycle metrics, generating a sliding window average of incoming jobs to the plurality of pipeline sets, and when the sliding window average is greater than zero, assigning a value to the pipeline set using a periodic update model to obtain an assigned value, and normalizing the assigned value across the plurality of pipeline sets to obtain the plurality of weights.
9. The computer implemented method of claim 1, wherein determining the weight for the pipeline set comprises: sorting the plurality of pipeline sets in increasing order of a plurality of pipeline set duty cycle metrics to create an ordered list, wherein each pipeline set duty cycle metric is determined from the plurality of duty cycle metrics for a corresponding pipeline set, generating a sliding window average of incoming jobs to the plurality of pipeline sets, and identifying a maximal position in the ordered list in which the sum of a pipeline set duty cycle metric at the maximal position minus a duty cycle metric of a subset of the plurality of duty cycle metrics up to the maximal position is greater than or equal to the sliding window average, the pipeline set duty cycle metric being in the plurality of pipeline set duty cycle metrics, and setting an assigned value to zero for each pipeline set in the ordered list that is at a position greater than the maximal position.
10. The computer implemented method of claim 1, wherein determining the weight for the pipeline set comprises: sorting the plurality of pipeline sets in increasing order of a plurality of pipeline set duty cycle metrics to create an ordered list, wherein each pipeline set duty cycle metric is determined from the plurality of duty cycle metrics for a corresponding pipeline set, generating a sliding window average of incoming jobs to the plurality of pipeline sets, and identifying a maximal position in the ordered list in which the sum of a pipeline set duty cycle metric at the maximal position minus a duty cycle metric of a subset of the plurality of duty cycle metrics up to the maximal position is greater than or equal to the sliding window average, the pipeline set duty cycle metric being in the plurality of pipeline set duty cycle metrics, for each pipeline set of the plurality of pipeline sets that is at a position less than the maximal position in the ordered list, assigning the pipeline set an assigned value calculated as a function of a difference between a duty cycle metric of the pipeline set and a duty cycle metric at the maximal position, a rate adjustment, and the sliding window average, and normalize the assigned value across the plurality of pipeline sets to obtain the plurality of weights.
11. A computer implemented system for assigning executable jobs to pipeline sets comprising: at least one aggregation thread configured to: generate a weight for each pipeline set of a plurality of pipeline sets to obtain a plurality of weights, wherein generating the weight for each pipeline set comprises: obtaining a plurality of duty cycle metrics for a plurality of pipeline threads in the pipeline set, wherein the plurality of duty cycle metrics comprises a measure of an amount of time that a corresponding pipeline thread is executing and actively processing data, and determining the weight for the pipeline set based at least in part on the plurality of duty cycle metrics; and an assigner executing on a computer processor and configured to: assign a job request to a target pipeline set selected from the plurality of pipeline sets according to a weighted random algorithm, wherein the weighted random algorithm uses the
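
The scope-based duty cycle ownership object described in claims 3 and 4 maps naturally onto a Python context manager: entering the scope switches the object on, and leaving it switches the object off. The sketch below is an assumption-laden illustration, not the patented implementation. The class `DutyCycleOwner`, the helper `drain`, and the injectable `clock` parameter are hypothetical names, and the metric is a plain busy-time ratio rather than the weighted moving average recited in claim 5.

```python
import time
from collections import deque
from contextlib import contextmanager


class DutyCycleOwner:
    """Hypothetical per-thread object that records when the thread is busy."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._created = clock()
        self._busy_since = None   # None represents the "off" state
        self._busy_total = 0.0

    @property
    def is_on(self) -> bool:
        return self._busy_since is not None

    @contextmanager
    def busy(self):
        """Switch on for the duration of the scope, then switch off."""
        self._busy_since = self._clock()
        try:
            yield self
        finally:
            self._busy_total += self._clock() - self._busy_since
            self._busy_since = None

    def duty_cycle(self) -> float:
        """Fraction of elapsed time spent on (assumption: simple ratio)."""
        now = self._clock()
        busy = self._busy_total
        if self._busy_since is not None:
            busy += now - self._busy_since
        elapsed = now - self._created
        return busy / elapsed if elapsed > 0 else 0.0


def drain(queue: deque, owner: DutyCycleOwner, handle) -> None:
    """Process queued data objects; the owner stays off while the queue is empty."""
    while queue:                  # condition: a data object exists in the queue
        with owner.busy():        # on state only while processing
            handle(queue.popleft())
```

Tying the off switch to scope exit means the object returns to the off state even if `handle` raises, so a failed data object cannot leave a thread looking permanently busy.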
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title
for accessing one among a plurality of replicated servers · CPC title
Techniques for rebalancing the load in a distributed system · CPC title
by program, e.g. task dispatcher, supervisor, operating system · CPC title
Task transfer initiation or dispatching · CPC title