What technology area does this patent fall under?

Primary CPC classification G06F9/5088. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Sep 08 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Tail latency-based job offloading in load-balanced groups

US10768997B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10768997-B2
Application number	US-201615368763-A
Country	US
Kind code	B2
Filing date	Dec 5, 2016
Priority date	Dec 5, 2016
Publication date	Sep 8, 2020
Grant date	Sep 8, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A type of a request that is currently being processed at a system is determined. A distribution is selected from a set of processing time distributions, the distribution forming a model that is applicable to the type. A threshold point is computed for the model. A processing time that exceeds a threshold point processing time is regarded as exhibiting tail latency. Tail latency includes a delay in processing of the request due to a reason other than a utilization of a resource of the system exceeding a threshold utilization and a size of a queue in the system exceeding a threshold size. An evaluation is made that the request will experience tail latency during processing at the system and the processing of the request at the system is aborted. The request is offloaded for processing at a peer system in a load-balanced group of systems.

First claim

Opening claim text (preview).

What is claimed is: 1. A method comprising: determining a type of a request that is currently being processed at a data processing system; selecting a distribution from a set of processing time distributions, the distribution forming a model that is applicable to the type; computing a threshold point for the model, wherein a processing time that exceeds a processing time at the threshold point is regarded as exhibiting tail latency according to the model, tail latency comprising a delay in processing of the request due to a reason other than a utilization of a resource of the data processing system exceeding a threshold utilization and a size of a queue in the data processing system exceeding a threshold size; evaluating that the request will experience tail latency during processing at the data processing system; aborting, responsive to the evaluating and prior to the request reaching the threshold, processing of the request at the data processing system; offloading the request for processing at a peer data processing system in a load-balanced group of data processing systems; identifying a first queued request in the queue of the data processing system, the first queued request awaiting processing at the data processing system; determining a first type of the first queued request; selecting a first model from a first subset of the set of models, the first model corresponding to the first type; estimating a first processing time of the first queued request using the first model; evaluating that the first processing time will exceed a first acceptable processing time for the first queued request; removing, responsive to the first processing time exceeding the first acceptable processing time, the first queued request from the queue; and offloading the first queued request for processing at a second peer data processing system in the load-balanced group of data processing systems. 2. The method of claim 1 , further comprising: determining an amount of processing time already consumed in processing the request; estimating an amount of time needed to complete the processing of the request based on an amount of progress made in the processing time already consumed; and computing, as a part of the evaluating, whether the processing time already consumed and the time needed to complete together exceed the processing time at the threshold point in the model. 3. The method of claim 1 , further comprising: determining a system condition prevailing during processing of the request, wherein the system condition is a state of a component of the data processing system; and selecting, responsive to the system condition, the model from a subset of the set of tail latency distribution models, wherein the subset of the set of tail latency distribution models comprises a plurality of models corresponding to the type, and wherein the model in the subset corresponds to the system condition and another model in the subset corresponds to a different system condition. 4. The method of claim 3 , wherein the system condition is a specific application executing on the data processing system during the processing of the request. 5. The method of claim 3 , wherein the system condition is a specific hardware component being selected for use in the data processing system during the processing of the request. 6. The method of claim 3 , wherein the system condition is a network condition existing in a network coupled to the data processing system during the processing of the request. 7. The method of claim 1 , further comprising: identifying a second queued request in the queue of the data processing system, the second queued request awaiting processing at the data processing system; determining a second type of the second queued request; selecting a second model from a second subset of the set of models, the second model corresponding to the second type; estimating a second processing time of the second queued request using the second model; evaluating that a total time to process the second request will exceed a second acceptable processing time for the second queued request; removing, responsive to the total time to process the second request exceeding the second acceptable processing time, the second queued request from the queue; and offloading the second queued request for processing at a third peer data processing system in the load-balanced group of data processing systems. 8. The method of claim 7 , further comprising: determining, as a part of the evaluating that the total time to process the second request will exceed the second acceptable processing time for the second request, whether a total of (i) an estimated processing time of the request, (ii) the second processing time, and (iii) a processing time of each queued request before the second queued request, exceeds the second acceptable processing time for the second request, wherein the first request is one of the queued requests before the second queued request. 9. A computer usable program product comprising a computer-readable storage medium, and program instructions stored on the computer-readable storage medium, the stored program instructions comprising: program instructions to determine a type of a request that is currently being processed at a data processing system; program instructions to select a distribution from a set of processing time distributions, the distribution forming a model that is applicable to the type; program instructions to compute a threshold point for the model, wherein a processing time that exceeds a processing time at the threshold point is regarded as exhibiting tail latency according to the model, tail latency comprising a delay in processing of the request due to a reason other than a utilization of a resource of the data processing system exceeding a threshold utilization and a size of a queue in the data processing system exceeding a threshold size; program instructions to evaluate that the request will experience tail latency during processing at the data processing system; program instruction to abort, responsive to the evaluating and prior to the request reaching the threshold, processing of the request at the data processing system; and program instructions to offload the request for processing at a peer data processing system in a load-balanced group of data processing systems; program instructions to identify a first queued request in the queue of the data processing system, the first queued request awaiting processing at the data processing system; program instructions to determine a first type of the first queued request; program instruction to select a first model from a first subset of the set of models, the first model corresponding to the first type; program instruction to estimate a first processing time of the first queued request using the first model; program instructions to evaluate that the first processing time will exceed a first acceptable processing time for the first queued request; program instructions to remove, responsive to the first processing time exceeding the first acceptable processing time, the first queued request from the queue; and program instructions to offload the first queued request for processing at a second peer data processing system in the load-balanced group of data processing systems. 10. The computer usable program product of claim 9 , further comprising: program instructions to determine an amount of processing time already consumed in processing the request; program instructions to estimate an amount of time needed to complete the processing of the request based on an amount of progress made in the processing time already consumed; and progra

Assignees

Inventors

Classifications

G06F9/5088Primary
involving task migration · CPC title
G06F3/0659
Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title
G06F9/505
considering the load · CPC title
G06F9/5083Primary
Techniques for rebalancing the load in a distributed system · CPC title
H04L49/15
Interconnection of switching modules · CPC title

Patent family

Related publications grouped by family.

View patent family 62243134

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10768997B2 cover?: A type of a request that is currently being processed at a system is determined. A distribution is selected from a set of processing time distributions, the distribution forming a model that is applicable to the type. A threshold point is computed for the model. A processing time that exceeds a threshold point processing time is regarded as exhibiting tail latency. Tail latency includes a delay…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G06F9/5088. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Sep 08 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

How to read this patent

Abstract

First claim

Assignees

Inventors

Classifications

Patent family

External sources

Related patents

Accelerating rocksdb multi-instance performance by introducing random initiation for compaction

System and method for universal timeout in a distributed computing environment in a distributed computing environment

Optimizing workloads in a workload placement system

Intelligently provisioning cloud information services

Virtual network routing

Frequently asked questions