Identifying problematic application workloads based on associated response times

US9459799B1 · US · B1

Patent metadata
Field               Value
Publication number  US-9459799-B1
Application number  US-201414314523-A
Country             US
Kind code           B1
Filing date         Jun 25, 2014
Priority date       Jun 25, 2014
Publication date    Oct 4, 2016
Grant date          Oct 4, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described are techniques that identify problematic workloads. Measured response times for workloads associated with applications are received. Each of the applications has one of the workloads resulting in one of the measured response times for the application. The applications share a set of one or more resources. In accordance with a first set of one or more criteria, it is determined whether there is an occurrence of abnormal performance with respect to performance of the applications. Responsive to determining the occurrence of abnormal performance with respect to performance of the applications, second processing is performed that includes determining, using the measured response times and in accordance with a second set of one or more criteria, an application set of one or more of the applications having an associated workload causing the occurrence of abnormal performance. A remediation may also be taken to address or alleviate the abnormal performance.
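The two-stage flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The detection criteria are borrowed from claims 4 and 5 on this page (a service level objective violation, or a risk of one), and the selection rule is the maximum-response-time rule of claim 1. The names `AppSample`, `abnormal_performance`, and `problematic_apps`, the field names, and the millisecond units are all assumptions made for illustration. The bursty-interarrival condition of claim 1 is not modeled.

```python
from dataclasses import dataclass


@dataclass
class AppSample:
    """One application's workload measurement (hypothetical structure)."""
    name: str
    measured_rt_ms: float  # measured response time for the workload
    slo_rt_ms: float       # response time defined by the SLO
    modeled_rt_ms: float   # modeled expected response time


def abnormal_performance(samples):
    """First criteria set: any SLO violation or risk of violation."""
    return any(
        s.measured_rt_ms > s.slo_rt_ms        # SLO violation
        or s.measured_rt_ms > s.modeled_rt_ms  # risk of SLO violation
        for s in samples
    )


def problematic_apps(samples):
    """Second processing: runs only if abnormal performance was detected.

    Returns the names of the applications whose measured response time
    is the maximum of all measured response times.
    """
    if not samples or not abnormal_performance(samples):
        return []
    worst = max(s.measured_rt_ms for s in samples)
    return [s.name for s in samples if s.measured_rt_ms == worst]
```

Calling `problematic_apps` on a list of samples returns an empty list when no application violates or risks violating its objective, so the selection step is skipped entirely in the normal case.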

First claim

Opening claim text (preview).

What is claimed is:
1. A method of identifying problematic workloads comprising: receiving a plurality of measured response times for a plurality of workloads associated with a plurality of applications, each of the plurality of applications having one of the plurality of workloads resulting in one of the plurality of measured response times for said each application, wherein the plurality of applications share a set of one or more resources; determining, in accordance with a first set of one or more criteria, whether there is an occurrence of abnormal performance with respect to performance of the plurality of applications; and responsive to determining the occurrence of abnormal performance with respect to performance of the plurality of applications, performing second processing, said second processing including: determining, using the plurality of measured response times and in accordance with a second set of one or more criteria, an application set of one or more of the plurality of applications each having an associated one of the plurality of workloads causing the occurrence of abnormal performance, wherein the second set of one or more criteria includes a rule that determines a first of the plurality of applications is included in the application set and contributes to the occurrence of abnormal performance if the first application has a first of the plurality of workloads resulting in a first of the plurality of measured response times and the first measured response time is a maximum of all of the plurality of measured response times, wherein the first workload for the first application included in the application set has a bursty interarrival time distribution.
2. The method of claim 1, wherein each of the plurality of workloads is an I/O workload with respect to an associated one of the plurality of applications issuing I/O operations of the I/O workload.
3. The method of claim 1, wherein the one or more resources include any of one or more physical storage devices, one or more disk adapters, and one or more processors.
4. The method of claim 1, wherein the first set of one or more criteria includes determining that there is an occurrence of abnormal performance if any one or more of a service level objective violation and a risk of a service level objective violation are detected.
5. The method of claim 4, wherein the service level objective violation for one of the plurality of applications and the risk of a service level objective violation for the one application are each determined with respect to a defined service level objective for the one application, wherein the defined service level objective specifies a defined response time and it is determined that the service level objective violation occurs when a measured response time for an associated workload of the one application exceeds the defined response time, and it is determined that the risk of the service level objective occurs for the one application when a measured response time for the associated workload of the one application exceeds a modeled expected response time for the associated workload of the one application.
6. The method of claim 5, wherein the modeled expected response time is obtained from a modeled performance curve.
7. The method of claim 1, wherein the second set of one or more criteria includes a second rule that determines whether a difference between the worst of the plurality of measured response times and the best of the plurality of response times is less than 20% of the best measured response time, and if so, then all of the plurality of applications are included in the application set.
8. The method of claim 7, wherein the second rule further indicates that, if the difference between the worst measured response time and the best response time is not less than 20% of the best measured response time, then any of the plurality of applications having an associated one of the plurality of measured response times larger than 120% of the best measured response time is included in the application set.
9. The method of claim 1, wherein the second set of one or more criteria includes a second rule that determines whether a difference between the worst of the plurality of measured response times and the best of the plurality of response times is less than 20% of the worst measured response time, and if so, then all of the plurality of applications are included in the application set.
10. The method of claim 9, wherein the second rule further indicates that, if the difference between the worst measured response time and the best response time is not less than 20% of the worst measured response time, then any of the plurality of applications having an associated one of the plurality of measured response times larger than 120% of worst measured response time is included in the application set.
11. The method of claim 1, wherein the second processing includes performing a first remediation action for the first application.
12. The method of claim 11, wherein the plurality of applications each issue I/O operations to one or more logical devices having physical storage provisioned on one or more physical devices of a first data storage system for which the occurrence of abnormal performance has been determined, the one or more physical devices being included in the set of one or more resources shared by the plurality of applications, and wherein the remediation action includes any of a first action notifying a user regarding the first application, the first workload, and one or more logical devices to which the first workload is directed as being problematic, a second action of providing a set of dedicated one or more resources of the first data storage system to the first application, a third action of relocating the one or more logical devices and the first workload of the first application to another data storage system different from the first data storage system and a fourth action of reducing a number of I/O operations sent in a single burst.
13. A data storage system comprising: one or more processors; one or more physical storage devices wherein storage is provisioned from the one or more physical storage devices for a first set of one or more logical devices; a memory comprising code stored therein that, when executed by a processor, performs a method comprising: receiving a plurality of measured response times for a plurality of workloads associated with a plurality of applications, each of the plurality of applications having one of the plurality of workloads resulting in one of the plurality of measured response times for said each application, wherein the plurality of applications share a set of one or more resources; determining, in accordance with a first set of one or more criteria, whether there is an occurrence of abnormal performance with respect to performance of the plurality of applications; and responsive to determining the occurrence of abnormal performance with respect to performance of the plurality of applications, performing second processing, said second processing including: determining, using the plurality of measured response times and in accordance with a second set of one or more criteria, an application set of one or more of the plurality of applications each having an associated one of the plurality of workloads causing the occurrence of abnormal performance, wherein the second set of one or more criteria includes a rule that determines a first of the plurality of applications is included in the application set and contributes to the occurrence of abnormal performance if the first application has a first of the plurality of workloads re
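The spread rule recited in claims 7 and 8 lends itself to a short worked sketch. The Python below is illustrative only and assumes response times arrive as a mapping from application name to a measured value; the function name `application_set` and the `threshold` parameter are hypothetical, with the default of 0.20 taken from the 20% and 120% figures in those claims. Claims 9 and 10 describe a variant that uses the worst measured response time as the base, which is not shown here.

```python
def application_set(response_times, threshold=0.20):
    """Select applications per the rule of claims 7 and 8 (sketch).

    response_times: dict mapping application name -> measured response time.
    threshold: fractional spread, 0.20 for the 20% / 120% figures.
    """
    if not response_times:
        return set()

    best = min(response_times.values())   # lowest (best) response time
    worst = max(response_times.values())  # highest (worst) response time

    # Claim 7: if the worst-to-best difference is under 20% of the best,
    # every application is included in the application set.
    if worst - best < threshold * best:
        return set(response_times)

    # Claim 8: otherwise include only applications whose response time
    # is larger than 120% of the best measured response time.
    cutoff = (1 + threshold) * best
    return {app for app, rt in response_times.items() if rt > cutoff}
```

With response times of 10, 11, and 11.5, the spread (1.5) is below 20% of the best (2.0), so all three applications are selected. With 10, 11, and 15, the spread is too wide, and only the application at 15 exceeds the 120% cutoff of 12.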

Assignees

Inventors

Classifications

  • Monitoring storage devices or systems

  • in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems

  • Command handling arrangements, e.g. command buffers, queues, command scheduling

  • Monitoring of software

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9459799B1 cover?
Described are techniques that identify problematic workloads. Measured response times for workloads associated with applications are received. Each of the applications has one of the workloads resulting in one of the measured response times for the application. The applications share a set of one or more resources. In accordance with a first set of one or more criteria, it is determined whether ther…
Who is the assignee on this patent?
EMC Corp
What technology area does this patent fall under?
Primary CPC classification G06F3/0613. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 4, 2016 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).