Distributed system resiliency assessment using faults

US10387231B2 · US · B2

Patent metadata
Publication number: US-10387231-B2
Application number: US-201615273531-A
Country: US
Kind code: B2
Filing date: Sep 22, 2016
Priority date: Aug 26, 2016
Publication date: Aug 20, 2019
Grant date: Aug 20, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system for assessing resiliency of a system is provided. A fault injection system may, for each of a plurality of dimensions of a fault profile, access an indication of possible values for the dimension, which may be specified by a user. The fault injection system may, for each of a plurality of fault profiles, automatically create the fault profile by, for each of the plurality of dimensions, selecting by the computing system a possible value for that dimension. For at least some of the fault profiles, the fault injection system injects a fault based on the fault profile into the system and determines whether a failure was detected while the fault was injected.
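
The profile-creation step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `DIMENSIONS`, `create_fault_profiles`, and all example values are hypothetical names chosen for this sketch. It enumerates every combination of values, which is only one way to pick a value per dimension; the abstract says only that a possible value is selected for each dimension.

```python
from itertools import product

# Hypothetical user-specified possible values for each dimension.
DIMENSIONS = {
    "fault_type": ["cpu_pressure", "network_latency"],
    "duration_s": [30, 60],
    "machines": [1, 2, 4],
}


def create_fault_profiles(dimensions):
    """Build one fault profile per combination of possible dimension values."""
    names = list(dimensions)
    value_lists = [dimensions[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


profiles = create_fault_profiles(DIMENSIONS)
print(len(profiles))   # number of generated profiles
print(profiles[0])     # first profile: first value of every dimension
```

Each resulting dictionary plays the role of one fault profile; a real system would pass it to an injector rather than print it.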

First claim

Opening claim text (preview).

The invention claimed is:

1. A method performed by a computing system for assessing resiliency of a distributed system of components, the method comprising: for each dimension of a plurality of dimensions, accessing by the computing system, an indication of possible values for the dimension; for each of a plurality of fault profiles, creating by the computing system the fault profile by, for each dimension of the plurality of dimensions, selecting by the computing system an exponentially increasing possible value for that dimension; and for more than one of the fault profiles, injecting a fault based on the fault profile into the distributed system for execution; determining whether a failure was detected during execution of the fault in the distributed system; or whether the components of the distributed system counteracted the injected fault; and assessing resiliency of the distributed system based on a determination as to whether a failure was detected or the components counteracted the injected fault during execution of the fault.

2. The method of claim 1, further comprising selecting the dimensions from a group comprising fault type, duration, number of machines, and configuration.

3. The method of claim 1, wherein injecting the fault further comprises injecting faults of increasing fault strength.

4. The method of claim 3, further comprising: based on a determination that the components did not counteract the detected failure, terminating the injecting of the faults.

5. The method of claim 1, wherein determining whether a failure was detected comprises basing the determination of whether the failure was detected on an output of a health monitor of the distributed system.

6. The method of claim 1, further comprising displaying a graphic illustrating the fault profiles for which a failure was detected.

7. The method of claim 1, further comprising receiving from a user a specification of the possible values for more than one of the dimensions.

8. The method of claim 1, further comprising receiving a specification of a function for generating possible values for more than one of the dimensions.

9. A computing system for assessing resiliency of a distributed system of components, the computing system comprising: computer-readable storage media storing computer-executable instructions for controlling the computing system to: create a plurality of fault profiles, each fault profile specifying at least one exponentially increasing possible value for each of a plurality of dimensions of the fault profile; inject a fault based on a fault profile of the fault profiles into the distributed system for execution; monitor a health of the distributed system during execution of the fault; determine whether execution of the fault resulted a failure in a component of the distributed system or wherein components of the system counteracted the fault based on the monitored health; and assess resiliency of the distributed system based on a determination as to whether execution of the fault resulted in a failure or the components counteracted the fault during execution of the fault; and a processor for executing the computer-executable instructions stored in the computer-readable storage media.

10. The computing system of claim 9, wherein the computer-executable instructions further comprise instructions to control the computing system to, upon determining that an injected fault generates a failure, terminate execution of the fault and suppress responsive actions to the failure.

11. The computing system of claim 9, wherein the dimensions of the fault profile include a fault type, number of virtual machines, duration, and configuration.

12. The computing system of claim 11, wherein the configuration is an intensity of the fault type.

13. The computing system of claim 9, wherein the assessing the resiliency of the distributed system is performed in response to a change in deployment of the distributed system.

14. The computing system of claim 9, wherein the computer-executable instructions include instructions for controlling the computing system to repeatedly determine whether different faults generate a failure until a fault is determined to generate a failure.

15. The computing system of claim 14, wherein the computer-executable instructions for controlling the computing system are further to cause the computing system to inject faults of increasing fault strength into the distributed system.

16. A method performed by a computing system for assessing resiliency of a system of components, the method comprising: automatically creating, by the computing system, a plurality of fault profiles, each fault profile having dimensions, and each fault profile specifying an exponentially increasing possible value for each dimension; injecting faults based on the fault profiles into the system for execution; monitoring health of the system while the components of the system are executing the injected faults; and based on the monitoring indicating that the system is not healthy, determining whether the injected faults generated a failure in a component of the system or whether the components of the system counteracted the injected faults; assessing resiliency of the system based on a determination as to whether a failure was detected or the components counteracted the injected faults during execution of the faults; and indicating the assessed resiliency of the system based on the determination as to whether the components counteracted the generated failure.

17. The method of claim 16, further comprising injecting different faults for execution at the same time.
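
The claims combine three ideas: exponentially increasing dimension values, faults injected in order of increasing strength, and termination once a failure is not counteracted. A rough sketch of how these might fit together is shown here. It is an illustration under assumptions, not the claimed implementation: `exponential_values`, `assess_resiliency`, and `ToySystem` are hypothetical names, the doubling base is an arbitrary choice, and the toy system stands in for a real distributed system and its health monitor.

```python
def exponential_values(start, limit, base=2):
    """Return start, start*base, start*base**2, ... up to and including limit."""
    if start <= 0 or base < 2:
        raise ValueError("start must be positive and base at least 2")
    values = []
    value = start
    while value <= limit:
        values.append(value)
        value *= base
    return values


def assess_resiliency(strengths, inject, is_healthy):
    """Inject faults of increasing strength and stop at the first failure."""
    results = []
    for strength in strengths:
        inject(strength)
        healthy = is_healthy()  # stands in for the health monitor output
        results.append({"strength": strength, "counteracted": healthy})
        if not healthy:
            break  # terminate injection once a fault is not counteracted
    return results


class ToySystem:
    """Hypothetical system that stays healthy while few machines are faulted."""

    def __init__(self, tolerance):
        self.tolerance = tolerance
        self.faulted_machines = 0

    def inject(self, machines):
        self.faulted_machines = machines

    def is_healthy(self):
        return self.faulted_machines <= self.tolerance


system = ToySystem(tolerance=4)
strengths = exponential_values(start=1, limit=16)
report = assess_resiliency(strengths, system.inject, system.is_healthy)
for entry in report:
    print(entry)
```

In this sketch the last entry of `report` marks the weakest fault strength the toy system could not counteract, which is one simple way to summarize assessed resiliency.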

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

  • Generation of test inputs, e.g. test vectors, patterns or sequences {; with adaptation of the tested hardware for testability with external testers} · CPC title

  • for load management (allocation of a server based on load conditions G06F9/505; load rebalancing G06F9/5083; redistributing the load in a network by a load balancer H04L67/1029) · CPC title

  • Workload generation, e.g. scripts, playback · CPC title

  • in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems · CPC title

  • Testing arrangements · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10387231B2 cover?
A method and system for assessing resiliency of a system is provided. A fault injection system may, for each of a plurality of dimensions of a fault profile, access an indication of possible values for the dimension, which may be specified by a user. The fault injection system may, for each of a plurality of fault profiles, automatically create the fault profile by, for each of the plurality of…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F11/0709. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 20, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).