System and method for using failure casting to manage failures in a computed system

US9684554B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9684554-B2
Application number  US-201313739251-A
Country             US
Kind code           B2
Filing date         Jan 11, 2013
Priority date       Mar 27, 2007
Publication date    Jun 20, 2017
Grant date          Jun 20, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method for using failure casting to manage failures in a computer system. In accordance with an embodiment, the system uses a failure casting hierarchy to cast failures of one type into failures of another type. In doing this, the system allows incidents, problems, or failures to be cast into a (typically smaller) set of failures that the system knows how to handle. In accordance with a particular embodiment, failures can be cast into a category that is considered reboot-curable. If a failure is reboot-curable, then rebooting the system will likely cure the problem. Examples include hardware failures, and reboot-specific methods that can be applied to disk failures and to failures within clusters of databases. The system can even be used to handle failures that were hitherto unforeseen: failures can be cast into known failures based on the failure symptoms, rather than any underlying cause.
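The casting idea in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation: the failure names, the `CAST_TO` mapping, the recovery labels, and the functions `cast` and `recovery_for` are all hypothetical and chosen for this example. It shows specific failures being cast upward through a hierarchy until they reach a type with a known recovery, such as a reboot-curable category.

```python
from enum import Enum, auto


class Failure(Enum):
    # Hypothetical failure types, invented for illustration.
    DISK_TIMEOUT = auto()
    DISK_IO_ERROR = auto()
    DISK_FAILURE = auto()
    UNRESPONSIVE_NODE = auto()
    HEAD_CRASH = auto()
    REBOOT_CURABLE = auto()
    NON_REBOOT_CURABLE = auto()


# Assumed casting hierarchy: each failure type casts to a more general one.
CAST_TO = {
    Failure.DISK_TIMEOUT: Failure.DISK_FAILURE,
    Failure.DISK_IO_ERROR: Failure.DISK_FAILURE,
    Failure.DISK_FAILURE: Failure.REBOOT_CURABLE,
    Failure.UNRESPONSIVE_NODE: Failure.REBOOT_CURABLE,
    Failure.HEAD_CRASH: Failure.NON_REBOOT_CURABLE,
}

# The (typically smaller) set of failures the system knows how to handle.
RECOVERY = {
    Failure.REBOOT_CURABLE: "reboot",
    Failure.NON_REBOOT_CURABLE: "take_offline",
}


def cast(failure: Failure) -> Failure:
    """Cast a failure up the hierarchy until it has an associated recovery."""
    seen = set()
    while failure not in RECOVERY:
        if failure in seen or failure not in CAST_TO:
            raise LookupError(f"no recovery known for {failure.name}")
        seen.add(failure)
        failure = CAST_TO[failure]
    return failure


def recovery_for(failure: Failure) -> str:
    """Return the recovery associated with the type the failure casts to."""
    return RECOVERY[cast(failure)]
```

Under these assumptions, `recovery_for(Failure.DISK_TIMEOUT)` resolves through `DISK_FAILURE` to the reboot-curable category, so the first failure type is handled with the recovery of the second.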

First claim

Opening claim text (preview).

What is claimed is:

1. A method of managing failures in a computing system, wherein the method is implemented at least partly by a device, and wherein the method comprises: detecting a failure of a first failure type in the computing system; casting the first failure type to a second failure type, different than the first failure type, wherein the second failure type has an associated failure recovery; and attempting to resolve the first failure type by using the failure recovery associated with the second failure type.

2. The method of claim 1, wherein the attempting to resolve the first failure type by using the failure recovery associated with the second failure type occurs at boot and/or start-up time.

3. The method of claim 1, wherein the computing system includes an array of devices and the first and second failure types are associated with failures of the array of devices.

4. The method of claim 1, wherein the method further comprises: using a failure casting hierarchy in a script that includes a set of non-reboot-curable failures that are checked at boot time, and if a device exhibits a failure upon bootup within the set of non-reboot-curable failures, then the disk is not added to the array of devices.

5. A device that includes one or more processors configured to manage failures in a computing system at least by: detecting a failure of a first failure type in the computing system; casting the first failure type to a second failure type, different than the first failure type, wherein the second failure type has an associated failure recovery; and attempting to resolve the first failure type by using the failure recovery associated with the second failure type.

6. The device of claim 5, wherein the attempting to resolve the first failure type by using the failure recovery associated with the second failure type occurs at boot and/or start-up time.

7. The device of claim 5, wherein the computing system includes an array of devices and the first and second failure types are associated with failures of the array of devices.

8. The device of claim 5, wherein the one or more processors are further configured to: use a failure casting hierarchy in a script that includes a set of non-reboot-curable failures that are checked at boot time, and if a device exhibits a failure upon bootup within the set of non-reboot-curable failures, then the disk is not added to the array of devices.

9. A non-transitory computer readable storage medium storing at least executable code for managing failures in a computing system, wherein the executable code when executed at least: detects a failure of a first failure type in the computing system; casts the first failure type to a second failure type, different than the first failure type, wherein the second failure type has an associated failure recovery; and attempts to resolve the first failure type by using the failure recovery associated with the second failure type.
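The boot-time check described in claims 4 and 8 could look roughly like the following Python sketch. It is a hedged illustration, not the claimed script: the failure labels in `NON_REBOOT_CURABLE`, the function `assemble_array`, and the `probe` callback are hypothetical names introduced here. The sketch shows a device being left out of the array when the failure it exhibits at bootup falls within the non-reboot-curable set.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical set of failures that a reboot is not expected to cure.
NON_REBOOT_CURABLE = frozenset({"media_error", "controller_dead"})


def assemble_array(
    devices: Iterable[str],
    probe: Callable[[str], Optional[str]],
) -> List[str]:
    """Build the device array at boot, skipping non-reboot-curable devices.

    `probe` is an assumed callback that returns the failure observed on a
    device at bootup, or None if the device shows no failure.
    """
    array = []
    for device in devices:
        failure = probe(device)
        if failure in NON_REBOOT_CURABLE:
            # The failure is in the non-reboot-curable set: do not add it.
            continue
        array.append(device)
    return array
```

For example, with observed failures `{"sda": None, "sdb": "media_error", "sdc": "timeout"}` passed in through `observed.get`, only `sdb` would be excluded, because `"timeout"` is not in the assumed non-reboot-curable set.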

Assignees

Inventors

Classifications

  • in a storage system, e.g. in a DASD or network based storage system (drivers for digital recording or reproducing units G06F3/06; circuits for error detection or correction within digital recording or reproducing units G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097)

  • where the redundant component is memory or memory area

  • Boot up procedures

  • Management of state, configuration or failover

  • by power-on test, e.g. power-on self test [POST]

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9684554B2 cover?
A system and method for using failure casting to manage failures in a computer system. In accordance with an embodiment, the system uses a failure casting hierarchy to cast failures of one type into failures of another type. In doing this, the system allows incidents, problems, or failures to be cast into a (typically smaller) set of failures that the system knows how to handle. In accordance w…
Who is the assignee on this patent?
Teradata Us Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/1666. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 20, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).