Determine failed components in fault-tolerant memory

US10664369B2 · US · B2

Patent metadata
Publication number: US-10664369-B2
Application number: US-201515500063-A
Country: US
Kind code: B2
Filing date: Jan 30, 2015
Priority date: Jan 30, 2015
Publication date: May 26, 2020
Grant date: May 26, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to an example, a failed component in a fault-tolerant memory fabric may be determined by transmitting request packets along a plurality of routes between the redundancy controller and a media controller in periodic cycles. The redundancy controller may determine whether route failures for all of the plurality of routes have occurred within a number of consecutive periodic cycles. In response to determining that route failures for all of the plurality of routes have occurred within a number of consecutive periodic cycles, the media controller is established as failed. In response to determining that route failures for less than all of the plurality of routes have occurred within the number of consecutive periodic cycles, a fabric device is established as failed.
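The decision rule in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the names `Diagnosis` and `classify`, the route labels, and the representation of each periodic cycle as a set of routes whose response did not arrive are all hypothetical. The sketch also assumes that "within a number of consecutive periodic cycles" means the union of failures over the most recent `window` cycles.

```python
from enum import Enum


class Diagnosis(Enum):
    NONE_ESTABLISHED = "no failure established"
    FABRIC_DEVICE_FAILED = "fabric device failed"
    MEDIA_CONTROLLER_FAILED = "media controller failed"


def classify(failures_per_cycle, routes, window):
    """Classify a failure from per-cycle route failures (hypothetical helper).

    failures_per_cycle: list of sets, oldest first; each set holds the routes
        whose response packet did not arrive during that periodic cycle.
    routes: every route between the redundancy controller and the media controller.
    window: the number of consecutive periodic cycles to consider.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    all_routes = set(routes)
    if not all_routes:
        raise ValueError("at least one route is required")

    recent = failures_per_cycle[-window:]
    if len(recent) < window:
        # Not enough cycles observed yet to establish anything.
        return Diagnosis.NONE_ESTABLISHED

    # Routes that failed at least once within the window (assumed reading).
    failed = set().union(*recent) & all_routes

    if failed == all_routes:
        # Every route failed: the common endpoint, the media controller, is blamed.
        return Diagnosis.MEDIA_CONTROLLER_FAILED
    if failed:
        # Only some routes failed: a fabric device on those routes is blamed.
        return Diagnosis.FABRIC_DEVICE_FAILED
    return Diagnosis.NONE_ESTABLISHED


# Two routes, both failing within two consecutive cycles.
print(classify([{"r0"}, {"r1"}], ["r0", "r1"], window=2).value)
# prints "media controller failed"
```

The intuition is that the media controller is the only component shared by every route, so a failure seen on all routes points to it, while a failure confined to a subset of routes points to something in the fabric along that subset.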

First claim

Opening claim text (preview).

What is claimed is:

1. A method for determining failed component in a fault-tolerant memory fabric, the method comprising: transmitting, by a redundancy controller from a plurality of redundancy controllers in the memory fabric, request packets along a plurality of routes between the redundancy controller and a media controller in periodic cycles; determining whether route failures for all of the plurality of routes have occurred within a number of consecutive periodic cycles, wherein a route failure is determined to have occurred if a response packet to a request packet is not received along the same route on which the request packet was transmitted within a periodic cycle; in response to determining that route failures for all of the plurality of routes have occurred within the number of consecutive periodic cycles, establishing that the media controller has failed; and in response to determining that route failures for less than all of the plurality of routes have occurred within the number of consecutive periodic cycles, establishing that a fabric device has failed.

2. The method of claim 1, wherein establishing that the media controller has failed comprises: entering the redundancy controller into a degraded mode, wherein the degraded mode allows continued access to data previously stored on the failed media controller through use of redundant data stored on other media controllers.

3. The method of claim 1, wherein establishing that the fabric device has failed comprises: transmitting request packets along remaining routes that are functional; and monitoring for route failures in the remaining functional routes.

4. The method of claim 3, wherein monitoring for route failures in the remaining functional routes comprises: shutting down the redundancy controller in response to a determination from the monitoring that route failures in the remaining functional routes have occurred during a periodic cycle subsequent to the number of consecutive periodic cycles.

5. The method of claim 4, further comprising: reenable the use of routes through the fabric devices that are dependent upon the redundancy controller after the memory fabric is repaired.

6. The method of claim 1, wherein each of the plurality of redundancy controllers transmit request packets along a plurality of routes between each of the redundancy controllers and the media controller in periodic cycles, and each of the plurality of redundancy controllers determine whether route failures for all of the plurality of routes have occurred within a number of consecutive periodic cycles.

7. The method of claim 6, further comprising: in response to each of the plurality of redundancy controllers determining that route failures for all of the plurality of routes have occurred within the number of consecutive periodic cycles, transitioning each of the plurality of redundancy controllers to a degraded mode to prevent silent data corruption.

8. The method of claim 1, wherein the request packets and the response packets are ping packets.

9. A redundancy controller to determine a failed component a fault-tolerant memory fabric, the redundancy controller comprising: a packet module to send request ping-packets along a plurality of routes between the redundancy controller and a media controller in periodic cycles; a determination module to determine whether route failures for all of the plurality of routes have occurred within a number of consecutive periodic cycles, wherein a route failure is determined to have occurred if a ping-response packet to a ping-request packet is not received along the same route on which the request packet was transmitted within a periodic cycle; and a designation module to designate a failed media controller in response to determining that route failures for all of the plurality of routes have occurred within the number of consecutive periodic cycles, and designate a failed fabric device in response to determining that route failures for less than all of the plurality of routes have occurred within the number of consecutive periodic cycles.

10. The redundancy controller of claim 9, wherein to designate a failed media controller, the designation module is to: enter a degraded mode, wherein the degraded mode allows continued access to data previously stored on the failed media controller through use of redundant data stored on other media controllers.

11. The redundancy controller of claim 9, wherein to designate a failed fabric device, the designation module is to: transmit request ping-packets along the remaining routes that are functional; and monitor for route failures in the remaining functional routes.

12. The redundancy controller of claim 11, wherein to monitor for route failures in the remaining functional routes, designation module is to: shut down the redundancy controller in response to a determination from the monitoring that route failures in the remaining functional routes have occurred during a periodic cycle subsequent to the number of consecutive periodic cycles; and reenable the use of routes through fabric devices that are dependent upon the redundancy controller after the memory fabric is repaired.

13. A non-transitory computer readable medium to determine a failed component in a fault-tolerant memory fabric, including machine readable instructions executable by a processor to: transmit request packets along a plurality of routes between a redundancy controller and a media controller in periodic cycles; determine whether route failures for all of the plurality of routes have occurred within a number of consecutive periodic cycles, wherein a route failure is determined to have occurred if a response packet to a request packet is not received along the same route on which the request packet was transmitted within a periodic cycle; in response to determining that route failures for all of the plurality of routes have occurred within the number of consecutive periodic cycles, establish that the media controller has failed, and activate a degraded mode; in response to determining that route failures for less than all of the plurality of routes have occurred within the number of consecutive periodic cycles, establish that a fabric device has failed, and monitor for route failures in remaining functional routes.

14. The non-transitory computer readable medium of claim 13, wherein to monitor for route failures in the remaining functional routes, the machine readable instructions are executable by the processor to: shut down the redundancy controller in response to a determination from the monitoring that route failures in the remaining functional routes have occurred during a periodic cycle subsequent to the number of consecutive periodic cycles.

15. The non-transitory computer readable medium of claim 13, wherein in response the redundancy controller activating the degraded mode, the machine readable instructions are executable by the processor to prevent silent data corruption in the fault-tolerant memory.
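The follow-up behavior in the dependent claims (degraded mode, continued monitoring of the remaining routes, and shutdown) can be read as a small state machine. The sketch below is one possible interpretation, not the claimed apparatus: the class `RedundancyControllerSketch`, the `State` names, and the `end_of_cycle` method are hypothetical, and it assumes that any failure on a remaining functional route in a later cycle is enough to trigger shutdown, which the claim language does not spell out.

```python
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()
    DEGRADED = auto()    # media controller established as failed
    MONITORING = auto()  # fabric device established as failed; remaining routes watched
    SHUT_DOWN = auto()   # a remaining route failed in a later cycle


class RedundancyControllerSketch:
    """Hypothetical model of one redundancy controller's failure handling."""

    def __init__(self, routes, window):
        self.routes = set(routes)
        self.functional = set(routes)  # routes still believed to work
        self.window = window           # number of consecutive periodic cycles
        self.history = []              # per-cycle sets of failed routes
        self.state = State.NORMAL

    def end_of_cycle(self, failed_routes):
        """Record which routes got no response this cycle and update the state."""
        failed = set(failed_routes) & self.functional

        if self.state is State.MONITORING:
            # Assumption: any failure on a remaining route forces shutdown.
            if failed:
                self.state = State.SHUT_DOWN
            return self.state
        if self.state is not State.NORMAL:
            return self.state  # DEGRADED and SHUT_DOWN are terminal in this sketch

        self.history.append(failed)
        if len(self.history) < self.window:
            return self.state

        seen = set().union(*self.history[-self.window:])
        if seen == self.routes:
            self.state = State.DEGRADED
        elif seen:
            self.functional = self.routes - seen
            self.state = State.MONITORING
        return self.state


rc = RedundancyControllerSketch(["r0", "r1", "r2"], window=2)
rc.end_of_cycle({"r0"})
print(rc.end_of_cycle({"r0"}).name)  # prints "MONITORING"
print(rc.end_of_cycle({"r1"}).name)  # prints "SHUT_DOWN"
```

Re-enabling routes after the fabric is repaired is left out here because the claims do not describe how the repair is detected.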

Assignees

Hewlett Packard Entpr Dev Lp

Inventors

Classifications

  • Error or fault detection not based on redundancy (power supply failures G06F1/30; network fault management H04L41/06)

  • Degraded mode, e.g. caused by single or multiple storage removals or disk failures

  • Solving problems relating to consistency

  • Real-time

  • Techniques of failing over between control units

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10664369B2 cover?
According to an example, a failed component in a fault-tolerant memory fabric may be determined by transmitting request packets along a plurality of routes between the redundancy controller and a media controller in periodic cycles. The redundancy controller may determine whether route failures for all of the plurality of routes have occurred within a number of consecutive periodic cycles. In r…
Who is the assignee on this patent?
Hewlett Packard Entpr Dev Lp
What technology area does this patent fall under?
Primary CPC classification G06F11/2092. Mapped technology areas include Physics.
When was this patent published?
Publication date May 26, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).