Implementing automatic switchover

US9836368B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9836368-B2
Application number	US-201514920334-A
Country	US
Kind code	B2
Filing date	Oct 22, 2015
Priority date	Oct 22, 2015
Publication date	Dec 5, 2017
Grant date	Dec 5, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One or more techniques and/or computing devices are provided for automatic switchover implementation. For example, a first storage controller, of a first storage cluster, may have a disaster recovery relationship with a second storage controller of a second storage cluster. In the event the first storage controller fails, the second storage controller may automatically switchover operation from the first storage controller to the second storage controller for providing clients with failover access to data previously accessible to the clients through the first storage controller. The second storage controller may detect, cross-cluster, a failure of the first storage controller utilizing remote direct memory access (RDMA) read operations to access heartbeat information, heartbeat information stored within a disk mailbox, and/or service processor traps. In this way, the second storage controller may efficiently detect failure of the first storage controller to trigger automatic switchover for non-disruptive client access to data.
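The detection step in the abstract can be illustrated with a small sketch. It assumes the heartbeat is a counter that the first storage controller keeps advancing (claim 1's dependent claim 4 describes a series of sequence numbers), and that the partner treats a counter that stops advancing as a failure. The `HeartbeatMonitor` class, the `read_heartbeat` callable (a stand-in for the RDMA read of the designated memory section), and the `max_missed_polls` threshold are all hypothetical names and values chosen for illustration; none of them come from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HeartbeatMonitor:
    """Flags a partner as failed when its heartbeat counter stops advancing."""

    # Hypothetical stand-in for an RDMA read of the partner's heartbeat section.
    read_heartbeat: Callable[[], int]
    # Assumed threshold; the patent text shown here does not give a value.
    max_missed_polls: int = 3
    _last_seen: Optional[int] = None
    _missed: int = 0

    def poll(self) -> bool:
        """Read the counter once; return True if the partner looks failed."""
        current = self.read_heartbeat()
        if self._last_seen is not None and current == self._last_seen:
            self._missed += 1  # no progress since the previous poll
        else:
            self._missed = 0   # progress observed, reset the stall count
        self._last_seen = current
        return self._missed >= self.max_missed_polls


# Simulated partner: the counter advances twice, then stalls.
samples = iter([1, 2, 2, 2, 2])
monitor = HeartbeatMonitor(read_heartbeat=lambda: next(samples))
results = [monitor.poll() for _ in range(5)]
print(results)  # [False, False, False, False, True]
```

Passing the reader in as a callable keeps the stall logic independent of the transport, so the same check could in principle be pointed at a disk mailbox read instead of an RDMA read.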

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: determining that a memory section is designated for heartbeat information exchange from a first storage controller within a first storage cluster to a second storage controller within a second storage cluster, the second storage controller configured as a disaster recovery partner for the first storage controller; performing a remote direct memory access read operation to access the memory section for obtaining a current heartbeat status of the first storage controller; determining that the current heartbeat status indicates a failure of the first storage controller; sending a communication signal from the second storage controller to the first storage cluster; initiating an automatic switchover operation from the first storage controller to the second storage controller for providing clients with failover access to data previously accessible to the clients through the first storage controller before switchover based upon responsiveness to the communication signal indicating that the failure is not a false trigger; and refraining from initiating the automatic switchover operation based upon the responsiveness to the communication signal indicating that the failure is the false trigger.
2. The method of claim 1, wherein the current heartbeat status specifies a storage controller reboot as the failure.
3. The method of claim 1, wherein the current heartbeat status specifies a state transition of the first storage controller.
4. The method of claim 1, wherein the heartbeat information exchange corresponds to a series of sequence numbers used to indicate progress of the first storage controller.
5. The method of claim 1, wherein the current heartbeat status specifies a software panic.
6. The method of claim 1, comprising: determining that the failure is not the false trigger; initiating a manual switchover operation and not the automatic switchover operation based upon a determination that storage and a main controller of the first storage system are not available; and initiating the automatic switchover operation based upon a determination that the storage and the main controller are available.
7. The method of claim 1, comprising: initiating the automatic switchover operation based upon a write caching synchronization state between the first storage controller and the second storage controller indicating a synchronous state; and refraining from initiating the automatic switchover operation based upon the write caching synchronization state indicating a non-synchronous state.
8. The method of claim 7, comprising: reading the write caching synchronization state from a first disk mailbox of the first storage controller.
9. The method of claim 1, wherein the first storage cluster is configured according to a single controller cluster configuration and the second storage cluster is configured according to the single controller cluster configuration.
10. The method of claim 1, comprising: specifying that a first disk mailbox is to be used for heartbeat information exchange from the first storage controller to the second storage controller; reading a second current heartbeat status from the first disk mailbox; and initiating the automatic switchover operation based upon both the current heartbeat status and the second current heartbeat status indicating the failure.
11. The method of claim 10, comprising: determining the failure as a power loss failure based upon both the current heartbeat status and the second current heartbeat status indicating the failure.
12. The method of claim 10, comprising: initiating the automatic switchover operation after a threshold timeout based upon both the current heartbeat status and the second current heartbeat status indicating the failure.
13. A non-transitory machine readable medium having stored thereon instructions for performing a method comprising machine executable code which when executed by at least one machine, causes the machine to: determine that a first disk mailbox and a memory section are designated to be used for heartbeat information exchange from a first storage controller within a first storage cluster to a second storage controller within a second storage cluster, the second storage controller configured as a disaster recovery partner for the first storage controller; read a current heartbeat status from the first disk mailbox; perform a remote direct memory access read operation to access the memory section for obtaining a second current heartbeat status of the first storage controller; and initiate an automatic switchover operation from the first storage controller to the second storage controller for providing clients with failover access to data previously accessible to the clients through the first storage controller before switchover based upon both the current heartbeat status and the second current heartbeat status indicating a failure.
14. The non-transitory machine readable medium of claim 13, wherein the current heartbeat status specifies a storage controller halt.
15. The non-transitory machine readable medium of claim 13, wherein the machine executable code causes the machine to: initiate the automatic switchover operation after a timeout.
16. The non-transitory machine readable medium of claim 13, wherein the machine executable code causes the machine to: send a communication signal from the second storage controller to the first storage cluster; evaluate responsiveness to the communication signal to determine whether the failure is a false trigger; initiate the automatic switchover operation based upon a determination that the failure is not the false trigger; and refrain from initiating the automatic switchover operation based upon a determination that the failure is the false trigger.
17. The non-transitory machine readable medium of claim 16, wherein the machine executable code causes the machine to: determine whether storage and a main controller of the first storage cluster are available based upon the determination that the failure is not the false trigger; initiate a manual switchover operation and not the automatic switchover operation based upon the storage and the main controller not being available; and initiate the automatic switchover operation based upon the storage and the main controller being available.
18. The non-transitory machine readable medium of claim 13, wherein the machine executable code causes the machine to: evaluate a write caching synchronization state between the first storage controller and the second storage controller; initiate the automatic switchover operation based upon the write caching synchronization state indicating a synchronous state; and refrain from initiating the automatic switchover operation based upon the write caching synchronization state indicating a non-synchronous state.
19. A computing device comprising: a memory containing machine readable medium comprising machine executable code having stored thereon instructions for performing a method; and a processor coupled to the memory, the processor configured to execute the machine executable code to cause the processor to: determine that a memory section has been designated for heartbeat information exchange from a first storage controller within a first storage cluster to a second storage controller within a second storage cluster, the second storage controller configured as a disaster recovery partner for the first storage controller; perform a remote direct memor
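The conditions spread across claims 1, 6, 7, and 10 can be read together as a decision procedure. The sketch below is one possible reading, not the patented implementation: the claims state these conditions separately, so combining them in a single function and the order of the checks are assumptions, as is treating a response to the communication signal as the sign of a false trigger. The names `Decision`, `PartnerObservation`, and `decide` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    NO_ACTION = "no_action"
    AUTOMATIC_SWITCHOVER = "automatic"
    MANUAL_SWITCHOVER = "manual"


@dataclass(frozen=True)
class PartnerObservation:
    """What the second controller has observed about the first (hypothetical)."""
    rdma_heartbeat_failed: bool     # claim 1: status read over RDMA
    mailbox_heartbeat_failed: bool  # claim 10: status read from the disk mailbox
    cluster_responded: bool         # claim 1: assumed to mean a false trigger
    write_cache_in_sync: bool       # claim 7: write caching synchronization state
    storage_and_main_controller_available: bool  # claim 6


def decide(obs: PartnerObservation) -> Decision:
    """Combine the claimed conditions; the ordering here is an assumption."""
    # Claim 10: act only when both heartbeat sources indicate the failure.
    if not (obs.rdma_heartbeat_failed and obs.mailbox_heartbeat_failed):
        return Decision.NO_ACTION
    # Claim 1: refrain when the failure turns out to be a false trigger.
    if obs.cluster_responded:
        return Decision.NO_ACTION
    # Claim 6: fall back to manual switchover when storage and the main
    # controller are not available.
    if not obs.storage_and_main_controller_available:
        return Decision.MANUAL_SWITCHOVER
    # Claim 7: refrain from automatic switchover when write caching is not
    # in a synchronous state.
    if not obs.write_cache_in_sync:
        return Decision.NO_ACTION
    return Decision.AUTOMATIC_SWITCHOVER


confirmed_failure = PartnerObservation(True, True, False, True, True)
print(decide(confirmed_failure))  # Decision.AUTOMATIC_SWITCHOVER
```

Returning an explicit `Decision` value rather than a boolean keeps the manual-switchover branch of claim 6 distinct from simply doing nothing.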

Assignees

Inventors

Classifications

Code	CPC title
G06F11/2092 (Primary)	Techniques of failing over between control units
G06F11/0757	by exceeding a time limit, i.e. time-out, e.g. watchdogs
G06F11/2069 (Primary)	Management of state, configuration or failover
G06F11/0727	in a storage system, e.g. in a DASD or network based storage system (drivers for digital recording or reproducing units G06F3/06; circuits for error detection or correction within digital recording or reproducing units G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097)
G06F2201/805	Real-time

Patent family

Related publications grouped by family.

View patent family 57241179

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9836368B2 cover?: One or more techniques and/or computing devices are provided for automatic switchover implementation. For example, a first storage controller, of a first storage cluster, may have a disaster recovery relationship with a second storage controller of a second storage cluster. In the event the first storage controller fails, the second storage controller may automatically switchover operation from…
Who is the assignee on this patent?: NetApp Inc
What technology area does this patent fall under?: Primary CPC classification G06F11/2092. Mapped technology areas include Physics.
When was this patent published?: Publication date Dec 5, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).