Managing high-availability file servers

US11770447B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11770447-B2
Application number: US-201816177126-A
Country: US
Kind code: B2
Filing date: Oct 31, 2018
Priority date: Oct 31, 2018
Publication date: Sep 26, 2023
Grant date: Sep 26, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems and computer program products for implementing high-availability file services in a clustered computing environment. Two or more clusters are interconnected to carry out operations for replication of file content between file servers. The file servers and their respective network links are registered with a file server witness. The file servers operate in synchrony, where each file I/O is replicated from one file server to another file server over a first set of network paths. A file server witness communicates with each file server using a second set of two or more network paths interfaced with respective file servers. The file server witness monitors the file servers to determine operational health of the file servers. Upon receipt of a file I/O request, the file I/O request is directed to one of the two file servers based at least in part on the determined operational health.
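The routing step in the abstract can be illustrated with a minimal sketch. It assumes a witness that simply holds a health flag per registered file server; the names `FileServer`, `Witness`, and `route_request` are hypothetical and do not come from the patent, and a real witness would probe each server over its own network paths rather than read a flag.

```python
from dataclasses import dataclass, field


@dataclass
class FileServer:
    """Hypothetical stand-in for one file server in the pair."""
    name: str
    healthy: bool = True


@dataclass
class Witness:
    """Hypothetical witness that tracks the health of registered servers."""
    servers: dict = field(default_factory=dict)

    def register(self, server: FileServer) -> None:
        # Registration order is preserved because dicts keep insertion order.
        self.servers[server.name] = server

    def healthy_servers(self) -> list:
        # Assumption: health is a plain flag; a real witness would probe
        # each server over the second set of network paths.
        return [s for s in self.servers.values() if s.healthy]


def route_request(witness: Witness, preferred: str) -> str:
    """Return the name of the server that should receive a file I/O request."""
    healthy = witness.healthy_servers()
    if not healthy:
        raise RuntimeError("no healthy file server available")
    for server in healthy:
        if server.name == preferred:
            return server.name
    # The preferred server is unhealthy, so fall back to another healthy one.
    return healthy[0].name
```

In this sketch a request goes to the preferred server while it is healthy and to the other registered server otherwise, which is one simple reading of "directed to one of the two file servers based at least in part on the determined operational health".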

First claim

Opening claim text (preview).

What is claimed is:

1. A method for implementing high-availability of at least two file servers in a clustered computing environment, the method comprising: synchronizing two file servers implemented as virtual machines to maintain synchronized file system content, wherein the two file servers correspond to a first cluster in a first failure domain having a first file server and a second cluster in a second failure domain having a second file server, the first file server designated as a primary file server, wherein the primary file server is the first file server of the two file servers to receive file I/O requests, and the second file server operating in a replication mode to the primary file server; interfacing a file server witness in a third failure domain with the two file servers; monitoring the two file servers to determine a first status indicator indicating an unhealthy condition in the first file server; designating the second file server as the primary file server based at least in part on the first status indicator; halting the synchronizing of the two file servers; determining a second status indicator indicating that the previously unhealthy condition of the first file server has been remediated; and upon remediation of the unhealthy condition, resuming synchronizing the two file servers with the first file server operating in replication mode to the second file server, and the second file server remaining as the primary file server.

2. The method of claim 1, wherein (a) designating the second file server as the primary file server based at least in part on the first status indicator, and (b) determining a second status indicator indicating that the previously unhealthy condition of the first file server has been remediated, are performed by the file server witness.

3. The method of claim 2, wherein the file server witness monitors the two file servers and determines the first status indicator.

4. The method of claim 1, wherein designating the second file server as the primary file server involves an atomic operation.

5. The method of claim 1, further comprising: issuing a synchronization control message, the synchronization control message being issued in response to the first status indicator.

6. The method of claim 5, wherein the synchronization control message is issued to halt synchronization or resume synchronization.

7. The method of claim 1, wherein the file server witness is implemented in a third cluster.

8. The method of claim 7, wherein the first file server is implemented in a first availability zone, the second file server is implemented in a second availability zone, and the file server witness is implemented in a third availability zone.

9. The method of claim 1, wherein the unhealthy condition in the first file server comprises at least one of an oversubscribed file server, a file server failure, a cluster failure, and a connection failure.

10. The method of claim 1, further comprising: receiving a file I/O request from a host and first directing such file I/O request to the file server then-currently designated as the primary file server.

11. One or more non-transitory computer readable mediums having stored thereon a sequence of instructions which, when stored in memory and executed by one or more processors causes the one or more processors to perform a set of acts for implementing high-availability of at least two file servers in a clustered computing environment, the acts comprising: synchronizing two file servers implemented as virtual machines to maintain synchronized file system content, wherein the two file servers correspond to a first cluster in a first failure domain having a first file server and a second cluster in a second failure domain having a second file server, the first file server designated as a primary file server, wherein the primary file server is the first file server of the two file servers to receive file I/O requests, and the second file server operating in a replication mode to the primary file server; interfacing a file server witness in a third failure domain with the two file servers; monitoring the two file servers to determine a first status indicator indicating an unhealthy condition in the first file server; designating the second file server as the primary file server based at least in part on the first status indicator, halting the synchronizing of the two file servers; determining a second status indicator indicating that the previously unhealthy condition of the first file server has been remediated; and upon remediation of the unhealthy condition, resuming synchronizing the two file servers with the first file server operating in replication mode to the second file server, and the second file server remaining as the primary file server.

12. The computer readable medium of claim 11, wherein (a) designating the second file server as the primary file server based at least in part on the first status indicator, and (b) determining a second status indicator indicating that the previously unhealthy condition of the first file server has been remediated, are performed by the file server witness.

13. The computer readable medium of claim 12, wherein the file server witness monitors the two file servers and determines the first status indicator.

14. The computer readable medium of claim 11, wherein designating the second file server as the primary file server involves an atomic operation.

15. The computer readable medium of claim 11, further comprising instructions which, when stored in memory and executed by the processor causes the processor to perform acts of: issuing a synchronization control message, the synchronization control message being issued in response to the first status indicator.

16. The computer readable medium of claim 15, wherein the synchronization control message is issued to halt synchronization or resume the synchronization.

17. The computer readable medium of claim 11, wherein the file server witness is implemented in a third cluster.

18. The computer readable medium of claim 17, wherein the first file server is implemented in a first availability zone, the second file server is implemented in a second availability zone, and the file server witness is implemented in a third availability zone.

19. A system for implementing high-availability of at least two file servers in a clustered computing environment, the system comprising: one or more storage mediums having stored thereon a sequence of instructions; and one or more processors that execute the instructions to cause the processor to perform a set of acts, the acts comprising, synchronizing two file servers implemented as virtual machines to maintain synchronized file system content, wherein the two file servers correspond to a first cluster in a first failure domain having a first file server and a second cluster in a second failure domain having a second file server, the first file server designated as a primary file server, wherein the primary file server is the first file server of the two file servers to receive file I/O requests, and the second file server operating in a replication mode to the primary file server; interfacing a file server witness in a third failure domain with the two file servers; monitoring the two file servers to determine a first status indicator indicating an unhealthy condition in the first file server; designating the second file server as the primary file server based at least in part on the first status indicator; halting the
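The failover sequence recited in claim 1 (promote the second file server, halt synchronization, then resume synchronization with the roles swapped and no failback) can be sketched as a small state machine. This is an illustrative model under stated assumptions, not the patented implementation: the class `HaPair`, its method names, the server names `fs1` and `fs2`, and the use of a lock to make the role swap a single step are all hypothetical.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class HaPair:
    """Hypothetical model of two synchronized file servers as a witness sees them."""
    primary: str = "fs1"
    secondary: str = "fs2"
    syncing: bool = True
    log: list = field(default_factory=list)
    _lock: object = field(default_factory=threading.Lock, repr=False, compare=False)

    def on_unhealthy(self, server: str) -> None:
        """Handle a first status indicator: `server` is unhealthy."""
        with self._lock:  # role swap and sync halt are applied as one step
            if server == self.primary:
                self.primary, self.secondary = self.secondary, self.primary
                self.log.append(f"promote {self.primary}")
            # Assumption: synchronization halts whichever server is unhealthy.
            if self.syncing:
                self.syncing = False
                self.log.append("halt-sync")

    def on_remediated(self, server: str) -> None:
        """Handle a second status indicator: `server` has recovered."""
        with self._lock:
            if server == self.secondary and not self.syncing:
                # The recovered server rejoins as the replica; the current
                # primary keeps its role, as in the last step of claim 1.
                self.syncing = True
                self.log.append("resume-sync")

    def route(self) -> str:
        """Return the server currently designated as primary for file I/O."""
        with self._lock:
            return self.primary
```

A short usage example: after `pair = HaPair()`, calling `pair.on_unhealthy("fs1")` makes `pair.route()` return `"fs2"` with synchronization halted, and a later `pair.on_remediated("fs1")` resumes synchronization while `fs2` stays primary. The `log` entries stand in for the synchronization control messages of claims 5 and 6.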

Assignees

Nutanix Inc

Inventors

Classifications

  • Assignment of logical groups to network elements · CPC title

  • for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title

  • Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes · CPC title

  • H04L69/40 · Primary

    for recovering from a failure of a protocol instance or entity, e.g. service redundancy protocols, protocol state redundancy or protocol service redirection (management of faults, events, alarms or notifications in data switching networks H04L41/06) · CPC title

  • Configuration setting · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11770447B2 cover?
Methods, systems and computer program products for implementing high-availability file services in a clustered computing environment. Two or more clusters are interconnected to carry out operations for replication of file content between file servers. The file servers and their respective network links are registered with a file server witness. The file servers operate in synchrony, where each …
Who is the assignee on this patent?
Nutanix Inc
What technology area does this patent fall under?
Primary CPC classification H04L67/1095. Mapped technology areas include Electricity.
When was this patent published?
Publication date Sep 26, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).