File system consistency in a distributed system using version vectors

US11775500B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11775500-B2
Application number  US-201916700735-A
Country             US
Kind code           B2
Filing date         Dec 2, 2019
Priority date       Sep 11, 2015
Publication date    Oct 3, 2023
Grant date          Oct 3, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and apparatus for maintaining file system consistency in a distributed system using version vectors is presented. A method generally includes comparing incarnation and transaction identifiers of a current version vector associated with a file with incarnation and transaction identifiers of a last completed version vector associated with the file. Upon determining that a current version vector reflects operations on the file that are either earlier than or the same as the identifiers in the last completed version vector, the node performing one or more file system operations on the file.
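The comparison described in the abstract can be illustrated with a minimal sketch. The names `VersionVector` and `may_operate` are hypothetical, and ordering the identifiers lexicographically (incarnation first, then transaction) is an assumption; the abstract only says that both identifiers are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionVector:
    """Illustrative version vector: node id, incarnation id, transaction id."""
    node_id: str
    incarnation: int   # how many times the node has been restarted
    transaction: int   # transactions performed within this incarnation


def may_operate(current: VersionVector, last_completed: VersionVector) -> bool:
    """Return True when `current` is earlier than or the same as `last_completed`.

    Assumption: identifiers are ordered by incarnation first, then transaction.
    """
    return (current.incarnation, current.transaction) <= (
        last_completed.incarnation,
        last_completed.transaction,
    )
```

Under these assumptions, a node would call `may_operate` on the two vectors fetched for a file and proceed with its file system operations only when the result is `True`.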

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising:

    obtaining, by a first backup node from a storage repository, a current version vector associated with a file and a last completed version vector associated with the file, wherein the file is accessible by a distributed file system with a plurality of backup nodes that at least includes the first backup node and a second backup node, wherein the current version vector associated with the file is representative of a current operation being performed on the file, wherein the current version vector associated with the file is comprised of a backup node identifier associated with a backup node performing the current operation on the file, a first incarnation identifier associated with the backup node performing the current operations on the file, and a first transaction identifier, wherein the last completed version vector is representative of a most recent successfully completed operation performed on the file, wherein the last completed version vector associated with the file is comprised of a second node identifier associated with a backup node that performed the most recent successfully completed operation on the file, a second incarnation identifier associated with the backup node that performed the most recent successfully completed operation on the file, and a second transaction identifier, wherein the first incarnation identifier associated with the backup node performing the current operation on the file indicates a number of times the backup node performing the current operation on the file has been restarted and a first transaction identifier indicates a number of transactions performed by the backup node performing the current operation on the file while the backup node performing the current operation on the file has the first incarnation identifier;

    determining, by the first backup node, whether the second backup node is currently performing operations on the file at least in part by:

        comparing the first incarnation identifier associated with the backup node performing the current operations on the file of the current version vector of the file with the second incarnation identifier associated with the backup node that performed the most recent successfully completed operation on the file of the last completed version vector associated with the file; and

        comparing the first transaction identifier associated with the backup node performing the current operations on the file with the second transaction identifier associated with the backup node that performed the most recent successfully completed operation on the file of the last completed version vector associated with the file;

    in response to determining that the second backup node is currently performing operations on the file, waiting, by the first backup node, to access the file until the current version vector associated with the file is equal to the last completed version vector associated with the file; and

    in response to the current version vector associated with the file comprising the first incarnation identifier and the first transaction identifier being equal to the last completed version vector associated with the file comprising the second incarnation identifier and the second transaction identifier, accessing, by the backup node, the file at least in part by:

        extracting the second incarnation identifier associated with the backup node that performed the most recent successfully completed operation on the file from the last completed version vector associated with the file;

        generating an updated version vector prior to reading the file by incrementing the second incarnation identifier of the last completed version vector for subsequent file operations to be performed by the first backup node;

        committing to the storage repository the updated version vector to be the current version vector associated with the file;

        reading the file;

        determining that an inconsistency for the file exists;

        fixing the inconsistency in the file; and

        committing to the storage repository the updated version vector to be the last completed version vector associated with the file.

2. The method of claim 1, wherein fixing the inconsistency in the file comprises rolling back the file to a previous stable version of the file.

3. The method of claim 1, wherein the current version vector associated with the file is stored in a first data repository of the storage repository and the last completed version vector associated with the file is stored in a second data repository of the storage repository.

4. The method of claim 1, wherein fixing the inconsistency in the file comprises attempting to update the file based on a cached copy of a file update.

5. The method of claim 1, further comprising:

    restarting the first backup node;

    obtaining a corresponding incarnation identifier from a previous version vector associated with the first backup node; and

    resetting the corresponding incarnation identifier and a corresponding transaction identifier, wherein resetting includes incrementing the corresponding incarnation identifier relative to a previous incarnation identifier and setting the corresponding transaction identifier to an initial, sequential value.

6. The method of claim 1, wherein the current version vector associated with the file and the last completed version vector associated with the file further comprises a unique identifier of the backup node that performed the most recent successfully completed operation on the file.

7. A computer program product, the computer program product being embodied in a non-transitory computer readable medium and comprising instructions for:

    obtaining, by a first backup node from a storage repository, a current version vector associated with a file and a last completed version vector associated with the file, wherein the file is accessible by a distributed file system with a plurality of backup nodes that at least includes the first backup node and a second backup node, wherein the current version vector associated with the file is representative of a current operation being performed on the file, wherein the current version vector associated with the file is comprised of a backup node identifier associated with a backup node performing the current operation on the file, a first incarnation identifier associated with the backup node performing the current operation on the file, and a first transaction identifier, wherein the last completed version vector is representative of a most recent successfully completed operation performed on the file, wherein the last completed version vector associated with the file is comprised of a second node identifier associated with a backup node that performed the most recent successfully completed operation on the file, a second incarnation identifier associated with the backup node that performed the most recent successfully completed operation, and a second transaction identifier, wherein the first incarnation identifier associated with the backup node performing the current operation on the file indicates a number of times the backup node performing the current operation on the file has been restarted and the first transaction identifier indicates a number of transactions performed by the backup node performing the current operation on the file while the backup node performing the current operation has the first incarnation identifier;

    determining, by the first backup node, whether the second backup node is currently performing operations on the file at least in part by:

        comparing the first incarnation identifier associated with the backup node performing the current operations on the file of the current version vector of the file with the second incarnation identifier associated with the backup node that perfo
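
The access sequence recited in claim 1 and the restart handling of claim 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: `Repository` is a hypothetical in-memory stand-in for the storage repository, the helper names are invented, waiting is modelled as simple polling, the transaction identifier is carried over unchanged when the updated vector is generated, and 0 is assumed to be the initial transaction value.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionVector:
    node_id: str
    incarnation: int   # restarts of the node
    transaction: int   # transactions within the current incarnation


class Repository:
    """Hypothetical in-memory stand-in for the storage repository."""

    def __init__(self, initial: VersionVector) -> None:
        self.current = initial          # operation in progress
        self.last_completed = initial   # most recent completed operation


def after_restart(previous: VersionVector) -> VersionVector:
    # Claim 5: increment the incarnation identifier and reset the transaction
    # identifier. Using 0 as the initial value is an assumption.
    return VersionVector(previous.node_id, previous.incarnation + 1, 0)


def access_and_repair(repo, node_id, read_file, is_inconsistent, fix,
                      poll_seconds=0.0, max_polls=100):
    # Wait while another node appears to be operating on the file, i.e. until
    # the current vector equals the last completed vector.
    for _ in range(max_polls):
        if repo.current == repo.last_completed:
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("another node still appears to be operating on the file")

    # Generate the updated vector by incrementing the incarnation identifier
    # taken from the last completed vector, as claim 1 words it. Keeping the
    # transaction identifier unchanged is an assumption.
    last = repo.last_completed
    updated = VersionVector(node_id, last.incarnation + 1, last.transaction)

    repo.current = updated            # commit as the current version vector
    data = read_file()                # read the file
    if is_inconsistent(data):         # detect an inconsistency
        fix(data)                     # e.g. roll back (claim 2) or replay a cached update (claim 4)
    repo.last_completed = updated     # commit as the last completed version vector
    return updated
```

The check and the commit in this sketch are not atomic; a real repository would presumably need an atomic compare-and-set or similar primitive, which the claim text shown here does not specify.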

Assignees

Cohesity Inc

Inventors

Classifications

  • Ensuring data consistency and integrity · CPC title

  • Details of file system snapshots on the file-level, e.g. snapshot creation, administration, deletion (error detection or correction of the data by redundancy in operations or in hardware G06F11/14, G06F11/16) · CPC title

  • Concurrency control (transaction processing G06F9/466) · CPC title

  • Indexing; Web crawling techniques · CPC title

  • for networked environments · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11775500B2 cover?
A method and apparatus for maintaining file system consistency in a distributed system using version vectors is presented. A method generally includes comparing incarnation and transaction identifiers of a current version vector associated with a file with incarnation and transaction identifiers of a last completed version vector associated with the file. Upon determining that a current version…
Who is the assignee on this patent?
Cohesity Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2365. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 3, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).