Logging and update of metadata in a log-structured file system for storage node recovery and restart

US10949312B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10949312-B2
Application number: US-201916394642-A
Country: US
Kind code: B2
Filing date: Apr 25, 2019
Priority date: Sep 21, 2018
Publication date: Mar 16, 2021
Grant date: Mar 16, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A technique is configured to log and update metadata in a log-structured file system to facilitate recovery and restart in response to failure of a storage node of a cluster. A block identifier (ID) is used to identify a block of data serviced by the storage node. Metadata embodied as mappings between block IDs and locations of data blocks in the cluster are illustratively maintained in “active” and “frozen” map fragments. An active map fragment refers to a map fragment that has space available to store a mapping, whereas a frozen map fragment refers to a map fragment that has no available space for storing a mapping. The active map fragments are maintained in memory as “in-core” data structures, whereas the frozen map fragments are paged-out and stored on storage devices of the cluster as “on-disk” map fragment structures. Each frozen map fragment written to a segment includes a pointer to a last written frozen map fragment to form a chain (e.g., linked-list) of on-disk frozen map fragments. Each time a data block is persisted on a segment of the storage devices, an active map fragment is populated in-core and a metadata write marker is recorded on the segment (on-disk) indicating the location of the data block that was written to the segment. If a storage node crashes when the active map fragment is only partially populated, the metadata write markers facilitate rebuild of the active map fragment upon recovery and restart of a storage service of the node.
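
The write path and crash rebuild described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the names `Segment`, `StorageService`, `MapFragment`, and `FRAGMENT_CAPACITY`, the record tuples, and the use of a Python list as the on-disk log are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

FRAGMENT_CAPACITY = 4  # hypothetical size, kept small so a fragment freezes quickly


@dataclass
class MapFragment:
    mappings: dict = field(default_factory=dict)  # block ID -> location in the log
    prev_frozen: Optional[int] = None             # log index of the last written frozen fragment


class Segment:
    """Append-only list standing in for an on-disk segment (assumption)."""

    def __init__(self):
        self.log = []  # ("data", id, bytes) | ("marker", id, loc) | ("frozen", MapFragment)

    def append(self, record) -> int:
        self.log.append(record)
        return len(self.log) - 1


class StorageService:
    def __init__(self, segment: Segment):
        self.segment = segment
        self.active = MapFragment()  # in-core active map fragment
        self.last_frozen = None      # log index of the newest frozen fragment

    def write_block(self, block_id: str, data: bytes) -> None:
        loc = self.segment.append(("data", block_id, data))
        # Metadata write marker: records on disk where the block landed.
        self.segment.append(("marker", block_id, loc))
        self.active.mappings[block_id] = loc
        if len(self.active.mappings) >= FRAGMENT_CAPACITY:
            self._freeze()

    def _freeze(self) -> None:
        # Chain the new frozen fragment to the previously written one.
        self.active.prev_frozen = self.last_frozen
        self.last_frozen = self.segment.append(("frozen", self.active))
        self.active = MapFragment()

    def recover(self) -> None:
        """Rebuild the active fragment from markers written after the last frozen fragment."""
        log = self.segment.log
        self.last_frozen = next(
            (i for i in range(len(log) - 1, -1, -1) if log[i][0] == "frozen"), None
        )
        start = 0 if self.last_frozen is None else self.last_frozen + 1
        self.active = MapFragment()
        for record in log[start:]:
            if record[0] == "marker":
                _, block_id, loc = record
                self.active.mappings[block_id] = loc
```

In this sketch, a restarted service that calls `recover()` on the same segment ends up with the same partially populated active fragment it had before the crash, because every mapping not yet captured in a frozen fragment still has a marker on the log.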

First claim

Opening claim text (preview).

What is claimed is:

1. A method for recovery of metadata stored by a storage service having persistent storage, the method comprising: reading a filter from the persistent storage, the filter associating a map fragment and a first block identifier (block ID) of a first data block on the persistent storage; loading the map fragment from the persistent storage using the filter, the map fragment including a mapping of the first block ID to the map fragment associated with a first sublist, wherein the first sublist is based on a field of the first block ID; and reconstructing a first part of a search data structure from the first sublist such that the map fragment is indexed by the first sublist into the search data structure.

2. The method of claim 1, wherein the filter is a Bloom filter.

3. The method of claim 1, further comprising: replaying a metadata write marker written to the persistent storage, the metadata write marker including a mapping of a second block ID to a second data block on the persistent storage, wherein a second sublist is based on the field of the second block ID; rebuilding an active map fragment associated with the second sublist; reconstructing a second part of the search data structure from the second sublist such that the active map fragment is indexed by the second sublist into the search data structure.

4. The method of claim 1, further comprising starting at an end of a log to read the filter from the persistent storage.

5. The method of claim 4 wherein the log is traversed backwards.

6. The method of claim 1 further comprising traversing remaining filters from the persistent storage as a linked list to rebuild associations of the first sublist.

7. The method of claim 1 further comprising ignoring a partially written map fragment on the persistent storage.

8. The method of claim 1 wherein the search data structure is a binary retrieval tree.

9. The method of claim 1 further comprising reconstructing a remaining portion of the search data structure by traversing remaining filters and remaining map fragments from the persistent storage.

10. The method of claim 1 wherein the first data block, the filter and the map fragment are stored on a segment of the persistent storage.

11. A system comprising: a storage node coupled to one or more storage devices; a memory coupled to a processor of the storage node executing a storage service to recover metadata, the storage service configured to: read a filter from the persistent storage, the filter associating a map fragment and a first block identifier (block ID) of a first data block on the persistent storage; load the map fragment from the persistent storage using the filter, the map fragment including a mapping of the first block ID to the map fragment associated with a first sublist, wherein the first sublist is based on a field of the first block ID; and reconstruct a first part of a search data structure from the first sublist such that the map fragment is indexed by the first sublist into the search data structure.

12. The system of claim 11 wherein the filter is a Bloom filter.

13. The system of claim 11 wherein the storage service is further configured to: replay a metadata write marker written to the persistent storage, the metadata write marker including a mapping of a second block ID to a second data block on the persistent storage, wherein a second sublist is based on the field of the second block ID; rebuild an active map fragment associated with the second sublist; and reconstruct a second part of the search data structure from the second sublist such that the active map fragment is indexed by the second sublist into the search data structure.

14. The system of claim 11 wherein the storage service is further configured to start at an end of a log to read the filter from the persistent storage.

15. A non-transitory computer readable medium containing executable program instructions for execution by a storage service having persistent storage recovery of metadata, comprising: reading a filter from the persistent storage, the filter associating a map fragment and a first block identifier (block ID) of a first storage block on the persistent storage; loading the map fragment from the persistent storage using the filter, the map fragment including a mapping of the first block ID to the map fragment associated with a first sublist, wherein the first sublist is based on a field of the first block ID; and reconstructing a first part of a search data structure from the first sublist such that the map fragment is indexed by the first sublist into the search data structure.

16. The system of claim 11 wherein the storage service is further configured to traverse remaining filters from the persistent storage as a linked list to rebuild associations of the first sublist.

17. The system of claim 11 wherein the storage service is further configured to ignore a partially written map fragment on the persistent storage.

18. The system of claim 11 wherein the search data structure is a binary retrieval tree.

19. The system of claim 11 wherein the storage service is further configured to reconstruct a remaining portion of the search data structure by traversing remaining filters and remaining map fragments from the persistent storage.

20. The system of claim 14 wherein the log is traversed backwards.
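
As a rough illustration of the recovery flow recited in claims 1, 2, and 4 to 6, the sketch below walks filter records backwards from the end of a log, loads the map fragment each filter points to, and indexes it by sublist. Everything concrete here is an assumption for the example rather than something the claims specify: the `BloomFilter` parameters, the choice of the top two bits of the block ID as the sublist field, the `FilterRecord` layout, and the plain dictionary that stands in for the search data structure (claim 8 names a binary retrieval tree instead).

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


class BloomFilter:
    """Minimal Bloom filter; sizes are arbitrary illustrative choices."""

    def __init__(self, num_bits: int = 256, num_hashes: int = 3):
        self.num_bits, self.num_hashes, self.bits = num_bits, num_hashes, 0

    def _positions(self, key: bytes):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def sublist_of(block_id: bytes) -> int:
    """Hypothetical 'field of the block ID': top two bits of the first byte."""
    return block_id[0] >> 6


@dataclass
class FilterRecord:
    sublist: int
    bloom: BloomFilter
    fragment_at: int             # log index of the frozen map fragment
    prev_filter: Optional[int]   # log index of the previously written filter


def freeze(log: list, mappings: dict, prev_filter: Optional[int]) -> int:
    """Append a frozen fragment plus its filter; return the filter's log index."""
    bloom = BloomFilter()
    for block_id in mappings:
        bloom.add(block_id)
    sublist = sublist_of(next(iter(mappings)))  # assumes one sublist per fragment
    log.append(dict(mappings))
    log.append(FilterRecord(sublist, bloom, len(log) - 1, prev_filter))
    return len(log) - 1


def recover(log: list) -> dict:
    """Start at the end of the log and follow the filter chain backwards."""
    index = {}  # sublist -> [(filter, fragment)], stand-in search structure
    at = len(log) - 1 if log and isinstance(log[-1], FilterRecord) else None
    while at is not None:
        rec = log[at]
        index.setdefault(rec.sublist, []).append((rec.bloom, log[rec.fragment_at]))
        at = rec.prev_filter
    return index


def lookup(index: dict, block_id: bytes):
    """Check the filter first, then the fragment (filters can give false positives)."""
    for bloom, fragment in index.get(sublist_of(block_id), []):
        if bloom.might_contain(block_id) and block_id in fragment:
            return fragment[block_id]
    return None
```

The filter check in `lookup` only rules fragments out; a positive answer still has to be confirmed against the fragment itself, which is why the sketch tests membership in both.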

Assignees

Inventors

Classifications

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • Backup restoration techniques · CPC title

  • Logical to physical mapping or translation of blocks or pages · CPC title

  • Non-volatile semiconductor memory arrays · CPC title

  • Configuration or reconfiguration · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10949312B2 cover?
A technique is configured to log and update metadata in a log-structured file system to facilitate recovery and restart in response to failure of a storage node of a cluster. A block identifier (ID) is used to identify a block of data serviced by the storage node. Metadata embodied as mappings between block IDs and locations of data blocks in the cluster are illustratively maintained in “active…
Who is the assignee on this patent?
Netapp Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/1469. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 16, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).