Transaction log layout for efficient reclamation and recovery

US9952765B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9952765-B2
Application numberUS-201514876572-A
CountryUS
Kind codeB2
Filing dateOct 6, 2015
Priority dateOct 1, 2015
Publication dateApr 24, 2018
Grant dateApr 24, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A layout of a transaction log enables efficient logging of metadata into entries of the log, as well as efficient reclamation and recovery of the log entries by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The transaction log is illustratively a two stage, append-only logging structure, wherein the first level is non-volatile random access memory (NVRAM) embodied as a NVlog and the second stage is disk, e.g., solid state drive (SSD). During crash recovery, the log entries are examined for consistency and scanned to identify those entries that have completed and those that are active, which require replay. The log entries are walked from oldest to newest (using sequence numbers) searching for the highest sequence number. Partially complete log entries (e.g., log entries in-progress when a crash occurs) may be discarded for failing a checksum (e.g., a CRC error). Old value/new value logs may be used to implement roll-forward or roll-back semantics to replay the log entries and fix any on-disk data structures, first from NVRAM and then from on-disk logs.

First claim

Opening claim text (preview).

What is claimed is: 1. A method comprising: receiving at a storage system an input/output (I/O) request, wherein the storage system includes a persistent memory coupled to a central processor (CPU) and one or more storage devices; associating a transaction with the I/O request; allocating a circular log from the persistent memory to a first finite state machine (FSM) for processing the transaction, the first FSM associated with a token identifier (ID); logging a start entry in the circular log; in response to processing the transaction, recording one or more entries to the circular log, each recorded entry including a sequence number and a token ID field; in response to a crash of the storage system, scanning the recorded entries of the circular log to determine whether the first FSM is active at a time of the crash; and in response to determining that the first FSM is active at the time of the crash, replaying the recorded entries of the circular log having the token ID as a value of the token ID field. 2. The method of claim 1 wherein scanning the recorded entries further comprises: discarding the recorded entries failing a checksum. 3. The method of claim 2 wherein discarding the recorded entries failing the checksum further comprises: scanning the circular log to identify a magic number so as to determine a next recorded entry, wherein each recorded entry includes the magic number. 4. The method of claim 1 wherein scanning the recorded entries further comprises: adding the first FSM and a second FSM to a shadow structure, the second FSM associated with a sentinel value indicating that the second FSM is closed; and removing each recorded entry having the sentinel value as the value of the token ID field. 5. The method of claim 1 wherein scanning the recorded entries further comprises: making a first pass through the recorded entries to identify the first FSM as active at the time of the crash; and making a second pass through the recorded entries to determine a highest sequence number of the recorded entries. 6. The method of claim 5 wherein making the second pass through the recorded entries further comprises: determining whether a current sequence number of a respective entry of the recorded entries is lower than a previous sequence number of a prior respective entry; and in response to determining that the current sequence number of the respective entry of the recorded entries is lower than the previous sequence number of the prior respective entry, assigning the previous sequence number as the highest sequence number. 7. The method of claim 1 wherein the persistent memory is one of a non-volatile random access memory and a solid state drive. 8. The method of claim 1 wherein replaying the recorded entries of the circular log further comprises: applying one of an old value log and a new value log to each replay entry of the circular log. 9. The method of claim 8 wherein applying the old value log further comprises: restarting the first FSM. 10. The method of claim 8 wherein applying the new value log further comprises: applying the new value log to a first replay entry of the circular log; and ensuring that replay of first replay entry is idempotent. 11. A system comprising: a storage array having one or more storage devices; and a node having a central processor (CPU) connected to a memory, a persistent memory and the storage array, the CPU configured to execute one or more processes stored in the memory, the one or more processes configured to: receive at a storage system an input/output (I/O) request; associate a transaction with the I/O request; allocate a circular log from the persistent memory to a first finite state machine (FSM) for processing the transaction, the first FSM associated with a token identifier (ID); log a start entry in the circular log; in response to processing the transaction, record one or more entries to the circular log, each recorded entry including a sequence number and a token ID field; in response to a crash of the storage system, scan the recorded entries of the circular log to determine whether the first FSM is active at a time of the crash; and in response to determining that the first FSM is active at the time of the crash, replay the recorded entries of the circular log having the token ID as a value of the token ID field. 12. The system of the claim 11 wherein the one or more processes configured to scan the recorded entries is further configured to: discard the recorded entries failing a checksum. 13. The system of claim 12 wherein the one or more processes configured to discard the recorded entries failing the checksum is further configured to: scan the circular log to identify a magic number so as to determine a next recorded entry, wherein each recorded entry includes the magic number. 14. The system of claim 11 wherein the one or more processes configured to scan the recorded entries is further configured to: add the first FSM and a second FSM to a shadow structure, the second FSM associated with a sentinel value indicating that the second FSM is closed; and remove each recorded entry having the sentinel value as the value of the token ID field. 15. The system of claim 11 wherein the one or more processes configured to scan the recorded entries is further configured to: make a first pass through the recorded entries to identify the first FSM as active at the time of the crash; and make a second pass through the recorded entries to determine a highest sequence number of the recorded entries. 16. The system of claim 15 wherein the one or more processes configured to make the second pass through the recorded entries is further configured to: determine whether a current sequence number of a respective entry of the recorded entries is lower than a previous sequence number of a prior respective entry; and in response to determining that the current sequence number of the respective entry of the recorded entries is lower than the previous sequence number of the prior respective entry, assign the previous sequence number as the highest sequence number. 17. The system of claim 11 wherein the persistent memory is one of a non-volatile random access memory and a solid state drive. 18. The system of claim 11 wherein the one or more processes configured to replay the recorded entries of the circular log is further configured to: apply one of an old value log and a new value log to each replay entry of the circular log. 19. The system of claim 18 wherein the one or more processes configured to apply the new value log is further configured to: apply the new value log to a first replay entry of the circular log; and ensure that replay of first replay entry is idempotent. 20. A non-transitory computer readable medium including program instructions for execution on one or more processors, the program instructions configured to: implement a storage input/output (I/O) stack that operates with a persistent memory coupled to the one or more processors and with one or more solid state drives (SSDs); receive an I/O request; associate a transaction with the I/O request; allocate a circular log from the persistent memory to a finite state machine (FSM) for processing the transaction, the FSM associated with a token identifier (ID); log a start entry in the circular log; in response to processing the transaction, record one or more entries to the circular log, each recorded entry including a sequence

Assignees

Inventors

Classifications

  • G06F3/0604Primary

    Improving or facilitating administration, e.g. storage management · CPC title

  • Active fault masking without idle spares · CPC title

  • in transactions (updating of structured data in databases G06F16/23) · CPC title

  • Real-time · CPC title

  • Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9952765B2 cover?
A layout of a transaction log enables efficient logging of metadata into entries of the log, as well as efficient reclamation and recovery of the log entries by a volume layer of a storage input/output (I/O) stack executing on one or more nodes of a cluster. The transaction log is illustratively a two stage, append-only logging structure, wherein the first level is non-volatile random access me…
Who is the assignee on this patent?
Netapp Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/0604. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Apr 24 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).