NVRAM data organization using self-describing entities for predictable recovery after power-loss

US9619160B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9619160-B2
Application number: US-201514839667-A
Country: US
Kind code: B2
Filing date: Aug 28, 2015
Priority date: Jan 9, 2014
Publication date: Apr 11, 2017
Grant date: Apr 11, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment, a node coupled to a plurality of storage devices executes a storage input/output (I/O) stack having a plurality of layers including a persistence layer. A portion of non-volatile random access memory (NVRAM) is configured as one or more logs. The persistence layer cooperates with the NVRAM to employ the log to record write requests received from a host and to acknowledge successful receipt of the write requests to the host. The log has a set of entries, each entry including (i) write data of a write request and (ii) a previous offset referencing a previous entry of the log. After a power loss, the acknowledged write requests are recovered by replay of the log in reverse sequential order using the previous record offset in each entry to traverse the log.
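
The entry layout described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the `Log` class, the `HEADER` byte layout, the `NO_PREV` sentinel, and the use of a `bytearray` to stand in for the NVRAM region are all assumptions made for illustration. Each entry stores its write data together with the offset of the previous entry, so the log can be traversed backwards from the last entry written.

```python
import struct

# Hypothetical on-log header: previous offset (int64) + payload length (uint32).
HEADER = struct.Struct("<qI")
NO_PREV = -1  # hypothetical sentinel: the first entry has no predecessor


class Log:
    """Toy append-only log; a bytearray stands in for the NVRAM region."""

    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)
        self.head = 0        # next free byte
        self.tail = NO_PREV  # offset of the last entry written

    def append(self, data: bytes) -> int:
        """Record one write; the entry carries the offset of its predecessor."""
        offset = self.head
        end = offset + HEADER.size + len(data)
        if end > len(self.buf):
            raise ValueError("log full")
        HEADER.pack_into(self.buf, offset, self.tail, len(data))
        self.buf[offset + HEADER.size:end] = data
        self.tail = offset
        self.head = end
        return offset

    def walk_back(self, start: int):
        """Yield (offset, data) from `start` back to the first entry."""
        offset = start
        while offset != NO_PREV:
            prev, length = HEADER.unpack_from(self.buf, offset)
            body = offset + HEADER.size
            yield offset, bytes(self.buf[body:body + length])
            offset = prev


log = Log(256)
for payload in (b"w1", b"w2", b"w3"):
    log.append(payload)

# Reverse traversal visits the newest write first.
print([data for _, data in log.walk_back(log.tail)])  # [b'w3', b'w2', b'w1']
```

Because every entry describes how to reach its predecessor, a traversal of this kind needs only the offset of one valid entry and no separate index.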

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a central processing unit (CPU) of a node coupled to storage devices of a storage array; a non-volatile random access memory (NVRAM) coupled to the CPU, a portion of the NVRAM configured as one or more logs; and a memory coupled to the CPU and configured to store a storage input/output (I/O) stack having a plurality of layers including a persistence layer executable by the CPU, the persistence layer cooperating with the NVRAM to employ the one or more logs to record in sequence write requests received from a host and to acknowledge successful receipt of the write requests to the host, the one or more logs having a set of entries, each entry including (i) write data of a write request, (ii) a previous offset referencing a previous entry of the log and (iii) and an outstanding enteries field indication a number of outstanding I/O operations in the log at the time that the respective entry is created, wherein after a power loss, the acknowledged write requests are recovered by replay of the one or more logs in reverse sequential order using the previous offset of each entry to traverse the log, wherein recovery of a last write request received from the host is enabled by the replay of the logs corresponding to write requests pending in the storage I/O stack at a time of creation of a tail entry of the log.

2. The system of claim 1 wherein each entry further includes a sequence number used to determine the tail entry of the log, wherein the tail entry is a last entry written to the log.

3. The system of claim 2 wherein the tail entry of the log is determined when the sequence number of the previous entry referenced by the previous offset in a current entry of the log is out of sequence with the sequence number of the current entry.

4. The system of claim 3 wherein the sequence number facilitates matching of entries in the set of entries to allow retirement of the entries when the write data is safely stored on the storage array.

5. The system of claim 4 wherein the sequence number is one of a monotonically increasing value and a time stamp.

6. The system of claim 2 wherein the plurality of layers of the storage I/O stack include a volume layer disposed over an extent store layer, and wherein the outstanding entries field of each entry further includes an outstanding-entries-to-volume-layer field indicating a number of write requests in progress at the volume layer.

7. The system of claim 6 wherein the outstanding-entries-to-volume-layer field is embodied as a counter that tracks a number of entries processed by the volume layer and the extent store layer of the storage I/O stack.

8. The system of claim 6 wherein the replay is performed by traversing entries of the log starting from the tail entry based on the number indicated in the outstanding-entries-to-volume-layer field of the tail entry.

9. The system of claim 8 wherein the replay is further performed by replaying the traversed entries in order from an oldest entry up to and including the tail entry.

10. A method comprising: executing, by a node coupled to a plurality of solid state drives (SSDs), a storage input/output (I/O) stack having a plurality of layers including a persistence layer for configuring a portion of a non-volatile random access memory (NVRAM) as a log to record in sequence write requests received from a host and to acknowledge successful receipt of the write requests to the host; organizing the log as a set of entries, each entry including (i) write data of a write request, (ii) a previous offset referencing a previous entry of the log and (iii) an outstanding entries field indicating a number of outstanding I/O operations in the log at the time that the respective entry is created; and after a power loss, replaying the log in reverse sequential order using the previous offset of each entry to traverse the log so as to recover the acknowledged write requests, wherein recovery of a last write request received from the host is enabled by the replay of log the logs corresponding to write requests pending in the storage I/O stack at a time of creation of a tail entry of the log.

11. The method of claim 10 further comprising: determining the tail entry of the log using a sequence number included in an entry, wherein the tail entry is a last entry written to the log.

12. The method of claim 11 wherein determining the trail entry comprises: determining the tail entry of the log when the sequence number of the previous entry referenced by the previous offset in a current entry of the log is out of sequence with the sequence number of the current entry.

13. The method of claim 12 further comprising: using the sequence number to facilitate matching of entries in the set of entries to allow retirement of the entries when the write data is safely stored on the SSDs.

14. The method of claim 11 further comprising: executing a volume layer disposed over an extent store layer of the storage I/O stack at the node; and indicating a number of write requests in progress at the volume layer in an outstanding-entries-to-volume-layer field of the entry.

15. The method of claim 14 further comprising: embodying the outstanding entries field as a counter; and tracking a number of entries processed by the volume layer and the extent store layer of the storage I/O stack using the counter.

16. The method of claim 14 wherein replaying further comprises: traversing entries of the log starting from the tail entry based on the number indicated in the outstanding-entries-to-volume-layer field of the tail entry.

17. The method of claim 16 wherein replaying further comprises: replaying the traversed entries in order from an oldest entry up to and including the tail entry.

18. A non-transitory computer readable medium including program instructions for execution on one or more processors, the program instructions when executed operable to: implement a storage input/output (I/O) stack having a plurality of layers including a persistence layer for configuring a portion of a non-volatile random access memory (NVRAM) as a log to record in sequence write requests received from a host and to acknowledge successful receipt of the write requests to the host; organize the log as a set of entries, each entry including (i) write data of a write request, (ii) a previous offset referencing a previous entry of the log and (iii) and an outstanding entries field indicating a number of outstanding I/O operations in the log at the time that the respective entry is created; and after a power loss, replay the log in reverse sequential order using the previous offset of each entry to traverse the log so as to recover the acknowledged write requests, wherein recovery of a last write request received from the host is enabled by the replay of the logs corresponding to write requests pending in the storage I/O stack at a time of creation of the tail entry of the log.
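
The tail detection and replay order recited in claims 2, 3, 8 and 9 can be sketched roughly as follows. This is an illustrative reading only, not the patented implementation. It assumes a circular log of fixed-size slots, uses a slot index in place of a byte offset, and assumes the outstanding count includes the entry itself. The names `Entry`, `find_tail`, `replay` and `NO_PREV` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

NO_PREV = -1  # hypothetical sentinel for "no previous entry"


@dataclass
class Entry:
    seq: int          # monotonically increasing sequence number
    prev: int         # slot of the previous entry (stands in for the previous offset)
    outstanding: int  # writes in flight when created, counting this entry (assumption)
    data: bytes


def find_tail(slots: List[Optional[Entry]]) -> int:
    """Return the slot of the last entry written (assumes a non-empty log)."""
    for cur in slots:
        if cur is None or cur.prev == NO_PREV:
            continue
        prev = slots[cur.prev]
        # An older entry whose predecessor slot was overwritten sees a
        # sequence number that does not precede its own: that slot is the tail.
        if prev is not None and prev.seq != cur.seq - 1:
            return cur.prev
    # No break in the chain (log never wrapped): the tail has the highest number.
    used = [i for i, e in enumerate(slots) if e is not None]
    return max(used, key=lambda i: slots[i].seq)


def replay(slots: List[Optional[Entry]], apply: Callable[[Entry], None]) -> None:
    """Walk back from the tail, then apply entries from oldest to tail."""
    idx = find_tail(slots)
    chain = []
    for _ in range(slots[idx].outstanding):
        chain.append(slots[idx])
        idx = slots[idx].prev
        if idx == NO_PREV:
            break
    for entry in reversed(chain):
        apply(entry)


# Four-slot log that has wrapped: sequence numbers 5 and 6 overwrote 1 and 2.
slots = [
    Entry(5, 3, 2, b"e"),
    Entry(6, 0, 3, b"f"),
    Entry(3, 1, 1, b"c"),
    Entry(4, 2, 2, b"d"),
]
replayed = []
replay(slots, lambda e: replayed.append(e.data))
print(find_tail(slots), replayed)  # 1 [b'd', b'e', b'f']
```

In this sketch the tail's outstanding count bounds how far back the traversal goes, so entries whose data was already safely stored (here, sequence number 3) are not replayed.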

Assignees

Netapp Inc

Inventors

Classifications

  • in block erasable memory, e.g. flash memory · CPC title

  • Replication mechanisms · CPC title

  • Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title

  • Resetting or repowering · CPC title

  • G06F3/0619 · Primary

    in relation to data integrity, e.g. data losses, bit errors · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9619160B2 cover?
In one embodiment, a node coupled to a plurality of storage devices executes a storage input/output (I/O) stack having a plurality of layers including a persistence layer. A portion of non-volatile random access memory (NVRAM) is configured as one or more logs. The persistence layer cooperates with the NVRAM to employ the log to record write requests received from a host and to acknowledge succ…
Who is the assignee on this patent?
Netapp Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/0619. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 11, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).