Nonvolatile media journaling of verified data sets

US9229809B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9229809-B2
Application number  US-201113229736-A
Country             US
Kind code           B2
Filing date         Sep 11, 2011
Priority date       Sep 11, 2011
Publication date    Jan 5, 2016
Grant date          Jan 5, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The storage of data sets in a storage set (e.g., data sets written to hard disk drives comprising a RAID array) may diminish the performance of the storage set through non-sequential writes, particularly if the storage devices promptly write data sets that are followed by sequentially following data sets. Additionally, storage sets may exhibit inconsistencies due to non-atomic writes of data sets and verifiers (e.g., checksums) and an intervening failure, such as an occurrence of the RAID write hole. Instead, data sets and verifiers may first be written to a journal stored on the nonvolatile media of a storage device before being committed to the storage set. Such writes may be sequentially written to the journal, irrespective of the locations of the data sets in the storage set; and recovery of a failure may simply involve re-committing the consistent records in the journal to correct incomplete writes to the storage set.
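The journal-then-commit flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `JournalRecord`, `Journal`, `commit`, and `recover` are hypothetical, a Python list stands in for the nonvolatile journal, a dict stands in for the storage set, and CRC32 is assumed as the verifier because the abstract only gives checksums as an example.

```python
import zlib
from dataclasses import dataclass


@dataclass
class JournalRecord:
    location: int   # target location in the storage set
    data: bytes     # the data set
    verifier: int   # checksum of the data (CRC32 is an assumption)


class Journal:
    """Append-only journal; a list stands in for nonvolatile media."""

    def __init__(self):
        self.records = []

    def append(self, location, data):
        # The data set and its verifier are journaled together,
        # in arrival order, regardless of the target location.
        self.records.append(JournalRecord(location, data, zlib.crc32(data)))

    def consistent_records(self):
        # A record is consistent when its verifier matches its data.
        return [r for r in self.records if zlib.crc32(r.data) == r.verifier]


def commit(journal, storage_set):
    """Write every consistent record to the storage set, then empty the journal."""
    for rec in journal.consistent_records():
        storage_set[rec.location] = (rec.data, rec.verifier)
    # Records are dropped only after all of them reached the storage set.
    journal.records.clear()


def recover(journal, storage_set):
    """After a failure, re-committing the journal repairs incomplete writes."""
    commit(journal, storage_set)


journal, storage = Journal(), {}
journal.append(7, b"new")
storage[7] = (b"new", 0)   # simulated torn write: data landed, verifier did not
recover(journal, storage)  # storage[7] now holds matching data and verifier
```

Because the journal keeps each record until the commit has finished, re-running the commit after a crash is safe in this sketch: writing the same data and verifier twice leaves the storage set in the same state.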

First claim

Opening claim text (preview).

What is claimed is:

1. A method of storing data sets in a storage set provided by at least one storage device, the method involving a computer having a processor and comprising: executing, on the processor, instructions that cause the computer to: generate on a storage device a journal configured to store data sets respectively associated with a verifier; upon receiving a request to store a data set at a location in the storage set: compute a verifier for the data set; and store the verifier and the data set in the journal; select, from the journal, a batch comprising a first data set and a second data set to be committed to the storage set, such that writing the first data set and the second data set to the storage set together is faster than individually writing the first data set and the second data set to the storage set; before removing any of the data sets from the journal, for respective data sets of the batch, store the first data set, the verifier of the first data set, the second data set, and the second verifier of the second data set in the storage set; and only after storing all of the data sets of the batch in the storage set, remove the first data set and the second data set of the batch from the journal.

2. The method of claim 1: the journal comprising a sequence of data sets; and storing a data set in the journal comprising: appending the data set to the sequence of data sets.

3. The method of claim 1, wherein selecting the batch further comprises: selecting, from the journal, a first data set and a second data set to be stored in the storage set upon detecting a commit event selected from a commit event set comprising: a journal capacity event involving a capacity of the journal; a duration event involving a duration of the data sets stored in the journal; a commit request event involving a request to commit at least one data set in the journal to the storage set; and a storage device workload event involving a workload of at least one storage device of the storage set.

4. The method of claim 1, selecting the batch of data sets comprising: selecting for inclusion in the batch a first data set stored in the journal and to be stored at a first location in the storage set; and selecting for inclusion in the batch a second data set stored in the journal and to be stored at a second location that is near the first location in the storage set.

5. The method of claim 1, selecting the batch of data sets comprising: omitting from the batch a first data set stored in the journal and to be stored at a location in the storage set when the journal includes a second data set that is newer than the first data set and that is to be stored at the location in the storage set.

6. The method of claim 1: respective requests specifying a location in the storage set for the data set; and computing the verifier for a data set comprising: for a data set that is completely recorded in the journal, computing the verifier from the data set; and for a data set that is not completely recorded in the journal: reading an original version of the data set at the location in the storage set; reading an original verifier of the original version of the data set; removing the original version of the data set from the original verifier; and including the data set in the original verifier.

7. The method of claim 1, rebuilding the volatile memory representation of the journal comprising: for respective data sets stored in the journal, reading a location of the data set in the storage set; after reading the locations of the data sets, reading the data sets in the journal; and while reading the locations of the data sets, blocking requests from processes to access the storage set.

8. The method of claim 1: the computer comprising a volatile memory; and the instructions configured to: generate in the volatile memory a volatile memory representation of the journal stored on the storage device; and upon storing a data set in the journal, store the data set in the volatile memory representation of the journal.

9. The method of claim 8, the volatile memory representation of the journal indexed according to locations in the storage set where the data sets are to be stored.

10. The method of claim 8, the instructions configured to, upon receiving a request to read a data set: upon determining that the data set is stored in the volatile memory representation of the journal in the volatile memory, present the data set stored in the volatile memory representation; upon determining that the data set is stored in the journal on the storage device, present the data set stored in the journal; and upon determining that the data set is stored in the storage set stored on the storage device, present the data set stored in the storage set of the storage device.

11. The method of claim 8, the instructions configured to, after storing a batch to the storage set, remove the batch from the volatile memory representation of the journal.

12. The method of claim 11: the instructions configured to: upon storing a data set in the journal, store the data set in the volatile memory representation of the journal marked as unremovable; and upon storing a data set in the storage set, mark the data set stored in the volatile memory representation of the journal as removable; and removing data sets from the volatile memory representation of the journal in the volatile memory comprising: removing from the volatile memory representation of the journal only data sets marked as removable.

13. The method of claim 11: the computer comprising a write buffer associated with the journal stored on the storage device; the instructions configured to: detect commits of data sets from the write buffer to the storage set, and upon detecting a commit of a data set, mark the data set as committed in the volatile memory representation of the journal; and removing data sets from the journal comprising: removing from the journal only data sets marked as committed.

14. The method of claim 8: the volatile memory representation of the journal having a capacity of data sets; and generating the volatile memory representation of the journal in the volatile memory comprising: reserving capacity in the volatile memory to store: the capacity of data sets in the volatile memory representation of the journal; the verifiers of the data sets comprising the capacity of the volatile memory representation of the journal; and a buffer configured to store data sets to be stored in the storage set while data sets stored in the journal are being stored in the storage set.

15. The method of claim 8, the instructions configured to, upon recovering from a failure of the computer: using the journal, rebuild the volatile memory representation of the journal in the volatile memory of the computer; and after rebuilding the volatile memory representation of the journal in the volatile memory, reinitiate selecting batches of data sets stored in the journal to be stored in the storage set.

16. The method of claim 1: the computer comprising a write buffer associated with the storage device storing the journal; and the instructions configured to, while writing data sets and verifiers to the journal, bypass the write buffer.

17. The method of claim 1: the computer comprising a write buffer associated with the storage device storing the journal; and the instructions configured to, upon flushing the write buffer of the storage device storing the journal, identify a flush poin
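The batch selection recited in claims 1, 4, and 5 can be sketched as follows. This is an illustration under stated assumptions, not the claimed method itself: `Pending`, `select_batch`, `commit_batch`, the `seq` ordering field, and the `max_gap` nearness threshold are all hypothetical names and parameters, and the journal and storage set are modeled as an in-memory list and dict.

```python
from dataclasses import dataclass


@dataclass
class Pending:
    seq: int        # journal order; a larger value means a newer data set
    location: int   # target location in the storage set
    data: bytes


def select_batch(journal, max_gap=1):
    """Pick journaled data sets whose target locations are near each other."""
    # Keep only the newest data set per location; older ones are
    # superseded and omitted from the batch.
    newest = {}
    for rec in sorted(journal, key=lambda r: r.seq):
        newest[rec.location] = rec
    if not newest:
        return []
    # Start at the lowest location and extend the run while each
    # neighbour is within max_gap of the previous one.
    by_location = sorted(newest.values(), key=lambda r: r.location)
    batch = [by_location[0]]
    for rec in by_location[1:]:
        if rec.location - batch[-1].location > max_gap:
            break
        batch.append(rec)
    return batch


def commit_batch(journal, storage_set, batch):
    """Store the whole batch first; only then remove it from the journal."""
    for rec in batch:
        storage_set[rec.location] = rec.data
    committed = {rec.location: rec.seq for rec in batch}
    # Drop the committed records and any older records they superseded.
    journal[:] = [
        r for r in journal
        if not (r.location in committed and r.seq <= committed[r.location])
    ]


journal = [
    Pending(1, 10, b"a"),
    Pending(2, 50, b"x"),
    Pending(3, 11, b"b"),
    Pending(4, 10, b"c"),   # newer write to location 10 supersedes seq 1
]
storage = {}
commit_batch(journal, storage, select_batch(journal))
```

Grouping by adjacent locations is one plausible reading of "near" in claim 4; a real system might instead group by stripe or by device, which the claims leave open.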

Assignees

Inventors

Classifications

  • in relation to throughput

  • involving logging of persistent data for recovery

  • Parity data used in redundant arrays of independent storages, e.g. in RAID systems

  • Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Moss Darren, Mehra Karan, Nagar Rajeev, and 3 more
What technology area does this patent fall under?
Primary CPC classification G06F11/1076. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 5, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).