Transaction model for data stores using distributed file systems
US-9582520-B1 · Feb 28, 2017 · US
US11409705B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11409705-B2 |
| Application number | US-201916517436-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 19, 2019 |
| Priority date | Aug 26, 2013 |
| Publication date | Aug 9, 2022 |
| Grant date | Aug 9, 2022 |
Official abstract text for this publication.
Embodiments of the disclosure provide techniques for managing a log-structured solid state drive (SSD) format in a distributed storage system. SSDs in the distributed storage system maintain a journal of logical changes to storage objects to persist prepared and committed changes in the latency path. The journal includes metadata entries that describe changes and reference data pages. Dense data structures (such as a logical block addressing table) index the metadata entries. To reduce the amount of overhead in I/O operations, the distributed storage system maintains the dense data structures in memory rather than on disk.
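As a rough illustration of the arrangement the abstract describes, the sketch below keeps an append-only journal of metadata entries and indexes them with an in-memory table keyed by logical block address. Every name here (`Journal`, `MetadataEntry`, `lba_table`, `data_page`) is hypothetical and does not come from the patent, and a Python list stands in for the on-SSD journal.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MetadataEntry:
    """Hypothetical journal record describing one logical change."""
    seq: int        # position of the entry in the journal
    lba: int        # logical block address the change applies to
    data_page: int  # reference to the data page holding the payload


class Journal:
    """Append-only journal (a list stands in for the SSD) plus an in-memory index."""

    def __init__(self) -> None:
        self.entries: List[MetadataEntry] = []           # persisted log (assumed on SSD)
        self.lba_table: Dict[int, MetadataEntry] = {}    # dense index, kept only in memory

    def append(self, lba: int, data_page: int) -> MetadataEntry:
        """Record a change in the journal and point the index at the newest entry."""
        entry = MetadataEntry(seq=len(self.entries), lba=lba, data_page=data_page)
        self.entries.append(entry)
        self.lba_table[lba] = entry
        return entry

    def lookup(self, lba: int) -> Optional[MetadataEntry]:
        """Return the latest journal entry for a block, or None if it was never journaled."""
        return self.lba_table.get(lba)
```

Because the index lives only in memory in this sketch, a lookup never touches the journal itself, which is the overhead reduction the abstract points to. The cost is that the index has to be reconstructed from the journal after a restart.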
Opening claim text (preview).
We claim:
1. A storage system, comprising: a disk group comprising: at least one first non-volatile storage disk storing a plurality of objects; and at least one second non-volatile storage disk storing a journal for the disk group, the journal maintaining one or more entries corresponding to one or more logical changes to one or more objects of the plurality of objects stored on the at least one first non-volatile storage disk, wherein each entry comprises a metadata record describing at least one logical change to at least one object of the plurality of objects; a memory storing one or more data structures including a first data structure corresponding to the journal and a first operation type; and at least one processor configured to: upon an operation occurring on a block associated with an object of the plurality of objects stored on the at least one first non-volatile storage disk, insert a first entry corresponding to the operation into the journal, the first entry comprising a first metadata record describing a first logical change to the object based on the operation; upon determining an operation type of the first metadata record is the first operation type, insert the first metadata record into the first data structure; and upon receiving a read operation, determine from which one of the at least one first non-volatile storage disk and the at least one second non-volatile storage disk to read data based on the first data structure.
2. The storage system of claim 1, wherein the at least one processor is further configured to upon determining an operation type of a second metadata record is a second operation type, insert the second metadata record into a second data structure corresponding to the journal and the second operation type.
3. The storage system of claim 2, wherein the first operation type comprises a commit operation type, and wherein the second operation type comprises an overwrite or abort operation type.
4. The storage system of claim 1, wherein the at least one second non-volatile storage disk serves as at least one of a cache or buffer for the at least one first non-volatile storage disk.
5. The storage system of claim 1, wherein the at least one processor is further configured to upon determining the first entry corresponds to an overwritten block, remove a second metadata record corresponding to the overwritten block from the first data structure.
6. The storage system of claim 1, wherein the at least one processor is further configured to: upon detecting that the one or more data structures need to be rebuilt, determine a corresponding operation type of each corresponding metadata record of each corresponding entry of the one or more entries of the journal; and insert each corresponding metadata record into one of the one or more data structures based on the corresponding operation type of the corresponding metadata record.
7. The storage system of claim 6, wherein the at least one processor is further configured to while inserting each corresponding metadata record into one of the one or more data structures, upon determining that a second metadata record corresponds to an overwritten block, remove a previous metadata record corresponding to the overwritten block from the corresponding data structure of the one or more data structures.
8. A method of operating a storage system, comprising: wherein the storage system comprises a disk group comprising: at least one first non-volatile storage disk storing a plurality of objects; and at least one second non-volatile storage disk storing a journal for the disk group, the journal maintaining one or more entries corresponding to one or more logical changes to one or more objects of the plurality of objects stored on the at least one first non-volatile storage disk, wherein each entry comprises a metadata record describing at least one logical change to at least one object of the plurality of objects; maintaining, in a memory, one or more data structures including a first data structure corresponding to the journal and a first operation type; upon an operation occurring on a block associated with an object of the plurality of objects stored on the at least one first non-volatile storage disk, inserting a first entry corresponding to the operation into the journal, the first entry comprising a first metadata record describing a first logical change to the object based on the operation; upon determining an operation type of the first metadata record is the first operation type, inserting the first metadata record into the first data structure; and upon receiving a read operation, determining from which one of the at least one first non-volatile storage disk and the at least one second non-volatile storage disk to read data based on the first data structure.
9. The method of claim 8, further comprising upon determining an operation type of a second metadata record is a second operation type, inserting the second metadata record into a second data structure corresponding to the journal and the second operation type.
10. The method of claim 9, wherein the first operation type comprises a commit operation type, and wherein the second operation type comprises an overwrite or abort operation type.
11. The method of claim 8, wherein the at least one second non-volatile storage disk serves as at least one of a cache or buffer for the at least one first non-volatile storage disk.
12. The method of claim 8, further comprising upon determining the first entry corresponds to an overwritten block, removing a second metadata record corresponding to the overwritten block from the first data structure.
13. The method of claim 8, further comprising: upon detecting that the one or more data structures need to be rebuilt, determining a corresponding operation type of each corresponding metadata record of each corresponding entry of the one or more entries of the journal; and inserting each corresponding metadata record into one of the one or more data structures based on the corresponding operation type of the corresponding metadata record.
14. The method of claim 13, further comprising while inserting each corresponding metadata record into one of the one or more data structures, upon determining that a second metadata record corresponds to an overwritten block, removing a previous metadata record corresponding to the overwritten block from the corresponding data structure of the one or more data structures.
15. The method of claim 8, wherein the first entry comprises sequence data indicating a sequence of the first entry in the journal.
16. A non-transitory computer readable medium comprising instructions that when executed by at least one processor, cause the at least one processor to perform a method of operating a storage system, the method comprising: wherein the storage system comprises a disk group comprising: at least one first non-volatile storage disk storing a plurality of objects; and at least one second non-volatile storage disk storing a journal for the disk group, the journal maintaining one or more entries corresponding to one or more logical changes to one or more objects of the plurality of objects stored on the at least one first non-volatile storage disk, wherein each entry comprises a metadata record describing at least one logical change to at least one object of the plurality of objects; maintaining, in a memory, one or more data structures including a first data structure corresponding to the journal and a first operation type; upon an operation occurring on a block associated with an
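The rebuild step recited in claims 6, 7, 13, and 14 can be pictured as a replay of the journal into one in-memory structure per operation type. The following is a minimal sketch under stated assumptions, not the patented implementation: `OpType`, `Record`, `rebuild`, and `read_source` are hypothetical names, only commit and abort types are modeled, a newer record is assumed to supersede any older record for the same block in every structure, and a block is assumed to be served from the journal disk only while a commit record for it is indexed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class OpType(Enum):
    COMMIT = "commit"  # stands in for the "first operation type"
    ABORT = "abort"    # stands in for a "second operation type"


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata record carried by one journal entry."""
    seq: int     # sequence of the entry in the journal
    op: OpType   # operation type of the record
    block: int   # block the logical change applies to


def rebuild(journal: List[Record]) -> Dict[OpType, Dict[int, Record]]:
    """Replay the journal in sequence order into one table per operation type."""
    structures: Dict[OpType, Dict[int, Record]] = {op: {} for op in OpType}
    for record in sorted(journal, key=lambda r: r.seq):
        # Assumption: a newer record marks the block as overwritten, so any
        # previous record for that block is dropped before the new one is added.
        for table in structures.values():
            table.pop(record.block, None)
        structures[record.op][record.block] = record
    return structures


def read_source(structures: Dict[OpType, Dict[int, Record]], block: int) -> str:
    """Assumed routing rule: indexed commits are read from the journal disk."""
    return "journal_disk" if block in structures[OpType.COMMIT] else "object_disk"
```

For example, replaying a commit for block 5, a commit for block 9, and then an abort for block 9 leaves block 5 routed to the journal disk and block 9 routed to the object disk, because the abort record displaced the earlier commit.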
Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs · CPC title
involving logging of persistent data for recovery · CPC title