Techniques for automatically freeing space in a log-structured storage system
US-2016253104-A1 · Sep 1, 2016 · US
US9785366B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9785366-B1 |
| Application number | US-201514984864-A |
| Country | US |
| Kind code | B1 |
| Filing date | Dec 30, 2015 |
| Priority date | Dec 30, 2015 |
| Publication date | Oct 10, 2017 |
| Grant date | Oct 10, 2017 |
Official abstract text for this publication.
A method of writing data to persistent storage includes (a) for each data block of a set of data blocks, storing data of that data block at an offset within a log segment of the persistent storage in conjunction with a logical block address (LBA) of that data block on the persistent storage, a size of the log segment being larger than a size of each data block, (b) identifying a particular log segment of the persistent storage that has become filled with data blocks, and (c) upon identifying the particular log segment as having become filled, inserting pointers to respective data blocks stored within the particular log segment into respective locations defined by the respective LBA of each respective data block within a map tree.
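The three steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `LogStore`, `LogSegment`, and `BLOCKS_PER_SEGMENT` are hypothetical, a plain dictionary stands in for the map tree, and offsets are counted in block slots rather than bytes.

```python
from dataclasses import dataclass, field

BLOCKS_PER_SEGMENT = 4  # assumed capacity; the source only says a segment is larger than a block


@dataclass
class LogSegment:
    seg_id: int
    # Each slot holds (lba, data), so the LBA is stored alongside the block data.
    slots: list = field(default_factory=list)

    def is_full(self) -> bool:
        return len(self.slots) >= BLOCKS_PER_SEGMENT


class LogStore:
    """Hypothetical in-memory model of the write path described in the abstract."""

    def __init__(self):
        self.segments = {}
        self.map_tree = {}  # stand-in for the map tree: LBA -> (seg_id, offset)
        self.open_segment = self._new_segment()

    def _new_segment(self) -> LogSegment:
        seg = LogSegment(seg_id=len(self.segments))
        self.segments[seg.seg_id] = seg
        return seg

    def write(self, lba: int, data: bytes):
        seg = self.open_segment
        offset = len(seg.slots)
        seg.slots.append((lba, data))      # (a) store data at an offset, with its LBA
        if seg.is_full():                  # (b) identify a segment that has become filled
            self._insert_pointers(seg)     # (c) only now update the map tree
            self.open_segment = self._new_segment()
        return seg.seg_id, offset

    def _insert_pointers(self, seg: LogSegment) -> None:
        # Insert one pointer per block at the location keyed by that block's LBA.
        for offset, (lba, _data) in enumerate(seg.slots):
            self.map_tree[lba] = (seg.seg_id, offset)
```

In this sketch the map is touched once per filled segment rather than once per write, which is the deferral the abstract describes.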
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method performed by a computing device of writing data to persistent storage, the method comprising: for each data block of a set of data blocks, storing data of that data block at an offset within a log segment of the persistent storage in conjunction with a logical block address (LBA) of that data block on the persistent storage, a size of the log segment being larger than a size of each data block; identifying a particular log segment of the persistent storage that has become filled with data blocks; upon identifying the particular log segment as having become filled, inserting pointers to respective data blocks stored within the particular log segment into respective locations defined by the respective LBA of each respective data block within a map tree.
2. The method of claim 1 wherein the method further comprises, for each data block of the set of data blocks, upon storing data of that data block on the persistent storage, storing a descriptor in cache memory of the computing device in conjunction with data of that data block, the descriptor including: the LBA of that data block on the persistent storage; and a physical address of that data block on the persistent storage, the physical address including an identifier of the log segment where the data of that data block is stored on the persistent storage and an offset within the log segment where the data of that data block is stored.
3. The method of claim 2 wherein the method further comprises: for each data block of the set of data blocks, upon storing data of that data block on the persistent storage, marking a dirty flag within the descriptor as dirty; and upon identifying the particular log segment as having become filled, marking the dirty flag as clean within the descriptor in cache memory of each respective data block within the particular log segment.
4. The method of claim 3 wherein: the set of data blocks includes data blocks each belonging to one of a plurality of streams, each stream of the plurality of streams having a respective open log segment to which data blocks belonging to that stream are stored; for each data block of the set of data blocks, storing the data of that data block at the offset within the log segment of the persistent storage includes: searching cache memory of the computing device for a descriptor having the LBA of that data block; if such a descriptor is found, then reading the dirty flag of that descriptor, and if the dirty flag is marked dirty, then reading the identifier of the log segment where the data of that data block is stored from the descriptor and storing data of that data block in the identified log segment; and otherwise, storing data of that data block to the respective open log segment of the stream to which that data block belongs.
5. The method of claim 4 wherein the plurality of streams includes: a first stream for data blocks that are written to in a random manner; and a second stream for data blocks that are written to in LBA sequence.
6. The method of claim 4 wherein the plurality of streams includes: a first stream for data blocks that are written to by a first application; and a second stream for data blocks that are written to by a second application, different than the first application.
7. The method of claim 4 wherein the plurality of streams includes: a first stream for data blocks that are written to by a first processing core; and a second stream for data blocks that are written to by a second processing core, different than the first processing core.
8. The method of claim 4 wherein: the plurality of streams includes: a first stream for data blocks that are written to by a first storage processor of the computing device; and a second stream for data blocks that are written to by a second storage processor of the computing device, different than the first storage processor, the first storage processor and the second storage processor each having their own respective separate cache memory; and storing the descriptor in cache memory of the computing device in conjunction with data of that data block includes storing the descriptor in cache memory of both the first storage processor and the second storage processor.
9. The method of claim 8 wherein: searching cache memory of the computing device for the descriptor having the LBA of that data block includes searching the cache memory of the one of the first storage processor and the second storage processor which is writing that data block; and reading the identifier of the log segment where the data of that data block is stored from the descriptor and storing data of that data block in the identified log segment includes: determining whether the identified log segment belongs to a stream for data blocks that are written to by the one of the first storage processor and the second storage processor which is writing that data block; and based on the determination, selectively: if the determination is affirmative, storing, by the one of the first storage processor and the second storage processor which is writing that data block, the data of that data block in the identified log segment; and if the determination is negative, sending, by the one of the first storage processor and the second storage processor which is writing that data block to the other of the first storage processor and the second storage processor, the data of that data block to be stored in the identified log segment by the other of the first storage processor and the second storage processor.
10. A computer program product comprising a non-transitory computer-readable storage medium storing a set of instructions, which, when executed by a computing device, causes the computing device to write data to persistent storage by performing the following operations: for each data block of a set of data blocks, storing data of that data block at an offset within a log segment of the persistent storage in conjunction with a logical block address (LBA) of that data block on the persistent storage, a size of the log segment being larger than a size of each data block; identifying a particular log segment of the persistent storage that has become filled with data blocks; upon identifying the particular log segment as having become filled, inserting pointers to respective data blocks stored within the particular log segment into respective locations defined by the respective LBA of each respective data block within a map tree.
11. An apparatus comprising: network interface circuitry for connecting to a host device configured to issue write commands for a set of data blocks; storage circuitry configured to interface with a persistent storage device; and processing circuitry coupled to memory configured to write data to the persistent storage device by performing the following operations: for each data block of the set of data blocks, storing data of that data block at an offset within a log segment of the persistent storage device in conjunction with a logical block address (LBA) of that data block on the persistent storage device, a size of the log segment being larger than a size of each data block; identifying a particular log segment of the persistent storage device that has become filled with data blocks; upon identifying the particular log segment as having become filled, inserting pointers to respective data blocks stored within the particular log segment into respective locations defined by the respective LBA of each respective data block within a map tree.
12. The computer program product of claim 10 wherein the set of instructions, when
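The cached descriptor, dirty flag, and per-stream open segments recited in claims 2 to 4 can be illustrated with a small Python model. It is a sketch under assumptions, not the claimed implementation: `StreamedLog`, `Descriptor`, and `SEGMENT_CAPACITY` are hypothetical names, a dictionary stands in for both the cache and the map tree, and a rewrite of a dirty block is assumed to overwrite its existing slot (the claims only require that it go to the identified segment).

```python
from dataclasses import dataclass

SEGMENT_CAPACITY = 2  # assumed blocks per segment, kept tiny for illustration


@dataclass
class Descriptor:
    lba: int           # logical block address (claim 2)
    seg_id: int        # physical address, part 1: log segment identifier
    offset: int        # physical address, part 2: slot within that segment
    dirty: bool = True  # marked dirty when the block is stored (claim 3)


class StreamedLog:
    def __init__(self, streams):
        self.cache = {}      # LBA -> Descriptor, a stand-in for cache memory
        self.segments = {}   # seg_id -> list of [lba, data] slots
        self.map_tree = {}   # stand-in for the map tree: LBA -> (seg_id, offset)
        self.next_seg_id = 0
        # Each stream has its own open log segment (claim 4).
        self.open_seg = {s: self._new_segment() for s in streams}

    def _new_segment(self) -> int:
        seg_id = self.next_seg_id
        self.next_seg_id += 1
        self.segments[seg_id] = []
        return seg_id

    def write(self, stream, lba, data) -> Descriptor:
        desc = self.cache.get(lba)
        if desc is not None and desc.dirty:
            # Dirty descriptor found: store in the segment it identifies,
            # regardless of which stream issued this write.
            self.segments[desc.seg_id][desc.offset] = [lba, data]
            return desc
        # Otherwise append to the open segment of the writing stream.
        seg_id = self.open_seg[stream]
        slots = self.segments[seg_id]
        slots.append([lba, data])
        desc = Descriptor(lba, seg_id, len(slots) - 1)
        self.cache[lba] = desc
        if len(slots) >= SEGMENT_CAPACITY:
            self._close(stream, seg_id)
        return desc

    def _close(self, stream, seg_id) -> None:
        # Segment filled: insert pointers into the map and mark descriptors clean.
        for offset, (lba, _data) in enumerate(self.segments[seg_id]):
            self.map_tree[lba] = (seg_id, offset)
            self.cache[lba].dirty = False
        self.open_seg[stream] = self._new_segment()
```

With streams named after claim 5, for example `StreamedLog(["random", "sequential"])`, a second write to an LBA whose descriptor is still dirty lands in the original segment even when it arrives on the other stream; once that segment fills and the descriptor is marked clean, the next write to the same LBA goes to the writing stream's open segment.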
Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays · CPC title
Replication mechanisms · CPC title
Management of blocks · CPC title
Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory · CPC title
Improving I/O performance · CPC title