Storage medium storing control program, method of controlling information processing device, information processing system, and information processing device
US-2015143032-A1 · May 21, 2015 · US
US9367448B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9367448-B1 |
| Application number | US-201313909875-A |
| Country | US |
| Kind code | B1 |
| Filing date | Jun 4, 2013 |
| Priority date | Jun 4, 2013 |
| Publication date | Jun 14, 2016 |
| Grant date | Jun 14, 2016 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A garbage collector of a storage system traverses a namespace of a file system of the storage system to verify data integrity of segments. The namespace identifies files that are represented by segments arranged in multiple levels in a hierarchy, where an upper level segment includes one or more references to one or more lower level segments, and at least one segment is referenced by multiple files. Traversing the namespace includes computing and verifying checksums all segments in a level-by-level manner, where checksums of an upper level are verified before any of checksums of a lower level are verified. Upon all checksums of all levels have been verified, a garbage collection process is performed on the segments stored in the storage system.
Opening claim text (preview).
What is claimed is: 1. A computer-implemented method of verifying data integrity for garbage collection, the method comprising: traversing, by a garbage collector executed by a processor, a namespace of a file system of a storage system to verify data integrity of segments, the namespace identifying a plurality of files that are represented by a plurality of segments arranged in a plurality of levels in a hierarchy, wherein an upper level segment includes one or more references to one or more lower level segments, wherein at least one segment is referenced by multiple files, wherein traversing the namespace includes verifying data integrity for all segments in a level-by-level manner comprising: verifying data integrity from checksums for an upper level segment, and verifying, in response to data integrity verification of the upper level segment, data integrity of a lower level segment from a lower level segment checksum, for each of the levels in the hierarchy, iteratively performing the following: obtaining fingerprints of all segments of a current level, computing checksums from the fingerprints of all segments of the current level, and adding the checksums of the current level to a parent checksum of the current level, computing checksums from the fingerprints of the current level that are read from the storage device, adding the checksums of the current level to a child checksum of the current level, for each of the fingerprints of the current level, marking a corresponding bit of a walk vector to indicate that a corresponding segment has been processed, wherein the walk vector includes a plurality of bits, each bit corresponding to one of the segments in the namespace, for each of the segments of the current level, retrieving a fingerprint of the segment from its storage location of a storage device, and marking a corresponding bit of a read vector to indicate that the segment has been read from its storage location; and upon verifying data integrity for the plurality of levels, performing a garbage collection process to reclaim storage from segments not referenced by a file in the storage system. 2. The method of claim 1 , further comprising: retrieving a data portion of each segment of the current level from the storage device; identifying one or more child segments of a child level with respect to the current level; and performing checksum verification on the child level after all segments of the current level have been processed. 3. The method of claim 2 , further comprising: for each checksum added to a parent checksum of each level, incrementing a parent counter associated with the corresponding level; and for each checksum added to a child checksum of each level, incrementing a child counter associated with the corresponding level. 4. The method of claim 3 , further comprising: after all bits of the walk vector have been populated, comparing a parent checksum with a child checksum of each level to determine if they match; comparing a parent counter and a child counter of each level to determine if they match; and performing the garbage collection process if all parent checksums and child checksums match and all parent counters and child counters match. 5. A non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations of verifying data integrity for garbage collection, the operations comprising: traversing, by a garbage collector executed by a processor, a namespace of a file system of a storage system to verify data integrity of segments, the namespace identifying a plurality of files that are represented by a plurality of segments arranged in a plurality of levels in a hierarchy, wherein an upper level segment includes one or more references to one or more lower level segments, wherein at least one segment is referenced by multiple files, wherein traversing the namespace includes verifying data integrity for all segments in a level-by-level manner comprising: verifying data integrity from checksums for an upper level segment, and verifying, in response to data integrity verification of the upper level segment, data integrity of a lower level segment from a lower level segment checksum, for each of the levels in the hierarchy, iteratively performing the following: obtaining fingerprints of all segments of a current level, computing checksums from the fingerprints of all segments of the current level, and adding the checksums of the current level to a parent checksum of the current level, computing checksums from the fingerprints of the current level that are read from the storage device, adding the checksums of the current level to a child checksum of the current level, for each of the fingerprints of the current level, marking a corresponding bit of a walk vector to indicate that a corresponding segment has been processed, wherein the walk vector includes a plurality of bits, each bit corresponding to one of the segments in the namespace, for each of the segments of the current level, retrieving a fingerprint of the segment from its storage location of a storage device, and marking a corresponding bit of a read vector to indicate that the segment has been read from its storage location; and upon verifying data integrity for the plurality of levels, performing a garbage collection process to reclaim storage from segments not referenced by a file in the storage system. 6. The medium of claim 5 , wherein the operations further comprise: retrieving a data portion of each segment of the current level from the storage device; identifying one or more child segments of a child level with respect to the current level; and performing checksum verification on the child level after all segments of the current level have been processed. 7. The medium of claim 6 , wherein the operations further comprise: for each checksum added to a parent checksum of each level, incrementing a parent counter associated with the corresponding level; and for each checksum added to a child checksum of each level, incrementing a child counter associated with the corresponding level. 8. The medium of claim 7 , wherein the operations further comprise: after all bits of the walk vector have been populated, comparing a parent checksum with a child checksum of each level to determine if they match; comparing a parent counter and a child counter of each level to determine if they match; and performing the garbage collection process if all parent checksums and child checksums match and all parent counters and child counters match. 9. A data storage system, comprising: a processor; and a memory coupled to the processor for storing instructions which when executed from the memory, cause the processor to perform operations, the operations including traversing, by a garbage collector executed by a processor, a namespace of a file system of a storage system to verify data integrity of segments, the namespace identifying a plurality of files that are represented by a plurality of segments arranged in a plurality of levels in a hierarchy, wherein an upper level segment includes one or more references to one or more lower level segments, wherein at least one segment is referenced by multiple files, wherein traversing the namespace includes verifying data integrity for all segments in a level-by-level manner comprising: verifying data integrity from checksums for an upper level segment, and verifying, in response to data integrity verification of the upper level segment, data integrity of a lower level segment from a lower level segment checksum, for each of the levels in the hierarchy, iteratively performing the fol
Saving storage space on storage systems · CPC title
Garbage collection, i.e. reclamation of unreferenced memory · CPC title
Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket · CPC title
Disk arrays, e.g. RAID, JBOD · CPC title
De-duplication techniques · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.