Storage system garbage collection and defragmentation
US-2022179828-A1 · Jun 9, 2022 · US
US11755366B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11755366-B2 |
| Application number | US-202017008874-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 1, 2020 |
| Priority date | Sep 1, 2020 |
| Publication date | Sep 12, 2023 |
| Grant date | Sep 12, 2023 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
The technology describes scanning tree data structures (trees) for multiple processes, at least partly in parallel. A service scans a tree from a beginning tree element to an ending tree element on behalf of a process; while scanning, another process can join in the scan at an intermediate tree element location (e.g., a key). For the subsequent process, the service scans the tree based on the intermediate location to the tree end, thereby visiting tree elements in parallel until the tree end, then continuing from the tree beginning element to the intermediate location for the subsequent process. The service basically completes a full carousel-type revolution for each process. One or more other processes can join an ongoing scan at any time, facilitating further parallel tree element visits, while still obtaining a full scan of the entire set of tree elements. The service handles changing tree versions during the scanning.
Opening claim text (preview).
What is claimed is: 1. A system, comprising: a processor; and a memory that stores executable instructions that, when executed by the processor of a data storage system, facilitate performance of operations, the operations comprising: scanning a first tree data structure, in a first partial traversal, to visit tree elements for a first process starting from a tree beginning to an intermediate tree location visited before a second process joins the scanning; continuing the scanning, in a second partial traversal, to visit further tree elements for the first process and the second process starting from the next tree location after the intermediate tree location, until an end of the tree is visited; continuing the scanning, in a third partial traversal, to visit additional tree elements for the second process starting from the tree beginning and ending the scanning when the intermediate tree location is visited; and after the first tree data structure version has changed to a second tree data structure version, resuming the scanning in the second tree data structure version based on a checkpoint corresponding to a last-visited tree element of the first tree data structure version comprising resuming from a next tree element after the checkpoint while presuming a tree element corresponding to the checkpoint had not been deleted. 2. The system of claim 1 , wherein the operations further comprise maintaining the checkpoint corresponding to the last-visited tree element of the first tree data structure version. 3. The system of claim 1 , wherein the checkpoint further corresponds to the tree element, and wherein the tree element has been deleted in the second tree data structure version. 4. The system of claim 1 , wherein the tree beginning is a first tree beginning, wherein the checkpoint further corresponds to a tree element with no next tree element, and wherein the resuming the scanning comprises resuming from a second tree beginning corresponding to the continuing the scanning in the third partial traversal. 5. The system of claim 1 , wherein the tree data structure comprises a directory table of the data storage system. 6. The system of claim 1 , wherein the tree data structure comprises an object table of the data storage system. 7. The system of claim 6 , wherein the scanning for the first process scans the object table to detect unused data chunks corresponding to user data or tree data. 8. The system of claim 6 , wherein the scanning for the first process scans the object table to rebuild a Bloom filter associated with the object table. 9. The system of claim 6 , wherein the scanning for the first process scans the object table to determine consistency between the object table and another directory table other than the object table. 10. The system of claim 1 , wherein the scanning is performed by a data service, and wherein the first process comprises a first client of the data service, and the second process comprises a second client of the data service. 11. A method, comprising: scanning, via a processor of a data storage service, a tree data structure to report on visited tree elements, the scanning comprising: detecting a first event that triggers a tree scanning operation for a first process, and in response to the detecting the first event, scanning the tree data structure from a tree beginning location to a tree ending location to report on tree elements visited according to the first process; and detecting a second event that triggers a tree scanning operation for a second process before the scanning the tree data structure for the first process has completed, and in response to the detecting the second event, scanning the tree data structure based on an intermediate tree location corresponding to a current location of the tree scanning for the first process to the tree ending location and scanning the tree from the tree beginning to the intermediate tree location, to report on the tree elements visited according to the second process, wherein the tree data structure corresponds to a first tree data structure version that changes to a second tree data structure version, and further comprising: maintaining a checkpoint corresponding to a last-visited tree element of the first tree data structure version, and resuming the scanning operation in the second tree data structure version based on the checkpoint comprising resuming from a next tree element after the checkpoint while presuming a tree element corresponding to the checkpoint had not been deleted. 12. The method of claim 11 , wherein the intermediate tree location is a first intermediate tree location, and further comprising detecting a third event that triggers a tree scanning operation for a third process before the scanning the tree for the second process has completed, and in response the detecting the third event, scanning the tree based on a second intermediate tree location corresponding to a current location of the tree scanning for the second process to the tree ending location and scanning the tree from the tree beginning to the second intermediate tree location, to report on the tree elements visited according to the third process. 13. The method of claim 12 , further comprising detecting a fourth event that triggers a tree scanning operation for a fourth process before the scanning the tree for the third process has completed, and in response the detecting the fourth event, scanning the tree based on a third intermediate tree location corresponding to a current location of the tree scanning for the third process to the tree ending location and scanning the tree from the tree beginning to the third intermediate tree location, to report on the tree elements visited according to the fourth process. 14. The method of claim 11 , wherein the checkpoint further corresponds to a tree element that has been deleted in the second tree data structure version, and wherein the resuming the scanning operation comprises resuming from a next tree element after the checkpoint as if the tree element corresponding to the checkpoint had not been deleted. 15. The method of claim 11 , wherein the checkpoint further corresponds to a tree element with no next tree element, and wherein the resuming the scanning comprises resuming from the tree beginning. 16. A non-transitory machine-readable medium, comprising executable instructions that, when executed by a processor of a data storage system, facilitate performance of operations, the operations comprising: performing a tree scanning operation for a first process, comprising traversing the tree data structure from a tree beginning location to a tree ending location to report on tree elements visited pursuant to the first process; continuing the tree scanning operation for a second process before the traversing the tree data structure for the first process has completed, comprising traversing the tree data structure from a first intermediate tree location corresponding to a current location of the tree traversal for the first process to the tree ending location, and traversing the tree data structure from the tree beginning location to the first intermediate tree location, to report on the tree elements visited pursuant to the second process; and continuing the tree scanning operation for a third process before the traversing the tree data structure for the second process has completed, comprising traversing the tree data structure from a second intermediate tree location corresponding to a current location of the tree traversal for the second process to the tree ending location, and traversing
File access structures, e.g. distributed indices (arrangements of input from, or output to, record carriers G06F3/06) · CPC title
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title
Pointer or reference processing operations · CPC title
Scheduler internals · CPC title
Ensuring data consistency and integrity · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.