Consolidating garbage collector in a data storage system

US2021191856A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2021191856-A1
Application number: US-201916718776-A
Country: US
Kind code: A1
Filing date: Dec 18, 2019
Priority date: Dec 18, 2019
Publication date: Jun 24, 2021
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The technology described herein is directed towards consolidating garbage collection of data stored in data structures such as chunks, to facilitate efficient garbage collection. Low capacity utilization chunks are detected as source chunks, and live data of an object (e.g., in segments) is copied from the source chunks to new destination chunk(s). A source chunk is deleted when it no longer contains live data. By copying the data on an object-determined basis, new chunks contain more coherent object data, which increases the possibility of future chunk deletion without data copying or with a reduced amount of copying. When data segments of an object are adjacent, the consolidating garbage collector may unite them into a united segment, which reduces an amount of system metadata per object. New chunks can be associated with a generation number (e.g., indicating the oldest previous generation) to further facilitate more efficient future chunk deletion.
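The flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Segment` type, the `consolidate` function, the `CHUNK_SIZE` capacity, and the `LOW_UTILIZATION` threshold are all hypothetical names and values chosen for the example, and segments are assumed to fit within one chunk and are never split.

```python
from collections import defaultdict
from dataclasses import dataclass

CHUNK_SIZE = 100        # hypothetical chunk capacity, in bytes
LOW_UTILIZATION = 0.5   # hypothetical "low capacity utilization" threshold


@dataclass(frozen=True)
class Segment:
    """One live segment of an object (hypothetical model)."""
    object_id: str
    chunk_id: int
    length: int


def consolidate(segments, next_chunk_id):
    """Copy live segments out of low-utilization chunks, object by object.

    Returns (updated segment list, set of source chunk ids that can be deleted).
    """
    # 1. Detect source chunks: chunks whose live bytes fall below the threshold.
    live_bytes = defaultdict(int)
    for seg in segments:
        live_bytes[seg.chunk_id] += seg.length
    sources = {c for c, used in live_bytes.items()
               if used / CHUNK_SIZE < LOW_UTILIZATION}

    # 2. Group segments by owning object so each object's data lands together.
    by_object = defaultdict(list)
    for seg in segments:
        by_object[seg.object_id].append(seg)

    # 3. Copy each object's segments from source chunks into destination chunks.
    result = []
    dest_id, dest_free = None, 0
    for object_id, segs in by_object.items():
        for seg in segs:
            if seg.chunk_id not in sources:
                result.append(seg)          # segment stays where it is
                continue
            if dest_id is None or dest_free < seg.length:
                # Allocate a new destination chunk when the current one is full.
                dest_id, dest_free = next_chunk_id, CHUNK_SIZE
                next_chunk_id += 1
            result.append(Segment(object_id, dest_id, seg.length))
            dest_free -= seg.length

    # 4. Every live segment in a source chunk has been copied, so all
    #    source chunks no longer contain live data and can be deleted.
    return result, sources


segments = [
    Segment("A", 1, 10), Segment("A", 2, 20),
    Segment("B", 1, 15), Segment("B", 3, 80),
]
new_segments, deleted = consolidate(segments, next_chunk_id=10)
```

In this example chunks 1 and 2 are below the threshold, so their live segments are copied into new chunk 10 grouped by object, while the well-utilized chunk 3 is left alone.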

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a processor, and a memory that stores executable instructions that, when executed by the processor, facilitate performance of operations, the operations comprising: determining garbage collection candidate chunks in a data storage system; scanning object metadata of the data storage system to determine a selected object with live segments in the garbage collection candidate chunks; and copying the live segments of the selected object from the garbage collection candidate chunks to contiguous destination chunk storage space.

2. The system of claim 1, wherein the operations further comprise garbage collecting a garbage collection candidate chunk in response to no live segment of the garbage collection candidate chunk remains being determined to be uncopied.

3. The system of claim 1, wherein the contiguous destination chunk storage space comprises storage space of a single chunk.

4. The system of claim 1, wherein the contiguous destination chunk storage space comprises storage space of a sequence of two or more chunks, wherein at least one chunk of the sequence of two or more chunks comprises contiguous storage space.

5. The system of claim 1, wherein the operations further comprise determining one or more expanded source chunks that maintain one or more live segments of the selected object, and copying the one or more live segments of the selected object from the expanded source chunks to the contiguous destination chunk storage space.

6. The system of claim 1, wherein the operations further comprise detecting adjacent live segments of the object in the contiguous destination chunk storage space, and modifying object metadata of the adjacent live segments to unite the adjacent live segments into a combined segment in the contiguous destination chunk storage space.

7. The system of claim 1, wherein a first one of the live segments corresponds to a first chunk associated with a first generation number, wherein a second one of the live segments corresponds to a second chunk associated with a second generation number that is different from the first generation number, and wherein the operations further comprise selecting the first generation number, and associating the first generation number with the contiguous chunk storage space.

8. The system of claim 7, wherein the selecting the first generation number comprises determining that the first generation number represents an older generation than a generation represented by the second generation number.

9. A method comprising: determining, by a system comprising a processor, objects with live segments in source chunks, the source chunks comprising garbage collection candidate chunks; consolidating the live segments of respective objects into respective destination chunk storage space corresponding to one or more chunks; and garbage collecting the garbage collection candidate chunks when no live segments remain uncopied in the garbage collection candidate chunks.

10. The method of claim 9, wherein the determining the objects with live segments in the source chunks comprises locating one or more expanded chunks that contain live segments.

11. The method of claim 9, wherein the consolidating the live segments of the respective objects into the respective destination chunk storage space comprises accessing metadata of an object to determine segments of the live segments of the object, determining that the segments of the live segments of the object will fit into a single chunk's space, and copying the segments of the live segments of the object to a single chunk.

12. The method of claim 9, wherein the consolidating the live segments of the respective objects into the respective destination chunk storage space comprises accessing metadata of an object to determine segments of the live segments of the object, determining that the segments of the live segments of the object will not fit into a single chunk's space, and copying the segments of the live segments of the object to two or more chunks.

13. The method of claim 9, further comprising detecting adjacent live segments of an object in the destination chunk storage space, and modifying object metadata of the adjacent live segments to unite the adjacent live segments into a combined segment in the destination chunk storage space.

14. The method of claim 9, wherein a first live segment of an object corresponds to a first chunk associated with a first generation number, wherein a second live segment of the object corresponds to a second chunk associated with a second generation number that is older than the first generation number, and wherein the method further comprises associating the second generation number with the destination chunk storage space.

15. A machine-readable storage medium, comprising executable instructions that, when executed by a processor, facilitate performance of operations, the operations comprising: determining source chunks comprising chunks with low usage capacity in a data storage system; processing object metadata of the data storage system to determine a selected object with live segments in the source chunks; allocating a destination chunk; and consolidating the live segments of the selected object from the source chunks into the destination chunk.

16. The machine-readable storage medium of claim 15, wherein the operations further comprise determining that a source chunk that contained a live segment of the object before the consolidating has no live segments of an object remaining therein after the consolidating, and, in response to the determining, garbage collecting the source chunk.

17. The machine-readable storage medium of claim 15, wherein the selected object with the live segments in the source chunks comprises a first selected object, and wherein the operations further comprise processing the object metadata of the data storage system to determine a second selected object with live segments in the source chunks, and consolidating the live segments of the second selected object from the source chunks into the destination chunk.

18. The machine-readable storage medium of claim 15, wherein the operations further comprise detecting adjacent live segments of the object in the destination chunk, and modifying the object metadata of the adjacent to unite the adjacent live segments into a combined segment in the destination chunk.

19. The machine-readable storage medium of claim 15, wherein a first live segment of an object corresponds to a first source chunk associated with a first generation number, wherein a second live segment of the object corresponds to a second source chunk associated with a second generation number that is older than the first generation number, and wherein the operations further comprise, associating the second generation number with the destination chunk.

20. The machine-readable storage medium of claim 15, wherein the determining the source chunks comprises, scanning the object table to determine low-capacity utilization chunks and to determine an expanded chunk set comprising at least one chunk that contains at least one live segment of the selected object.
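Two of the dependent-claim ideas, uniting adjacent segments (claims 6, 13, 18) and carrying the older generation number over to the destination (claims 7, 8, 14, 19), can be illustrated with a short sketch. The `Extent` type and both function names are hypothetical. The sketch assumes that a smaller generation number means an older generation, and that segments may be united only when they are contiguous both within the object and within the destination chunk; the claims themselves do not spell out either detail.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """Metadata for one live segment of a single object (hypothetical model)."""
    object_offset: int   # where the segment starts within the object
    chunk_id: int        # destination chunk holding the segment
    chunk_offset: int    # where the segment starts within the chunk
    length: int


def unite_adjacent(extents):
    """Merge extents that are contiguous in both the object and the chunk."""
    merged = []
    for e in sorted(extents, key=lambda x: (x.chunk_id, x.chunk_offset)):
        last = merged[-1] if merged else None
        if (last is not None
                and last.chunk_id == e.chunk_id
                and last.chunk_offset + last.length == e.chunk_offset
                and last.object_offset + last.length == e.object_offset):
            # Extend the previous extent instead of keeping a second record.
            last.length += e.length
        else:
            merged.append(Extent(e.object_offset, e.chunk_id,
                                 e.chunk_offset, e.length))
    return merged


def destination_generation(source_generations):
    """Pick the oldest source generation (assumed: smaller number = older)."""
    return min(source_generations)


extents = [
    Extent(0, 7, 0, 10),
    Extent(10, 7, 10, 20),   # directly follows the first extent
    Extent(50, 7, 40, 5),    # gap before this one, so it stays separate
]
united = unite_adjacent(extents)
generation = destination_generation([5, 3, 9])
```

Here the first two extents collapse into one 30-byte extent, which is the sense in which uniting reduces per-object metadata, and the destination chunk inherits generation 3.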

Assignees

EMC IP Holding Co LLC

Inventors

Classifications

  • Generational garbage collection · CPC title

  • Garbage collection, i.e. reclamation of unreferenced memory · CPC title

  • Latency reduction · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021191856A1 cover?
The technology described herein is directed towards consolidating garbage collection of data stored in data structures such as chunks, to facilitate efficient garbage collection. Low capacity utilization chunks are detected as source chunks, and live data of an object (e.g., in segments) is copied from the source chunks to new destination chunk(s). A source chunk is deleted when it no longer co…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F12/0253. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 24, 2021 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).