Shrinking segment cleaning algorithm in an object storage

US11435935B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11435935-B2
Application numberUS-202017100663-A
CountryUS
Kind codeB2
Filing dateNov 20, 2020
Priority dateNov 20, 2020
Publication dateSep 6, 2022
Grant dateSep 6, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for cleaning an object storage having a plurality of segments is provided. Each segment includes an identifier through which the segment is accessed. The method identifies a first segment in the plurality of segments. The first segment includes a first identifier and a first size. The method determines that a utilization ratio for the first segment is below a threshold. As a result, the method generates a second segment from the first segment, such that the second segment includes a second identifier that is the same as the first identifier and a second size that is smaller than the first size. The method then writes the second segment to the object storage.

First claim

Opening claim text (preview).

We claim: 1. A method for cleaning an object storage having a plurality of segments, each segment having an identifier through which the segment is accessed, comprising: identifying a first segment in the plurality of segments, the first segment having a first identifier and a first size wherein each segment of the plurality of segments comprises one or more data blocks that are either live or dead, wherein the first segment comprises at least one dead data block that is not included in a second segment; determining that a utilization ratio for the first segment is below a threshold; generating the second segment from the first segment, the second segment having a second identifier that is the same as the first identifier and a second size that is smaller than the first size; and writing the second segment to the object storage where a table stores data that indicates a corresponding identifier, a corresponding number of live data blocks, and a corresponding size of each of the plurality of segments, the method further comprising updating the table to include the second size as a size of the first segment. 2. The method of claim 1 , wherein the utilization ratio for the first segment comprises a ratio between a number of live blocks of the first segment and the size of the first segment. 3. The method of claim 1 , further comprising: receiving a first read instruction comprising the first identifier; returning the first segment in response to receiving the first read instruction; receiving a second read instruction comprising the first identifier; and returning the second segment in response to receiving the second read instruction. 4. The method of claim 1 , wherein the object storage comprises an atomic update with eventual consistency data storage. 5. The method of claim 1 , wherein a table stores a plurality of key-value pairs, each key-value pair comprising a logical block address that is mapped to a segment identifier of a segment of the plurality of segments in the object storage, the method further comprising, after identifying the first segment: identifying one or more logical blocks stored in the first segment by reading metadata associated with the first segment; and determining which of the one or more logical blocks is a live block using the table, wherein the utilization ratio for the first segment is determined based on a number of live blocks in the first segment and the first size of the first segment. 6. The method of claim 5 , wherein the second segment comprises only live blocks of the first segment, wherein the table is not updated for the second segment after generating the second segment from the first segment. 7. A non-transitory computer readable medium comprising instructions that, when executed by one or more processors of a computing system, cause the computing system to perform a method for cleaning an object storage having a plurality of segments, each segment having an identifier through which the segment is accessed, the method comprising: identifying a first segment in the plurality of segments, the first segment having a first identifier and a first size wherein each segment of the plurality of segments comprises one or more data blocks that are either live or dead, wherein the first segment comprises at least one dead data block that is not included in a second segment; determining that a utilization ratio for the first segment is below a threshold; generating the second segment from the first segment, the second segment having a second identifier that is the same as the first identifier and a second size that is smaller than the first size; and writing the second segment to the object storage wherein a table stores data that indicates a corresponding identifier, a corresponding number of live data blocks, and a corresponding size of each of the plurality of segments, the method further comprising updating the table to include the second size as a size of the first segment. 8. The non-transitory computer readable medium of claim 7 , wherein the utilization ratio for the first segment comprises a ratio between a number of live blocks of the first segment and the size of the first segment. 9. The non-transitory computer readable medium of claim 7 , the method further comprising: receiving a first read instruction comprising the first identifier; returning the first segment in response to receiving the first read instruction; receiving a second read instruction comprising the first identifier; and returning the second segment in response to receiving the second read instruction. 10. The non-transitory computer readable medium of claim 7 , wherein a table stores a plurality of key-value pairs, each key-value pair comprising a logical block address that is mapped to a segment identifier of a segment of the plurality of segments in the object storage, the method further comprising, after identifying the first segment: identifying one or more logical blocks stored in the first segment by reading metadata associated with the first segment; and determining which of the one or more logical blocks is a live block using the table, wherein the utilization ratio for the first segment is determined based on a number of live blocks in the first segment and the first size of the first segment. 11. The non-transitory computer readable medium of claim 10 , wherein the second segment comprises only live blocks of the first segment, wherein the table is not updated for the second segment after generating the second segment from the first segment. 12. A computer system, comprising: a memory; and a processor coupled to the memory, the processor being configured to: identify a first segment in a plurality of segments, the first segment having a first identifier and a first size wherein each segment of the plurality of segments comprises one or more data blocks that are either live or dead, wherein the first segment comprises at least one dead data block that is not included in a second segment; determine that a utilization ratio for the first segment is below a threshold; generate the second segment from the first segment, the second segment having a second identifier that is the same as the first identifier and a second size that is smaller than the first size; and write the second segment to an object storage wherein a table stores data that indicates a corresponding identifier, a corresponding number of live data blocks, and a corresponding size of each of the plurality of segments, the processor being further configured to update the table to include the second size as a size of the first segment. 13. The computer system of claim 12 , wherein the utilization ratio for the first segment comprises a ratio between a number of live blocks of the first segment and the size of the first segment. 14. The computer system of claim 12 , wherein the processor is further configured to: receive a first read instruction comprising the first identifier; return the first segment in response to receiving the first read instruction; receive a second read instruction comprising the first identifier; and return the second segment in response to receiving the second read instruction.

Assignees

Inventors

Classifications

  • Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title

  • Single storage device · CPC title

  • Details of free space management performed by the file system (saving storage space on storage systems G06F3/0608; management of blocks in storage devices G06F3/064) · CPC title

  • Virtual · CPC title

  • G06F3/067Primary

    Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11435935B2 cover?
A method for cleaning an object storage having a plurality of segments is provided. Each segment includes an identifier through which the segment is accessed. The method identifies a first segment in the plurality of segments. The first segment includes a first identifier and a first size. The method determines that a utilization ratio for the first segment is below a threshold. As a result, th…
Who is the assignee on this patent?
Vmware Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/067. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Sep 06 2022 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).