Thining databases for garbage collection

US2020133548A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2020133548-A1
Application number: US-201916730066-A
Country: US
Kind code: A1
Filing date: Dec 30, 2019
Priority date: Jan 31, 2017
Publication date: Apr 30, 2020
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An implementation of the disclosure provides a system comprising a storage array comprising a storage controller coupled to the storage array. The storage controller comprising a processing device to remap a plurality of deduplication references in a deduplication map to point to an earlier occurrence of duplicate data of a data block for the deduplication map. The processing device further to update an entry of the deduplication map associated with the plurality of deduplication references with a record indicating that the entry is no longer referenced and trim the entry from the deduplication map that is associated with the record.
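
The three steps named in the abstract (remap, record, trim) can be sketched in a few lines of Python. This is only an illustrative model, not the implementation disclosed in the patent: the `Entry` class, the `refs` dictionary keyed by logical address, the `remap_mark_trim` function, and the choice of "lowest sequence identifier" as the earliest occurrence are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    seq: int                    # hypothetical sequence identifier of the entry
    digest: str                 # hash of the data block the entry describes
    unreferenced: bool = False  # stands in for the "no longer referenced" record


def remap_mark_trim(entries: Dict[int, Entry],
                    refs: Dict[str, int],
                    digest: str) -> Optional[int]:
    """Collapse all duplicates of `digest` onto the earliest entry.

    `entries` maps sequence identifier -> Entry; `refs` maps a logical
    address -> the sequence identifier of the entry it points to.
    Returns the surviving sequence identifier, or None if nothing matched.
    """
    duplicates = sorted(seq for seq, e in entries.items() if e.digest == digest)
    if not duplicates:
        return None
    # Assumption: the lowest sequence identifier is the earliest occurrence.
    earliest, later = duplicates[0], set(duplicates[1:])

    # Step 1: remap references that point at a later duplicate.
    for address, seq in list(refs.items()):
        if seq in later:
            refs[address] = earliest

    # Step 2: record that the later duplicates are no longer referenced.
    for seq in later:
        entries[seq].unreferenced = True

    # Step 3: trim every entry that carries the record.
    for seq in [s for s, e in entries.items() if e.unreferenced]:
        del entries[seq]

    return earliest
```

In this sketch the record and the trim happen in the same call for brevity; a real system could just as well mark entries in one pass and trim them later during garbage collection.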

First claim

Opening claim text (preview).

What is claimed is:
1. A system comprising: a storage array; and a storage controller coupled to the storage array, the storage controller comprising a processing device, the processing device to: remap a plurality of deduplication references in a deduplication map to point to an earlier occurrence of duplicate data of a data block for the deduplication map; update an entry of the deduplication map associated with the plurality of deduplication references with a record indicating that the entry is no longer referenced; trim the entry from the deduplication map that is associated with the record.
2. The system of claim 1, wherein the processing device further to: select the earlier occurrence of duplicate data associated with the plurality of deduplication references in the deduplication map, wherein the deduplication references represent duplicate data of the data block in the deduplication map.
3. The system of claim 2, wherein the processing device further to determine that a location of the entry corresponds to an original entry of the data block in the deduplication map based on a range of sequence identifiers associated with updates to the deduplication map.
4. The system of claim 3, wherein to determine the location, the processing device to: identify one of the sequence identifiers for the entry; and determine whether the sequence identifier is within the range of sequence identifiers.
5. The system of claim 3, wherein responsive to determining that the location of the entry corresponds to the original entry of the data block in the deduplication map, the processing device to issue an instruction to perform trim of the entry from the deduplication map that is associated with the record.
6. The system of claim 2, wherein the to select, the processing device to determine whether a hash value in the deduplication map for the deduplication references and the data block correspond.
7. The system of claim 1, wherein the storage array stores the data block.
8. A method comprising: remapping a plurality of deduplication references in a deduplication map to point to an earlier occurrence of duplicate data of a data block for the deduplication map; updating, by a processing device, an entry of the deduplication map associated with the plurality of deduplication references with a record indicating that the entry is no longer referenced; trimming the entry from the deduplication map that is associated with the record.
9. The method of claim 8, further comprising: selecting the earlier occurrence of duplicate data associated with the plurality of deduplication references in the deduplication map, wherein the deduplication references represent duplicate data of the data block in the deduplication map.
10. The method of claim 9, further comprising determining that a location of the entry corresponds to an original entry of the data block in the deduplication map based on a range of sequence identifiers associated with updates to the deduplication map.
11. The method of claim 10, wherein determining the location comprises: identifying one of the sequence identifiers for the entry; and determining whether the sequence identifier is within the range of sequence identifiers.
12. The method of claim 10, wherein responsive to determining that the location of the entry corresponds to the original entry of the data block in the deduplication map, issuing an instruction to performing trimming of the entry from the deduplication map that is associated with the record.
13. The method of claim 9, wherein the selecting comprises determining whether a hash value in the deduplication map for the deduplication references and the data block correspond.
14. A non-transitory computer readable storage medium storing instructions, which when executed, cause a processing device to: remap, by the processing device, a plurality of deduplication references in a deduplication map to point to an earlier occurrence of duplicate data of a data block for the deduplication map; update an entry of the deduplication map associated with the plurality of deduplication references with a record indicating that the entry is no longer referenced; trim the entry from the deduplication map that is associated with the record.
15. The non-transitory computer readable storage medium of claim 14, wherein the processing device is further to select the earlier occurrence of duplicate data associated with the plurality of deduplication references in the deduplication map, wherein the deduplication references represent duplicate data of the data block in the deduplication map.
16. The non-transitory computer readable storage medium of claim 15, wherein the processing device further to determine that a location of the entry corresponds to an original entry of the data block in the deduplication map based on a range of sequence identifiers associated with updates to the deduplication map.
17. The non-transitory computer readable storage medium of claim 16, wherein to determine the location, the processing device to: identify one of the sequence identifiers for the entry; and determine whether the sequence identifier is within the range of sequence identifiers.
18. The non-transitory computer readable storage medium of claim 16, wherein responsive to determining that the location of the entry corresponds to the original entry of the data block in the deduplication map, the processing device to issue an instruction to perform trim of the entry from the deduplication map that is associated with the record.
19. The non-transitory computer readable storage medium of claim 15, wherein the to select, the processing device to determine whether a hash value in the deduplication map for the deduplication references and the data block correspond.
20. The non-transitory computer readable storage medium of claim 15, wherein the data block is stored in a storage array.
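
The dependent claims add two checks: a sequence-identifier range test that gates the trim (claims 3 to 5) and a hash comparison used when selecting the earlier occurrence (claim 6). The following Python sketch shows one way such checks could look. It is an assumption-laden illustration rather than the claimed method: the `SeqRange` type, the inclusive bounds, the function names, and the use of SHA-256 are all hypothetical choices, since the claims name neither a hash function nor how the range is delimited.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SeqRange:
    """Hypothetical inclusive range of sequence identifiers."""
    first: int
    last: int

    def contains(self, seq: int) -> bool:
        return self.first <= seq <= self.last


def is_original_entry(entry_seq: int, original_range: SeqRange) -> bool:
    # Claim 4, as modeled here: identify the entry's sequence identifier
    # and test whether it falls within the range.
    return original_range.contains(entry_seq)


def should_trim(entry_seq: int, has_record: bool, original_range: SeqRange) -> bool:
    # Claim 5, as modeled here: issue the trim only for an entry that both
    # carries the "no longer referenced" record and passes the range test.
    return has_record and is_original_entry(entry_seq, original_range)


def hashes_correspond(stored_digest: str, block: bytes) -> bool:
    # Claim 6, as modeled here: compare the digest kept in the map with a
    # digest of the data block. SHA-256 is an assumption for illustration.
    return stored_digest == hashlib.sha256(block).hexdigest()
```

A caller would typically run `hashes_correspond` while choosing the earlier occurrence and `should_trim` just before removing an entry.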

Assignees

Pure Storage Inc

Inventors

Classifications

  • G06F3/0641 (Primary)

    De-duplication techniques · CPC title

  • Cleaning, compaction, garbage collection, erase control · CPC title

  • Single storage device · CPC title

  • Saving storage space on storage systems · CPC title

  • Garbage collection, i.e. reclamation of unreferenced memory · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020133548A1 cover?
An implementation of the disclosure provides a system comprising a storage array comprising a storage controller coupled to the storage array. The storage controller comprising a processing device to remap a plurality of deduplication references in a deduplication map to point to an earlier occurrence of duplicate data of a data block for the deduplication map. The processing device further to …
Who is the assignee on this patent?
Pure Storage Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/0641. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 30, 2020 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).