Thining databases for garbage collection

US11262929B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11262929-B2
Application number: US-201916730066-A
Country: US
Kind code: B2
Filing date: Dec 30, 2019
Priority date: Jan 31, 2017
Publication date: Mar 1, 2022
Grant date: Mar 1, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An implementation of the disclosure provides a system comprising a storage array comprising a storage controller coupled to the storage array. The storage controller comprising a processing device to remap a plurality of deduplication references in a deduplication map to point to an earlier occurrence of duplicate data of a data block for the deduplication map. The processing device further to update an entry of the deduplication map associated with the plurality of deduplication references with a record indicating that the entry is no longer referenced and trim the entry from the deduplication map that is associated with the record.
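
The three steps in the abstract (remap references, record that an entry is unreferenced, trim it) can be illustrated with a minimal sketch. This is not the patented implementation: the `Entry` class, the `references` dictionary, the function names, and the assumption that the lowest sequence number marks the earliest occurrence are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    """Hypothetical deduplication-map entry for one stored occurrence of a block."""
    block_hash: str             # hash of the block contents
    unreferenced: bool = False  # stands in for the "no longer referenced" record


def remap_to_earliest(entries: Dict[int, Entry],
                      references: Dict[str, int],
                      block_hash: str) -> Optional[int]:
    """Point every reference to a duplicate of block_hash at the earliest entry."""
    duplicates = sorted(seq for seq, e in entries.items()
                        if e.block_hash == block_hash)
    if not duplicates:
        return None
    earliest = duplicates[0]  # assumption: lowest sequence id is the earliest occurrence
    for name, seq in references.items():
        if seq in duplicates:
            references[name] = earliest
    return earliest


def mark_unreferenced(entries: Dict[int, Entry],
                      references: Dict[str, int]) -> None:
    """Flag every entry that no reference points to any more."""
    live = set(references.values())
    for seq, entry in entries.items():
        if seq not in live:
            entry.unreferenced = True


def trim(entries: Dict[int, Entry]) -> List[int]:
    """Remove flagged entries from the map and return their sequence ids."""
    trimmed = [seq for seq, e in entries.items() if e.unreferenced]
    for seq in trimmed:
        del entries[seq]
    return trimmed


# Entries keyed by a hypothetical sequence id; entries 1 and 5 hold the same data.
entries = {1: Entry("h1"), 5: Entry("h1"), 9: Entry("h2")}
references = {"vol0/a": 5, "vol0/b": 5, "vol0/c": 9}

remap_to_earliest(entries, references, "h1")
mark_unreferenced(entries, references)
print(trim(entries), references)
```

In this toy run both references to entry 5 are moved to entry 1, so entry 5 is flagged and trimmed while entries 1 and 9 remain.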

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a storage array; and a storage controller coupled to the storage array, the storage controller comprising a processing device, the processing device to: remap a plurality of deduplication references in a deduplication map to point to an earlier occurrence of duplicate data of a data block for the deduplication map; update an entry of the deduplication map associated with the plurality of deduplication references with a record indicating that the entry is no longer referenced; trim the entry from the deduplication map that is associated with the record, wherein the processing device further to determine that a location of the entry corresponds to an original entry of the data block in the deduplication map based on a range of sequence identifiers associated with updates to the deduplication map.

2. The system of claim 1, wherein the processing device further to: select the earlier occurrence of duplicate data associated with the plurality of deduplication references in the deduplication map, wherein the deduplication references represent duplicate data of the data block in the deduplication map.

3. The system of claim 2, wherein the selecting comprises the processing device determining whether a hash value in the deduplication map for the deduplication references and the data block correspond to each other.

4. The system of claim 1, wherein to determine the location, the processing device to: identify one of the sequence identifiers for the entry; and determine whether the sequence identifier is within the range of sequence identifiers.

5. The system of claim 1, wherein responsive to determining that the location of the entry corresponds to the original entry of the data block in the deduplication map, the processing device to issue an instruction to perform trim of the entry from the deduplication map that is associated with the record.

6. The system of claim 1, wherein the storage array stores the data block.

7. A method comprising: remapping a plurality of deduplication references in a deduplication map to point to an earlier occurrence of duplicate data of a data block for the deduplication map; updating, by a processing device, an entry of the deduplication map associated with the plurality of deduplication references with a record indicating that the entry is no longer referenced; trimming the entry from the deduplication map that is associated with the record; and determining that a location of the entry corresponds to an original entry of the data block in the deduplication map based on a range of sequence identifiers associated with updates to the deduplication map.

8. The method of claim 7, further comprising: selecting the earlier occurrence of duplicate data associated with the plurality of deduplication references in the deduplication map, wherein the deduplication references represent duplicate data of the data block in the deduplication map.

9. The method of claim 8, wherein the selecting comprises determining whether a hash value in the deduplication map for the deduplication references and the data block correspond to each other.

10. The method of claim 7, wherein determining the location comprises: identifying one of the sequence identifiers for the entry; and determining whether the sequence identifier is within the range of sequence identifiers.

11. The method of claim 7, wherein responsive to determining that the location of the entry corresponds to the original entry of the data block in the deduplication map, issuing an instruction to performing trimming of the entry from the deduplication map that is associated with the record.

12. A non-transitory computer readable storage medium storing instructions, which when executed, cause a processing device to: remap, by the processing device, a plurality of deduplication references in a deduplication map to point to an earlier occurrence of duplicate data of a data block for the deduplication map; update an entry of the deduplication map associated with the plurality of deduplication references with a record indicating that the entry is no longer referenced; trim the entry from the deduplication map that is associated with the record; and determine that a location of the entry corresponds to an original entry of the data block in the deduplication map based on a range of sequence identifiers associated with updates to the deduplication map.

13. The non-transitory computer readable storage medium of claim 12, wherein the processing device is further to select the earlier occurrence of duplicate data associated with the plurality of deduplication references in the deduplication map, wherein the deduplication references represent duplicate data of the data block in the deduplication map.

14. The non-transitory computer readable storage medium of claim 13, wherein the selecting comprises the processing device determining whether a hash value in the deduplication map for the deduplication references and the data block correspond to each other.

15. The non-transitory computer readable storage medium of claim 13, wherein the data block is stored in a storage array.

16. The non-transitory computer readable storage medium of claim 12, wherein to determine the location, the processing device to: identify one of the sequence identifiers for the entry; and determine whether the sequence identifier is within the range of sequence identifiers.

17. The non-transitory computer readable storage medium of claim 12, wherein responsive to determining that the location of the entry corresponds to the original entry of the data block in the deduplication map, the processing device to issue an instruction to perform trim of the entry from the deduplication map that is associated with the record.
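
The dependent claims add two checks: a hash comparison when selecting the earlier occurrence (claims 3, 9, 14) and a test of whether an entry's sequence identifier falls within a range of sequence identifiers before a trim is issued (claims 4, 5, 10, 11, 16, 17). The sketch below shows one way those checks might be expressed. The names `SeqRange`, `hashes_correspond`, and `should_trim` are hypothetical, and treating the range as inclusive at both ends is an assumption, since the claims do not specify the bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeqRange:
    """Hypothetical range of sequence identifiers covering a set of map updates."""
    first: int
    last: int

    def contains(self, seq: int) -> bool:
        # Assumption: both bounds are inclusive; the claims do not say.
        return self.first <= seq <= self.last


def hashes_correspond(reference_hash: str, block_hash: str) -> bool:
    """Check that a reference and a data block carry the same hash value."""
    return reference_hash == block_hash


def should_trim(entry_seq: int, unreferenced: bool, seq_range: SeqRange) -> bool:
    """Issue a trim only for an unreferenced entry whose id lies in the range."""
    return unreferenced and seq_range.contains(entry_seq)


updates = SeqRange(first=1, last=10)
print(should_trim(5, True, updates))    # unreferenced entry inside the range
print(should_trim(11, True, updates))   # unreferenced entry outside the range
print(should_trim(5, False, updates))   # entry that is still referenced
```

Keeping the range check separate from the "no longer referenced" record mirrors the claim structure, where the location determination is a distinct step that gates the trim instruction.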

Assignees

Pure Storage Inc

Inventors

Classifications

  • Saving storage space on storage systems · CPC title

  • G06F3/0641 · Primary

    De-duplication techniques · CPC title

  • Cleaning, compaction, garbage collection, erase control · CPC title

  • Single storage device · CPC title

  • Improving I/O performance · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11262929B2 cover?
An implementation of the disclosure provides a system comprising a storage array comprising a storage controller coupled to the storage array. The storage controller comprising a processing device to remap a plurality of deduplication references in a deduplication map to point to an earlier occurrence of duplicate data of a data block for the deduplication map. The processing device further to …
Who is the assignee on this patent?
Pure Storage Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/0641. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 1, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).