Determining capacity in a global deduplication system

US11403233B2 · US · B2

Patent metadata
Publication number: US-11403233-B2
Application number: US-201916653519-A
Country: US
Kind code: B2
Filing date: Oct 15, 2019
Priority date: Oct 15, 2019
Publication date: Aug 2, 2022
Grant date: Aug 2, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An aspect of determining per volume exclusive capacity in a deduplication system includes setting a percentage of a population of pages for selection. For each of the pages, an aspect includes selecting a page in the population, providing a data segment facilitating multiple references of the segment by at least one storage entity, maintaining counts corresponding with each segment in the page, and determining exclusive ownership of the page based on the counts and a key value of one of a plurality of storage entities.
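
The page-selection step can be illustrated with a short sketch. The claims describe selecting a page whose hash value has a number of least significant bits equal to zero; the code below is one possible reading of that idea, not the patented implementation. The names `page_hash`, `is_sampled`, and `sample_pages`, the use of SHA-256, and the integer page identifiers are all assumptions made for illustration.

```python
import hashlib


def page_hash(page_id: int) -> int:
    """Hypothetical page hash: first 8 bytes of SHA-256 over the page id."""
    digest = hashlib.sha256(page_id.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big")


def is_sampled(page_id: int, zero_bits: int) -> bool:
    """True when the lowest `zero_bits` bits of the page hash are all zero."""
    mask = (1 << zero_bits) - 1
    return page_hash(page_id) & mask == 0


def sample_pages(page_ids, zero_bits: int) -> list:
    """Return the pages selected for counting at the given sampling depth."""
    return [p for p in page_ids if is_sampled(p, zero_bits)]


if __name__ == "__main__":
    population = range(20000)
    selected = sample_pages(population, zero_bits=4)
    print(f"selected {len(selected)} of {len(population)} pages")
```

With a well-mixed hash, requiring k zero bits selects roughly a fraction 2^-k of the population, so the configured percentage would map to a choice of k under this reading. Because selection depends only on the hash, the same pages are chosen on every pass without storing a separate sample list.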

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for determining per volume exclusive capacity in a deduplication system, the method comprising: (a) setting a percentage of a population of pages for selection; (b) selecting a page in the population based on the set percentage; (c) providing a data segment in the system that facilitates multiple references of the data segment by at least one storage entity; (d) maintaining a plurality of counts in connection with the data segment in the selected page, the plurality of counts including a first count representing a sum of first values that identify respective storage entities associated with each reference of the data segment; (e) determining exclusive ownership of the selected page based on the plurality of counts and a key value of one of a plurality of storage entities; and (f) updating metadata for the selected page with results of the plurality of counts, the metadata including available capacity associated with the selected page, wherein updating the metadata for the selected page includes storing the results in a virtual block storage address space corresponding with the selected page, wherein steps (b)-(f) are repeated until the set percentage is reached.

2. The method of claim 1, wherein selecting a page in the population based on the percentage includes selecting a page with a hash value in which a number of the hash's least significant bits equal zero.

3. The method of claim 1, wherein the plurality of counts further includes: a second count representing a sum of second values derived from key values of the respective storage entities associated with each reference of the data segment, and a third count representing a number of references of the data segment by the respective storage entities.

4. The method of claim 1, wherein the virtual block storage address space references a virtual logical block, the virtual logical block indicating an indirection page between a logical address and a physical address of blocks corresponding to the selected page.

5. The method of claim 1, further comprising detecting a total amount of capacity for the selected pages in the population.

6. The method of claim 1, further comprising performing a statistical estimation calculation for capacity of the global deduplication system using pages in the population not selected.

7. A system for determining per volume exclusive capacity in a deduplication system, the system comprising: a memory comprising computer-executable instructions; and a processor executing the computer-executable instructions, the computer-executable instructions when executed by the processor cause the processor to perform operations comprising: (a) setting a percentage of a population of pages for selection; (b) selecting a page in the population based on the set percentage; (c) providing a data segment in the system that facilitates multiple references of the data segment by at least one storage entity; (d) maintaining a plurality of counts in connection with the data segment in the selected page, the plurality of counts including a first count representing a sum of first values that identify respective storage entities associated with each reference of the data segment; (e) determining exclusive ownership of the selected page based on the plurality of counts and a key value of one of a plurality of storage entities; and (f) updating metadata for the selected page with results of the plurality of counts, the metadata including available capacity associated with the selected page, wherein updating the metadata for the selected page includes storing the results in a virtual block storage address space corresponding with the selected page, wherein steps (b)-(f) are repeated until the set percentage is reached.

8. The system of claim 7, wherein selecting a page in the population based on the percentage includes selecting a page with a hash value in which a number of the hash's least significant bits equal zero.

9. The system of claim 7, wherein the plurality of counts further includes: a second count representing a sum of second values derived from key values of the respective storage entities associated with each reference of the data segment, and a third count representing a number of references of the data segment by the respective storage entities.

10. The system of claim 7, wherein the virtual block storage address space references a virtual logical block, the virtual logical block indicating an indirection page between a logical address and a physical address of blocks corresponding to the selected page.

11. The system of claim 7, wherein the operations further include detecting a total amount of capacity for the selected pages in the population.

12. A computer program product for determining per volume exclusive capacity in a deduplication system, the computer program product embodied on a non-transitory computer readable medium, and the computer program product including instructions that, when executed by a computer, causes the computer to perform operations, the operations including: (a) setting a percentage of a population of pages for selection; (b) selecting a page in the population based on the set percentage; (c) providing a data segment in the system that facilitates multiple references of the data segment by at least one storage entity; (d) maintaining a plurality of counts in connection with the data segment in the selected page, the plurality of counts including a first count representing a sum of first values that identify respective storage entities associated with each reference of the data segment; (e) determining exclusive ownership of the selected page based on the plurality of counts and a key value of one of a plurality of storage entities; and (f) updating metadata for the selected page with results of the plurality of counts, the metadata including available capacity associated with the selected page, wherein updating the metadata for the selected page includes storing the results in a virtual block storage address space corresponding with the selected page, wherein steps (b)-(f) are repeated until the set percentage is reached.

13. The computer program product of claim 12, wherein selecting a page in the population based on the percentage includes selecting a page with a hash value in which a number of the hash's least significant bits equal zero.

14. The computer program product of claim 12, wherein the plurality of counts include: a first count representing a sum of first values that identify respective storage entities associated with each reference of the data segment, a second count representing a sum of second values derived from key values of the respective storage entities associated with each reference of the data segment, and a third count representing a number of references of the data segment by the respective storage entities.

15. The computer program product of claim 12, wherein the virtual block storage address space references a virtual logical block, the virtual logical block indicating an indirection page between a logical address and a physical address of blocks corresponding to the selected page.

16. The computer program product of claim 12, wherein the operations further include detecting a total amount of capacity for the selected pages in the population.
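
The three counts named in the claims (a sum of entity-identifying values, a sum of key-derived values, and a reference count) suggest a compact ownership test. The sketch below is one plausible interpretation, not the method as implemented by the patent holder: it assumes a segment is exclusive to an entity when both sums equal the reference count multiplied by that entity's identifier and key value. The names `SegmentCounts`, `key_of`, `is_exclusive`, and `page_is_exclusive`, and the SHA-256 key derivation, are hypothetical.

```python
import hashlib
from dataclasses import dataclass


def key_of(entity_id: int) -> int:
    """Hypothetical key value for a storage entity (e.g. a volume)."""
    digest = hashlib.sha256(entity_id.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big")


@dataclass
class SegmentCounts:
    id_sum: int = 0     # first count: sum of entity identifiers
    key_sum: int = 0    # second count: sum of key-derived values
    ref_count: int = 0  # third count: number of references

    def add_reference(self, entity_id: int) -> None:
        self.id_sum += entity_id
        self.key_sum += key_of(entity_id)
        self.ref_count += 1

    def remove_reference(self, entity_id: int) -> None:
        self.id_sum -= entity_id
        self.key_sum -= key_of(entity_id)
        self.ref_count -= 1


def is_exclusive(counts: SegmentCounts, entity_id: int) -> bool:
    """Assumed test: every reference to the segment comes from entity_id."""
    return (
        counts.ref_count > 0
        and counts.id_sum == counts.ref_count * entity_id
        and counts.key_sum == counts.ref_count * key_of(entity_id)
    )


def page_is_exclusive(segments: list, entity_id: int) -> bool:
    """Assumed page-level rule: all segments in the page are exclusive."""
    return bool(segments) and all(is_exclusive(s, entity_id) for s in segments)
```

The identifier sum alone can be fooled: references from entities 6 and 8 sum to 14, the same as two references from entity 7. Under this sketch the key-derived sum is what makes such a coincidence very unlikely, which may be why the claims pair the counts with a key value.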

Assignees

Inventors

Classifications

  • Space efficiency improvement · CPC title

  • G06F12/126 · Primary

    with special data handling, e.g. priority of data or instructions, handling errors or pinning · CPC title

  • involving hashing techniques, e.g. inverted page tables · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11403233B2 cover?
An aspect of determining per volume exclusive capacity in a deduplication system includes setting a percentage of a population of pages for selection. For each of the pages, an aspect includes selecting a page in the population, providing a data segment facilitating multiple references of the segment by at least one storage entity, maintaining counts corresponding with each segment in the page,…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F12/126. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 2, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).