Managing data deduplication in storage systems based on storage space characteristics

US9529545B1 · US · B1

Patent metadata
Publication number: US-9529545-B1
Application number: US-201314141258-A
Country: US
Kind code: B1
Filing date: Dec 26, 2013
Priority date: Dec 26, 2013
Publication date: Dec 27, 2016
Grant date: Dec 27, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method is used in managing data deduplication in storage systems based on storage space characteristics. Characteristics of first and second storage tiers are evaluated. A first data object resides on the first storage tier and a second data object resides on the second storage tier. The first and second data objects are selected for applying a deduplicating technique. A data storage system includes the first and second storage tiers configured such that performance characteristics associated with the first storage tier is different from the second storage tier. Based on the evaluation, the deduplicating technique is applied to the first and second data objects.
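
The evaluation step described in the abstract can be pictured with a minimal sketch. The `StorageTier` class, its `performance_rank` field, and the tier names are hypothetical: the abstract says only that the two tiers differ in performance characteristics, not how those characteristics are measured or compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTier:
    """Hypothetical tier model; a larger rank means higher performance."""
    name: str
    performance_rank: int


def higher_performing(first: StorageTier, second: StorageTier) -> StorageTier:
    """Return whichever tier has the higher (assumed) performance rank.

    The abstract assumes the two tiers differ, so a tie is treated as an error.
    """
    if first.performance_rank == second.performance_rank:
        raise ValueError("tiers must differ in performance characteristics")
    if first.performance_rank > second.performance_rank:
        return first
    return second


# Assumed example tiers, not taken from the patent text.
flash = StorageTier("flash", performance_rank=3)
sata = StorageTier("sata", performance_rank=1)
print(higher_performing(sata, flash).name)
```

A single integer rank is the simplest stand-in; a real system would likely weigh several characteristics, which the abstract leaves open.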

First claim

Opening claim text (preview).

What is claimed is:

1. A method for use in managing data deduplication in storage systems based on storage space characteristics, the method comprising: receiving a request to deduplicate a data object; identifying a candidate data object for deduplicating the data object; evaluating characteristics of storage tiers on which the data object and the candidate data object reside, wherein the data object resides on a first storage tier and the candidate data object resides on a second storage tier, wherein a data storage system includes the first storage tier and the second storage tier configured such that performance characteristics associated with the first storage tier are different from performance characteristics associated with the second storage tier; and based on the evaluating, selecting a master deduplicated copy from a group consisting of the data object and the candidate data object, wherein the data object is selected as the master deduplicated copy upon determining that performance characteristics associated with the first storage tier are higher than performance characteristics associated with the second storage tier, wherein the candidate data object is selected as the master deduplicated copy upon determining that performance characteristics associated with the second storage tier are higher than performance characteristics associated with the first storage tier; and based on the selecting, applying a deduplicating technique to the data object and the candidate data object, wherein the data object is deduplicated to the candidate data object by updating mapping information of the data object to point to the candidate data object upon selection of the candidate data object as the master deduplicated copy, wherein the candidate data object is deduplicated to the data object by updating mapping information of the candidate data object to point to the data object upon selection of the data object as the master deduplicated copy.

2. The method of claim 1, wherein evaluating characteristics of the first and second storage tiers further comprises: comparing performance characteristics of the first and second storage tiers.

3. The method of claim 1, wherein the data object and the candidate data object are selected from the group consisting of a deduplication domain, a storage extent, a Logical Unit Number (LUN), a file, a slice and a data block, wherein the data block is a fixed size chunk of a physical disk storage.

4. The method of claim 3, wherein a slice is a logical representation of a subset of physical disk storage.

5. The method of claim 1, wherein a storage tier includes a disk drive system comprising a plurality of Redundant Array of Inexpensive Disks (RAID) systems, each RAID system of the plurality of RAID systems having a first disk drive and a second disk drive.

6. The method of claim 3, wherein a deduplication domain comprises a set of storage extents, wherein each storage extent of the set of storage extents comprises a set of LUNs, wherein each LUN of the set of LUNs is a logical representation of a subset of physical disk storage.

7. The method of claim 1, wherein applying a deduplicating technique further comprises: based on the evaluating, updating mapping information of the data object and the candidate data object.

8. A system for use in managing data deduplication in storage systems based on storage space characteristics, the system comprising: first logic receiving a request to deduplicate a data object; second logic identifying a candidate data object for deduplicating the data object; third logic evaluating characteristics of storage tiers on which the data object and the candidate data object reside, wherein the data object resides on a first storage tier and the candidate data object resides on a second storage tier, wherein a data storage system includes the first storage tier and the second storage tier configured such that performance characteristics associated with the first storage tier are different from performance characteristics associated with the second storage tier; fourth logic selecting, based on the evaluating, a master deduplicated copy from a group consisting of the data object and the candidate data object, wherein the data object is selected as the master deduplicated copy upon determining that performance characteristics associated with the first storage tier are higher than performance characteristics associated with the second storage tier, wherein the candidate data object is selected as the master deduplicated copy upon determining that performance characteristics associated with the second storage tier are higher than performance characteristics associated with the first storage tier; and fifth logic applying, based on the selecting, a deduplicating technique to the data object and the candidate data object, wherein the data object is deduplicated to the candidate data object by updating mapping information of the data object to point to the candidate data object upon selection of the candidate data object as the master deduplicated copy, wherein the candidate data object is deduplicated to the data object by updating mapping information of the candidate data object to point to the data object upon selection of the data object as the master deduplicated copy.

9. The system of claim 8, wherein evaluating characteristics of the first and second storage tiers further comprises: sixth logic comparing performance characteristics of the first and second storage tiers.

10. The system of claim 8, wherein the data object and the candidate data object are selected from the group consisting of a deduplication domain, a storage extent, a Logical Unit Number (LUN), a file, a slice and a data block, wherein the data block is a fixed size chunk of a physical disk storage.

11. The system of claim 10, wherein a slice is a logical representation of a subset of physical disk storage.

12. The system of claim 8, wherein a storage tier includes a disk drive system comprising a plurality of Redundant Array of Inexpensive Disks (RAID) systems, each RAID system of the plurality of RAID systems having a first disk drive and a second disk drive.

13. The system of claim 10, wherein a deduplication domain comprises a set of storage extents, wherein each storage extent of the set of storage extents comprises a set of LUNs, wherein each LUN of the set of LUNs is a logical representation of a subset of physical disk storage.

14. The system of claim 8, wherein applying a deduplicating technique further comprises: sixth logic updating, based on the evaluating, mapping information of the data object and the candidate data object.
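
The selection and mapping steps recited in claim 1 can be illustrated with a short sketch. Everything named here (`Tier`, `DataObject`, `performance_rank`, the dictionary used as mapping information, and the object identifiers) is a hypothetical stand-in; the claims do not specify data structures or how performance is quantified. The sketch also assumes the candidate has already been identified as holding the same data as the object.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Tier:
    name: str
    performance_rank: int  # assumed scalar; larger means higher performance


@dataclass(frozen=True)
class DataObject:
    object_id: str
    tier: Tier


def select_master(obj: DataObject, candidate: DataObject) -> DataObject:
    """Choose the copy that resides on the higher-performing tier."""
    if obj.tier.performance_rank > candidate.tier.performance_rank:
        return obj
    if candidate.tier.performance_rank > obj.tier.performance_rank:
        return candidate
    # Claim 1 covers tiers whose performance differs; equal ranks are out of scope here.
    raise ValueError("tiers must differ in performance characteristics")


def deduplicate(obj: DataObject, candidate: DataObject,
                mapping: Dict[str, str]) -> DataObject:
    """Point the non-master copy's mapping entry at the master copy."""
    master = select_master(obj, candidate)
    duplicate = candidate if master is obj else obj
    mapping[duplicate.object_id] = master.object_id
    return master


# Assumed example: the requested object sits on a slower tier than the candidate.
flash = Tier("flash", performance_rank=3)
sata = Tier("sata", performance_rank=1)
obj = DataObject("obj-1", sata)
candidate = DataObject("obj-2", flash)
mapping = {"obj-1": "obj-1", "obj-2": "obj-2"}  # each object initially maps to itself

master = deduplicate(obj, candidate, mapping)
print(master.object_id, mapping)
```

In this example the candidate on the faster tier becomes the master, so reads of either object are served from that tier once the mapping is updated.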

Assignees

EMC Corp, EMC IP Holding Co LLC

Inventors

Classifications

  • G06F3/0641 · Primary

    De-duplication techniques · CPC title

  • Disk arrays, e.g. RAID, JBOD · CPC title

  • G06F3/0685 · Primary

    Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays · CPC title

  • Saving storage space on storage systems · CPC title

  • in relation to data integrity, e.g. data losses, bit errors · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9529545B1 cover?
A method is used in managing data deduplication in storage systems based on storage space characteristics. Characteristics of first and second storage tiers are evaluated. A first data object resides on the first storage tier and a second data object resides on the second storage tier. The first and second data objects are selected for applying a deduplicating technique. A data storage system i…
Who is the assignee on this patent?
EMC Corp, EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F3/0641. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 27, 2016 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).