Storage tiering for deduplicated storage environments

US11809379B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11809379-B2
Application number: US-201916688019-A
Country: US
Kind code: B2
Filing date: Nov 19, 2019
Priority date: Nov 19, 2019
Publication date: Nov 7, 2023
Grant date: Nov 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure include a computer-implemented method, a computer program product, and a system for storing data based, at least partially, on the deduplication rates of a storage system within a storage environment. The computer-implemented method includes receiving data to be stored in a storage environment, computing a hash for the received data, and querying storage deduplication agents for statuses of storage systems within the storage environment. The computer-implemented method also includes receiving deduplication rates and hash tables relating to the storage systems from the storage deduplication agents. The computer-implemented method further includes analyzing stored data stored on the storage systems using the deduplication rates and the hash tables and comparing the stored data to the received data. The computer-implemented method further includes allocating the received data to a storage system within the storage environment based on the comparison of the stored data to the received data.
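The placement flow in the abstract (hash the incoming data, compare it against each storage system's hash table, then allocate) can be sketched as follows. This is a minimal illustration, not the patented method: the `StorageSystem` class, the `chunk_hashes` and `allocate` helpers, the fixed 4096-byte chunking, the SHA-256 hash, and the tie-break on deduplication rate are all assumptions introduced here.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StorageSystem:
    """Hypothetical view of what a deduplication agent might report."""
    name: str
    dedup_rate: float                      # assumed: fraction of writes that were duplicates
    hash_table: set[str] = field(default_factory=set)


def chunk_hashes(data: bytes, chunk_size: int = 4096) -> list[str]:
    """Hash fixed-size chunks of the data (chunking scheme is an assumption)."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def allocate(data: bytes, systems: list[StorageSystem],
             chunk_size: int = 4096) -> StorageSystem:
    """Pick the system whose hash table overlaps most with the new data.

    Ties are broken by the reported deduplication rate (an assumed policy).
    Raises ValueError if `systems` is empty.
    """
    hashes = chunk_hashes(data, chunk_size)

    def score(system: StorageSystem) -> tuple[int, float]:
        overlap = sum(h in system.hash_table for h in hashes)
        return (overlap, system.dedup_rate)

    target = max(systems, key=score)
    target.hash_table.update(hashes)       # record the newly stored chunks
    return target
```

Data whose chunks already exist on one system is routed there, so those chunks can be deduplicated instead of being stored a second time elsewhere in the environment.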

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving data to be stored in a storage environment; computing a hash relating to the received data; querying storage deduplication agents for storage analytics from storage systems within the storage environment; receiving the storage analytics from the storage deduplication agents, wherein the storage analytics include hash tables for the storage systems containing deduplicated data; comparing the hash to the hash tables to detect similarities; determining a performance requirement related to deduplication rates for a storage tier for the received data, wherein the determining comprises determining when the storage tier is a flash-based system, a disk storage system, and a tape library system; analyzing performance capabilities relating to the storage systems for the storage tier by relating, to the storage tier, a media type selected from the group consisting of flash-based, disk, and tape; monitoring deduplication performance analytics relating to the deduplication rates of the storage systems for the storage tier; allocating the received data to the storage system based on the similarities relating to the hash and the hash table, the performance requirement for the received data, and the performance capabilities relating to the storage systems to optimize the available storage capacity within the storage systems; detecting a performance degradation relating to storage capacity from at least one storage system based on the storage analytics received; calculating unduplicated data within the at least one storage system; determining a different storage system within the storage environment to migrate the unduplicated data onto; migrating the unduplicated data to the different storage system; analyzing the hash tables to determine similarities between stored data across the storage environment; detecting similar data based on the hash tables stored in separate storage systems; and migrating the similar data to one storage system.

2. The computer-implemented method of claim 1, wherein the storage analytics further include deduplication rates for the storage systems containing the deduplicated data.

3. The computer-implemented method of claim 1, wherein determining the different storage system comprises: determining performance capabilities for the storage systems; and selecting the different storage system based on the performance capabilities and the unduplicated data.

4. The computer-implemented method of claim 1, further comprising calculating deduplication rates for each of the storage systems based on the hash tables received from the storage deduplication agents.

5. A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform a method comprising: receiving data to be stored in a storage environment; computing a hash relating to the received data; querying storage deduplication agents for storage analytics from storage systems within the storage environment; receiving the storage analytics from the storage deduplication agents, wherein the storage analytics include hash tables for the storage systems containing deduplicated data; comparing the hash to the hash tables to detect similarities; determining a performance requirement related to deduplication rates for a storage tier for the received data, wherein the determining comprises determining when the storage tier is a flash-based system, a disk storage system, and a tape library system; analyzing performance capabilities relating to the storage systems for the storage tier by relating, to the storage tier, a media type selected from the group consisting of flash-based, disk, and tape; monitoring deduplication performance analytics relating to the deduplication rates of the storage systems for the storage tier; allocating the received data to the storage system based on the similarities relating to the hash and the hash table, the performance requirement for the received data, and the performance capabilities relating to the storage systems to optimize the available storage capacity within the storage systems; detecting a performance degradation relating to storage capacity from at least one storage system based on the storage analytics received; calculating unduplicated data within the at least one storage system; determining a different storage system within the storage environment to migrate the unduplicated data onto; migrating the unduplicated data to the different storage system; analyzing the hash tables to determine similarities between stored data across the storage environment; detecting similar data based on the hash tables stored in separate storage systems; and migrating the similar data to one storage system.

6. The computer program product of claim 5, wherein the storage analytics further include deduplication rates for the storage systems containing the deduplicated data.

7. The computer program product of claim 5, wherein determining the different storage system comprises: determining performance capabilities for the storage systems; and selecting the different storage system based on the performance capabilities and the unduplicated data.

8. The computer program product of claim 5, further comprising calculating deduplication rates for each of the storage systems based on the hash tables received from the storage deduplication agents.

9. A system comprising: a memory storing program instructions; and a processor, wherein the processor is configured to execute the program instructions to perform a method comprising: receiving data to be stored in a storage environment; computing a hash relating to the received data; querying storage deduplication agents for storage analytics from storage systems within the storage environment; receiving the storage analytics from the storage deduplication agents, wherein the storage analytics include hash tables for the storage systems containing deduplicated data; comparing the hash to the hash tables to detect similarities; determining a performance requirement for the received data; determining a performance requirement related to deduplication rates for a storage tier for the received data, wherein the determining comprises determining when the storage tier is a flash-based system, a disk storage system, and a tape library system; analyzing performance capabilities relating to the storage systems for the storage tier by relating, to the storage tier, a media type selected from the group consisting of flash-based, disk, and tape; monitoring deduplication performance analytics relating to the deduplication rates of the storage systems for the storage tier; allocating the received data to the storage system based on the similarities relating to the hash and the hash table, the performance requirement for the received data, and the performance capabilities relating to the storage systems to optimize the available storage capacity within the storage systems; detecting a performance degradation relating to storage capacity from at least one storage system based on the storage analytics received; calculating unduplicated data within the at least one storage system; determining a different storage system within the storage environment to migrate the unduplicated data onto; migrating the unduplicated data to the different storage system; analyzing the hash tables to determine similarities between stored data across the storage environment; detecting similar data based on the hash tables stored in separate storage systems; and migrating the similar d…
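Several claimed steps (calculating deduplication rates from hash tables, calculating unduplicated data, and detecting similar data held on separate storage systems) can be illustrated with a short sketch. The claims do not specify a hash table layout or a rate formula, so everything below is an assumption: each table is modeled as a mapping from chunk hash to reference count, the rate is taken as the share of references that did not need a new stored copy, and the function names are hypothetical.

```python
def dedup_rate(hash_table: dict[str, int]) -> float:
    """Assumed formula: share of references served by an existing copy."""
    total_refs = sum(hash_table.values())
    if total_refs == 0:
        return 0.0
    return 1 - len(hash_table) / total_refs


def unduplicated(hash_table: dict[str, int]) -> set[str]:
    """Hashes referenced exactly once, i.e. data that gained nothing from dedup."""
    return {h for h, refs in hash_table.items() if refs == 1}


def shared_hashes(tables: dict[str, dict[str, int]]) -> dict[str, list[str]]:
    """Map each hash found on more than one system to the systems holding it."""
    owners: dict[str, list[str]] = {}
    for system, table in tables.items():
        for h in table:
            owners.setdefault(h, []).append(system)
    return {h: names for h, names in owners.items() if len(names) > 1}


# Hypothetical analytics for two systems.
tables = {
    "flash-a": {"h1": 3, "h2": 1},
    "disk-b": {"h2": 1, "h3": 1},
}
print(dedup_rate(tables["flash-a"]))    # 0.5
print(unduplicated(tables["flash-a"]))  # {'h2'}
print(shared_hashes(tables))            # {'h2': ['flash-a', 'disk-b']}
```

In this model, the hashes returned by `unduplicated` are candidates for migration off a system under capacity pressure, and the entries returned by `shared_hashes` are candidates for consolidation onto one system.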

Assignees

Inventors

Classifications

  • based on file chunks · CPC title

  • Saving storage space on storage systems · CPC title

  • De-duplication techniques · CPC title

  • Lifecycle management · CPC title

  • Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11809379B2 cover?
Embodiments of the present disclosure include a computer-implemented method, a computer program product, and a system for storing data based, at least partially, on the deduplication rates of a storage system within a storage environment. The computer-implemented method includes receiving data to be stored in a storage environment, computing a hash for the received data, and querying storage de…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/1752. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 7, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).