What technology area does this patent fall under?

Primary CPC classification G06F16/1748. Mapped technology areas include Physics.

When was this patent published?

Publication date Mar 23, 2021 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Application aware export to object storage of low-reference data in deduplication repositories

US10956382B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10956382-B2
Application number	US-201615082251-A
Country	US
Kind code	B2
Filing date	Mar 28, 2016
Priority date	Mar 28, 2016
Publication date	Mar 23, 2021
Grant date	Mar 23, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various embodiments for managing data in a data deduplication repository in a computing storage environment, by a processor device, are provided. In one embodiment, a method comprises issuing an application programming interface (API) command to scan metadata of a subset of entities in a local deduplication repository for identifying candidate data to offload from the local deduplication repository to an object storage, offloading the candidate data to the object storage, and returning a status result using the API command.
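The flow in the abstract (scan metadata for a subset of entities, offload the candidates, return a status result) can be sketched in a few lines. This is only an illustrative sketch, not the patented implementation: the function name `handle_offload_command`, the dictionary-based repository and object store, the `is_candidate` predicate, and the shape of the status result are all assumptions made for the example.

```python
from typing import Callable, Dict, List


def handle_offload_command(
    local_repo: Dict[str, dict],
    object_store: Dict[str, bytes],
    entity_ids: List[str],
    is_candidate: Callable[[dict], bool],
) -> dict:
    """Hypothetical handler for one offload API command.

    local_repo maps an entity id to its metadata and data (assumed layout);
    object_store stands in for the off-premise object storage.
    """
    # Step 1: scan metadata, but only for the requested subset of entities.
    candidates = [
        eid
        for eid in entity_ids
        if eid in local_repo
        and local_repo[eid]["data"] is not None
        and is_candidate(local_repo[eid])
    ]

    # Step 2: offload each candidate and free the local copy.
    for eid in candidates:
        record = local_repo[eid]
        object_store[eid] = record["data"]
        record["data"] = None

    # Step 3: return a status result to the caller (assumed format).
    return {"status": "ok", "scanned": len(entity_ids), "offloaded": candidates}
```

A caller would supply the selection rule as the predicate, for example `lambda meta: meta["ref_count"] < 3` for a low-reference test; the `ref_count` field name is likewise hypothetical.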

First claim

Opening claim text (preview).

What is claimed is:

1. A method for managing data in a data deduplication repository in a computing storage environment, by a processor device, comprising: issuing an application programming interface (API) command, by an existing backup management application executing on a host to a cloud deduplicating gateway, to scan metadata by the cloud deduplicating gateway of a subset of entities in a local deduplication repository stored on one or more storage devices associated with the host for identifying candidate data to offload from the local deduplication repository to an object storage, offloading the candidate data to the object storage, and returning a status result using the API command to the existing backup management application; wherein the local deduplication repository is stored on-premise and the object storage is stored off-premise such that the offloading is performed to lower a consumption of on-premise storage by migrating the candidate data identified during the scan of the metadata of the subset of entities to the object storage being stored off-premise; and according to an output of the scanning, identifying the candidate data as repository data developed on a candidate list which includes repository data having a reference count number below a predetermined reference count threshold, the reference count number associated with a deduplication ratio; wherein the scanning comprises iterating through each of the subset of entities in the local deduplication repository, identifying whether the given reference count number of each of the subset of entities is below the predetermined reference count threshold, sorting the identified candidate data onto the candidate list according to the deduplication ratio and age information, and returning the sorted candidate list in an API response to the API command; and wherein repository data explicitly marked as excluded by a user is excluded from the candidate list notwithstanding whether the repository data explicitly marked as excluded has the reference count number below the predetermined reference count threshold.

2. The method of claim 1, further including excluding repository data from the candidate list based on a predetermined age threshold associated with an age of the repository data.

3. The method of claim 1, further including, when offloading the candidate data using a virtual tape library (VTL) system interface, performing: moving a cartridge containing identified candidate data from a VTL drive slot to an import/export (I/E) slot; migrating the identified candidate data from the cartridge to the object storage; removing the cartridge from the I/E slot; and communicating the status result of the migration using the API command.

4. The method of claim 1, further including, when offloading the candidate data using a file system interface, exporting a mount point that presents content of the object storage using at least one of a common internet file system (CIFS), a server message block (SMB), and a network file system (NFS) protocol.

5. The method of claim 1, further including offloading the candidate data when a repository capacity is greater than a predetermined repository capacity threshold.

6. The method of claim 1, further including maintaining a mapping of the offloaded candidate data between the local deduplication repository and the object storage by updating the local deduplication repository metadata.

7. A system for managing data in a data deduplication repository in a computing storage environment, the system comprising: at least one processor device, wherein the at least one processor device: issues an application programming interface (API) command, by an existing backup management application executing on a host to a cloud deduplicating gateway, to scan metadata by the cloud deduplicating gateway of a subset of entities in a local deduplication repository stored on one or more storage devices associated with the host for identifying candidate data to offload from the local deduplication repository to an object storage, offloading the candidate data to the object storage, and returning a status result using the API command to the existing backup management application; wherein the local deduplication repository is stored on-premise and the object storage is stored off-premise such that the offloading is performed to lower a consumption of on-premise storage by migrating the candidate data identified during the scan of the metadata of the subset of entities to the object storage being stored off-premise; and according to an output of the scanning, identifies the candidate data as repository data developed on a candidate list which includes repository data having a reference count number below a predetermined reference count threshold, the reference count number associated with an overall system deduplication ratio; wherein the scanning comprises iterating through each of the subset of entities in the local deduplication repository, identifying whether the given reference count number of each of the subset of entities is below the predetermined reference count threshold, sorting the identified candidate data onto the candidate list according to the deduplication ratio and age information, and returning the sorted candidate list in an API response to the API command; and wherein repository data explicitly marked as excluded by a user is excluded from the candidate list notwithstanding whether the repository data explicitly marked as excluded has the reference count number below the predetermined reference count threshold.

8. The system of claim 7, wherein the at least one processor device excludes repository data from the candidate list based on a predetermined age threshold associated with an age of the repository data.

9. The system of claim 7, wherein the at least one processor device, when offloading the candidate data using a virtual tape library (VTL) system interface, performs: moving a cartridge containing identified candidate data from a VTL drive slot to an import/export (I/E) slot; migrating the identified candidate data from the cartridge to the object storage; removing the cartridge from the I/E slot; and communicating the status result of the migration using the API command.

10. The system of claim 7, wherein the processor device, when offloading the candidate data using a file system interface, exports a mount point that presents content of the object storage using at least one of a common internet file system (CIFS), a server message block (SMB), and a network file system (NFS) protocol.

11. The system of claim 7, wherein the at least one processor device offloads the candidate data when a repository capacity is greater than a predetermined repository capacity threshold.

12. The system of claim 7, wherein the at least one processor device maintains a mapping of the offloaded candidate data between the local deduplication repository and the object storage by updating the local deduplication repository metadata.

13. A computer program product for managing data in a data deduplication repository in a computing storage environment, by a processor device, the computer program product embodied on a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising: an executable portion that issues an application programming interface (API) command, by an existing backup management application executing on a host to a cloud deduplicating gateway, to scan metadata by the cloud deduplicating gateway of a subset of entities in a local deduplication repository s
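The candidate-list scan recited in claims 1 and 2 (iterate over the entities, keep those whose reference count is below the threshold, skip user-excluded data, apply an age threshold, then sort by deduplication ratio and age) might look roughly like the sketch below. The claims do not state sort directions or which side of the age threshold is excluded, so this example assumes lowest deduplication ratio first, oldest first among ties, and exclusion of data younger than the threshold. The `Entity` record, its field names, and `build_candidate_list` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Entity:
    """Hypothetical metadata record for one repository entity."""
    name: str
    ref_count: int          # number of references to this data
    dedup_ratio: float      # deduplication ratio tied to the entity (assumed field)
    age_days: int           # age of the data, in days (assumed unit)
    excluded_by_user: bool = False


def build_candidate_list(
    entities: Iterable[Entity],
    ref_count_threshold: int,
    min_age_days: int = 0,
) -> List[Entity]:
    """Return offload candidates, sorted under the assumptions stated above."""
    candidates = []
    for entity in entities:
        # User exclusions win even when the reference count is low (claim 1).
        if entity.excluded_by_user:
            continue
        # Only data below the reference count threshold qualifies (claim 1).
        if entity.ref_count >= ref_count_threshold:
            continue
        # Assumed reading of claim 2: data younger than the threshold is skipped.
        if entity.age_days < min_age_days:
            continue
        candidates.append(entity)

    # Assumed ordering: lowest deduplication ratio first, then oldest first.
    candidates.sort(key=lambda e: (e.dedup_ratio, -e.age_days))
    return candidates
```

The returned list corresponds to the sorted candidate list that the claim says is sent back in the API response; how that response is serialized is not specified in the text shown here.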

Assignees

Inventors

Classifications

G06F16/1748 · Primary
De-duplication implemented within the file system, e.g. based on file segments (de-duplication techniques in storage systems for the management of data blocks G06F3/0641) · CPC title
G06F11/1453
using de-duplication of the data · CPC title
G06F16/215 · Primary
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

Patent family

Related publications grouped by family.

View patent family 59898757

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10956382B2 cover?: Various embodiments for managing data in a data deduplication repository in a computing storage environment, by a processor device, are provided. In one embodiment, a method comprises issuing an application programming interface (API) command to scan metadata of a subset of entities in a local deduplication repository for identifying candidate data to offload from the local deduplication reposi…
Who is the assignee on this patent?: IBM

Related patents

Removal of reference information for storage blocks in a deduplication system

Integration of deduplicating backup server with cloud storage

Hybrid data backup in a networked computing environment
