Archive control techniques for database systems
US-2023161672-A1 · May 25, 2023 · US
US12367175B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12367175-B2 |
| Application number | US-202217951241-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 23, 2022 |
| Priority date | Sep 23, 2022 |
| Publication date | Jul 22, 2025 |
| Grant date | Jul 22, 2025 |
Official abstract text for this publication.
An archival job is assessed to calculate loss of data reduction efficiency due to block-level data deduplication. Archivable data, or individual storage objects or data structures therein, are moved to archival storage contingent upon satisfaction of a predetermined condition related to data reduction efficiency. Archivable data, or individual storage objects or data structures therein, that fail to satisfy the predetermined condition are maintained in primary storage. The loss of data reduction efficiency and the predetermined condition may be expressed as a percentage of maximum possible data reduction that would result in the absence of data deduplication.
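The percentage test described in the abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function names, the block-count inputs, and the 50% default threshold are all assumptions made for the example. It treats the maximum possible reduction as the total number of blocks in the archivable data (what archiving would free if nothing were deduplicated) and the achievable reduction as the blocks that are not shared with non-archivable data.

```python
def reduction_percentage(freeable_blocks: int, total_blocks: int) -> float:
    """Achievable primary-storage reduction as a percentage of the maximum.

    total_blocks    -- blocks archiving would free with no deduplication (assumed input)
    freeable_blocks -- blocks actually freed, i.e. not shared with retained data
    """
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    if not 0 <= freeable_blocks <= total_blocks:
        raise ValueError("freeable_blocks must be between 0 and total_blocks")
    return 100.0 * freeable_blocks / total_blocks


def efficiency_loss_percentage(freeable_blocks: int, total_blocks: int) -> float:
    """Loss of data reduction efficiency attributable to deduplication."""
    return 100.0 - reduction_percentage(freeable_blocks, total_blocks)


def should_archive(freeable_blocks: int, total_blocks: int,
                   min_reduction_pct: float = 50.0) -> bool:
    """Archive only if the achievable reduction meets the threshold.

    min_reduction_pct is a hypothetical stand-in for the 'predetermined condition'.
    """
    return reduction_percentage(freeable_blocks, total_blocks) >= min_reduction_pct
```

For example, a candidate with 100 blocks of which only 25 are unshared would yield a 25% reduction (a 75% efficiency loss) and, under the assumed 50% threshold, would stay on primary storage.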
Opening claim text (preview).
What is claimed is:
1. A method comprising: in a storage system in which storage objects on primary storage archive to secondary storage only as single units and a first storage object contains archivable data, and a second storage object contains non-archivable data, wherein the archivable data and the non-archivable data share at least some duplicated data that has been consolidated into a single stored copy on the primary storage through deduplication: identifying that the archivable data and the non-archivable data have been deduplicated to reference the single stored copy of the duplicated data; determining, while data is static, whether archive of the first storage object is justified by: calculating potential primary storage data reduction that would result from retaining the single stored copy of the duplicated data on primary storage and moving only a non-duplicated portion of the archivable data from the primary storage to archival storage; and comparing the calculated potential primary storage data reduction that would result from retaining the single stored copy of the duplicated data on primary storage and moving only the non-duplicated portion of the archivable data from the primary storage to the archival storage with a predetermined condition; retaining the archivable data on the primary storage without moving any of the archivable data to the archival storage in response to determining that the calculated potential primary storage data reduction fails to satisfy the predetermined condition due to the amount of duplication between the archivable data and the non-archivable data; and copying the archivable data from the primary storage to the archival storage as the single unit and retaining the single stored copy of the duplicated data on primary storage in response to determining that the calculated potential primary storage data reduction satisfies the predetermined condition due to the amount of duplication between the archivable data and the non-archivable data.
2. The method of claim 1 further comprising identifying a set of data structures or the storage objects as archival candidates.
3. The method of claim 2 further comprising identifying inodes of the archival candidates and identifying blocks referenced by the inodes.
4. The method of claim 3 further comprising, for each referenced block, identifying an associated reference count of associations between the referenced block and the data structures or the storage objects on the primary storage.
5. The method of claim 4 further comprising incrementing a count of saved blocks in response to the reference count being equal to 1.
6. The method of claim 5 further comprising determining whether an entry for the referenced block has already been created in a hash table in response to the reference count being greater than 1.
7. The method of claim 6 further comprising creating an entry for the referenced block in the hash table in response to determining that no entry for the referenced block exists.
8. The method of claim 6 further comprising decrementing the reference count for the referenced block in the hash table in response to determining that an entry for the referenced block exists.
9. An apparatus comprising: primary storage comprising non-volatile media on which is stored storage objects that archive to secondary storage only as single units; a computer configured to manage access to data stored on the primary storage; a first storage object that contains archivable data and a second storage object that contains non-archivable data stored on the primary storage, and wherein the archivable data and the non-archivable data share at least some duplicated data that has been consolidated into a single stored copy on the primary storage through deduplication; and an archival data assessor comprising computer-executable instructions on a non-transitory computer-readable medium that determines, while data is static, whether archive of the first storage object is justified, the archival data assessor configured to: identify that the archivable data and the non-archivable data have been deduplicated to reference the single stored copy of the duplicated data; calculate potential primary storage data reduction that would result from retention of the single stored copy of the duplicated data on primary storage and movement of only a non-duplicated portion of the archivable data from the primary storage to archival storage; compare the calculated potential primary storage data reduction that would result from retention of the single stored copy of the duplicated data on primary storage and movement of only the non-duplicated portion of the archivable data from the primary storage to the archival storage with a predetermined condition; retain the archivable data on the primary storage without moving any of the archivable data to the archival storage in response to a determination that the calculated potential primary storage data reduction fails to satisfy the predetermined condition due to the amount of duplication of between the archivable data and the non-archivable data; and copy the archivable data from the primary storage to the archival storage as the single unit and retain the single stored copy of the duplicated data on primary storage in response to determining that the calculated potential primary storage data reduction satisfies the predetermined condition due to the amount of duplication between the archivable data and the non-archivable data.
10. The apparatus of claim 9 further comprising the archival data assessor configured to identify a set of data structures or the storage objects as archival candidates.
11. The apparatus of claim 10 further comprising the archival data assessor configured to identify inodes of the archival candidates and identify blocks referenced by the inodes.
12. The apparatus of claim 11 further comprising the archival data assessor configured to, for each referenced block, identify an associated reference count of associations between the referenced block and the data structures or the storage objects on the primary storage.
13. The apparatus of claim 12 further comprising the archival data assessor configured to increment a count of saved blocks in response to the reference count being equal to 1.
14. The apparatus of claim 13 further comprising the archival data assessor configured to determine whether an entry for the referenced block has already been created in a hash table in response to the reference count being greater than 1.
15. The apparatus of claim 14 further comprising the archival data assessor configured to create an entry for the referenced block in the hash table in response to determining that no entry for the referenced block exists.
16. The apparatus of claim 14 further comprising the archival data assessor configured to decrement the reference count for the referenced block in the hash table in response to determining that an entry for the referenced block exists.
17. A non-transitory computer-readable storage medium storing instructions that when executed by a computer cause the computer to perform a method comprising: in a storage system in which storage objects on primary storage archive to secondary storage only as single units and a first storage object contains archivable data and a second storage object contains non-archivable data, wherein the archivable data and the non-archivable data share at least some duplicated data that has been consolidated into a single stored copy on the primary storage through deduplication: ide
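The block-counting steps recited in dependent claims 2 through 8 can be illustrated with a short sketch. It is a hedged reading of the claim language, not the patented implementation. The function name, the dictionary inputs, and two details the claims leave open are assumptions: that a newly created hash table entry is initialized to the block's reference count minus the reference just visited, and that a block counts as saved once its entry reaches zero (meaning every reference to it came from an archival candidate).

```python
from typing import Dict, List


def count_saved_blocks(candidate_blocks: Dict[str, List[int]],
                       refcounts: Dict[int, int]) -> int:
    """Count blocks that archiving all candidates would free on primary storage.

    candidate_blocks -- hypothetical map: candidate name -> block ids its inodes reference
    refcounts        -- hypothetical map: block id -> reference count on primary storage
    """
    saved = 0
    # Hash table of shared blocks: block id -> references not yet seen among candidates.
    remaining: Dict[int, int] = {}

    for blocks in candidate_blocks.values():
        for block in blocks:
            refs = refcounts[block]
            if refs == 1:
                # Only this candidate references the block, so archiving frees it.
                saved += 1
            elif block not in remaining:
                # First sighting of a shared block: create the entry (assumed
                # initial value: all references except the one just visited).
                remaining[block] = refs - 1
            else:
                # Entry exists: decrement the tracked reference count.
                remaining[block] -= 1
                if remaining[block] == 0:
                    # Assumption: all references belong to candidates, so the
                    # block is freed once every candidate is archived.
                    saved += 1
    return saved
```

With candidates `{"A": [1, 2, 3], "B": [3, 4]}` and reference counts `{1: 1, 2: 2, 3: 2, 4: 3}`, block 1 is saved outright, block 3 is saved because both of its references come from candidates, and blocks 2 and 4 remain pinned by non-archivable data, so the function returns 2.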
characterised by the use of retention policies (retention policies for HSM systems G06F16/185) · CPC title
for performance assessment · CPC title
Details of archiving (lifecycle management in storage systems G06F3/0649; point-in-time backing up or restoration of persistent data G06F11/1446) · CPC title