Infinite versioning by automatic coalescing
US-11068450-B2 · Jul 20, 2021 · US
US11977529B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11977529-B2 |
| Application number | US-202117367881-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 6, 2021 |
| Priority date | Jan 26, 2015 |
| Publication date | May 7, 2024 |
| Grant date | May 7, 2024 |
Official abstract text for this publication.
Embodiments disclosed herein provide systems, methods, and computer readable media for infinite versioning by automatic coalescing. In a particular embodiment, a method provides determining an age range for a plurality of data versions stored in a secondary data repository and identifying first data versions of the plurality of data versions that are within the age range. The method further provides determining a compaction ratio for the first data versions and compacting the first data versions based on the compaction ratio.
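
The steps in the abstract (pick an age range, find the versions inside it, then compact them by a ratio) can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the `Version` record, the function names, the half-open age interval, and the choice to leave a short trailing group uncompacted are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Version:
    """Hypothetical record for one stored data version."""
    version_id: int
    created: datetime


def select_by_age(versions, now, min_age, max_age):
    """Return versions whose age lies in [min_age, max_age), oldest first."""
    picked = [v for v in versions if min_age <= now - v.created < max_age]
    return sorted(picked, key=lambda v: v.created)


def group_for_compaction(versions, ratio):
    """Split versions into sequential groups of `ratio` versions each.

    Each group would later be consolidated into a single version.
    Assumption of this sketch: a trailing group smaller than `ratio`
    is left out and stays uncompacted for now.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    full = len(versions) - len(versions) % ratio
    return [versions[i:i + ratio] for i in range(0, full, ratio)]


# Fourteen daily versions, aged 1 to 14 days.
now = datetime(2021, 7, 6)
versions = [Version(i, now - timedelta(days=i)) for i in range(1, 15)]

# Versions between one and two weeks old, compacted three to one.
aged = select_by_age(versions, now, timedelta(days=7), timedelta(days=14))
groups = group_for_compaction(aged, ratio=3)
group_ids = [[v.version_id for v in g] for g in groups]
```

Here seven versions fall in the age range, so a ratio of 3 yields two full groups and leaves the newest version in the range untouched until a later pass.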
Opening claim text (preview).
The invention claimed is:

1. A method, comprising:
iteratively consolidating, according to a first predetermined schedule, at least a subset of a plurality of data versions stored in a secondary data repository;
wherein a data version represents a current state of a data set at a time such data version was created;
wherein at least a base data version of the plurality of data versions includes an entirety of the data set at the time such base data version was created;
wherein one or more incremental data versions in the plurality of data versions subsequent to the base data version store a set of changes made to the data set since the first base data version or a preceding incremental data version;
wherein iteratively consolidating at least the subset of the plurality of data versions comprises:
identifying, for a first age range and the data set, a first set of data versions from among the plurality of data versions that meet one or more criteria for creating an additional data version that represents a state of the data set as of a time corresponding to a most recent data version included in the first set of data versions, the additional data version being a first consolidated data version; and
using the identified first set of data versions to create the first consolidated data version, wherein the first set of data versions comprises the most recent data version and one or more earlier data versions that are less recent than the most recent data version, and wherein the first consolidated data version:
includes each change from the most recent data version included in the first set of data versions,
includes each change from the one or more earlier data versions within the first set of data versions that was not overwritten by the most recent data version and also was not overwritten by any other data version within the first set of data versions that is less recent than the most recent data version, and
excludes each change from the one or more earlier data versions within the first set of data versions that was overwritten by the most recent data version or was overwritten by any other data version within the first set of data versions that is less recent than the most recent data version; and
storing the first consolidated data version in the secondary data repository.

2. The method of claim 1 further comprising: obtaining, during a data backup operation, the plurality of data versions from a primary data repository.

3. The method of claim 2 wherein the plurality of data versions are obtained from the primary data repository periodically based on a second predetermined schedule.

4. The method of claim 1 wherein at least one consolidated data version persists in the secondary data repository after iteratively consolidating at least the subset of the plurality of data versions.

5. The method of claim 1, wherein the data set is stored in a primary data repository separate from the secondary data repository.

6. The method of claim 1 wherein the first age range is a user-configurable parameter.

7. The method of claim 1 wherein the first consolidated data version is available to restore the data set up to a restore point associated with the first consolidated data version.

8. The method of claim 1, wherein using the identified first set of data versions to create the first consolidated data version comprises: generating a list of logical block address ranges associated with the first set of data versions, each logical block address mapping to a physical location in the secondary data repository.

9. The method of claim 1, further comprising: determining a first compaction ratio defining a number of data versions to be consolidated into the first consolidated data version, wherein using the identified first set of data versions to create the first consolidated data version comprises using the number of data versions to create the first consolidated data version.

10. The method of claim 9, wherein using the identified first set of data versions to create the first consolidated data version comprises: grouping the first set of data versions into one or more sequential data version groups each including the number of data versions of the first set of data versions; and for each sequential data version group: removing overwritten changes of the number of data versions.

11. A non-transitory computer readable storage medium having instructions stored thereon, the instructions, when executed by a system, directing the system to perform operations, the operations including at least:
iteratively consolidating, according to a first predetermined schedule, at least a subset of a plurality of data versions stored in a secondary data repository;
wherein a data version represents a current state of a data set at a time such data version was created;
wherein at least a base data version of the plurality of data versions includes an entirety of the data set at the time such base data version was created;
wherein one or more incremental data versions in the plurality of data versions subsequent to the base data version store a set of changes made to the data set since the first base data version or a preceding incremental data version;
wherein iteratively consolidating at least the subset of the plurality of data versions comprises:
identifying, for a first age range and the data set, a first set of data versions from among the plurality of data versions that meet one or more criteria for creating an additional data version that represents a state of the data set as of a time corresponding to a most recent data version included in the first set of data versions, the additional data version being a first consolidated data version; and
using the identified first set of data versions to create the first consolidated data version, wherein the first set of data versions comprises the most recent data version and one or more earlier data versions that are less recent than the most recent data version, and wherein the first consolidated data version:
includes each change from the most recent data version included in the first set of data versions,
includes each change from the one or more earlier data versions within the first set of data versions that was not overwritten by the most recent data version or by any other data version within the first set of data versions that is less recent than the most recent data version, and
excludes each change from the one or more earlier data versions within the first set of data versions that was overwritten by the most recent data version or by any other data version within the first set of data versions that is less recent than the most recent data version; and
storing the first consolidated data version in the secondary data repository.

12. The medium of claim 11, wherein the operations further comprise: obtaining, during a data backup operation, the plurality of data versions from a primary data repository.

13. The medium of claim 12, wherein the instructions, when executed by the data consolidation system, direct the data consolidation system to obtain the plurality of data versions from the primary data repository periodically based on a second predetermined schedule.

14. The medium of claim 11, wherein the instructions, when executed by the data consolidation system, cause at least one first consolidated version to persist in the secondary data repository after at least the subset of the plurality of data versions are iteratively consolidated.

15. The medium of claim 11, wherein the instructions, when executed by the data consolidation system, dir
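
The consolidation rule recited in claims 1 and 11 (keep every change from the most recent version, keep an earlier change only if no later version in the set overwrote it) can be illustrated with a small sketch. It assumes, purely for illustration, that each incremental version is a dictionary mapping a block address to the data written there; the function names `coalesce` and `restore` are hypothetical and the claims do not prescribe this representation.

```python
def coalesce(versions):
    """Merge incremental versions, ordered oldest to newest, into one change set.

    Each version is a dict {block_address: data}. A later write to the same
    address replaces an earlier one, so only the newest write per address
    within the set survives.
    """
    merged = {}
    for changes in versions:      # oldest first
        merged.update(changes)    # newer entries replace older ones
    return merged


def restore(state, changes):
    """Apply a change set to a copy of `state` and return the result."""
    out = dict(state)
    out.update(changes)
    return out


# Base version holds the whole data set; v1..v3 hold only changed blocks.
base = {0: "a0", 1: "b0", 2: "c0"}
v1 = {0: "a1", 1: "b1"}
v2 = {1: "b2"}          # overwrites v1's change to block 1
v3 = {2: "c3"}          # most recent version in the set

consolidated = coalesce([v1, v2, v3])

# Replaying every version and applying the consolidated version
# should reach the same state as of v3.
state_replay = restore(restore(restore(base, v1), v2), v3)
state_consolidated = restore(base, consolidated)
```

In this example the consolidated version drops `v1`'s write to block 1 because `v2` overwrote it, yet restoring from the base plus the consolidated version reproduces the state at the time of `v3`. The intermediate restore points for `v1` and `v2` are no longer individually recoverable, which is the trade-off consolidation makes.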
Managing data history or versioning (querying versioned data G06F16/2474; querying temporal data G06F16/2477) · CPC title