Sub-cluster recovery using a partition group index
US-10852998-B2 · Dec 1, 2020 · US
US11340838B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11340838-B2 |
| Application number | US-202117336535-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 2, 2021 |
| Priority date | Feb 25, 2016 |
| Publication date | May 24, 2022 |
| Grant date | May 24, 2022 |
Official abstract text for this publication.
A method is disclosed for instantiating a second cluster based on a first cluster. For at least one node of a second plurality of nodes, per-node data is generated based on mappings between a plurality of partition groups and a first plurality of nodes, the first plurality of nodes corresponding to the first cluster. Data items included in the plurality of partition groups are identified based on those mappings, each partition group corresponding to a node of the first plurality of nodes and comprising a subset of the data items stored in that node. The data items included in the plurality of partition groups are then loaded onto the second plurality of nodes, the second plurality of nodes corresponding to the second cluster.
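The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`instantiate_second_cluster`, `items_for_node`), the dictionary layout of the partition group index, and the round-robin placement of partition groups onto the second cluster are all assumptions made for illustration only.

```python
def items_for_node(pg_to_node, items_by_pg, node):
    """Identify the data items stored on `node` by walking the
    partition-group-to-node mapping (hypothetical index layout)."""
    return [
        item
        for pg in sorted(pg_to_node)
        if pg_to_node[pg] == node
        for item in items_by_pg.get(pg, [])
    ]


def instantiate_second_cluster(pg_to_first_node, items_by_pg, second_nodes):
    """Build a second cluster from the first cluster's partition group index.

    Returns three dictionaries:
      new_mapping: partition group -> second-cluster node
      per_node:    second-cluster node -> [(partition group, source node)]
      loaded:      second-cluster node -> data items loaded onto it
    """
    if not second_nodes:
        raise ValueError("the second cluster needs at least one node")

    new_mapping = {}
    per_node = {node: [] for node in second_nodes}
    loaded = {node: [] for node in second_nodes}

    # Assumption: partition groups are placed round-robin. The source
    # does not say how groups are assigned to second-cluster nodes.
    for i, pg in enumerate(sorted(pg_to_first_node)):
        target = second_nodes[i % len(second_nodes)]
        new_mapping[pg] = target
        # Per-node data is derived from the first cluster's mapping.
        per_node[target].append((pg, pg_to_first_node[pg]))
        # Load the items that belong to this partition group.
        loaded[target].extend(items_by_pg.get(pg, []))

    return new_mapping, per_node, loaded


if __name__ == "__main__":
    first_mapping = {"pg0": "a1", "pg1": "a1", "pg2": "a2", "pg3": "a3"}
    items = {"pg0": ["k1", "k2"], "pg1": ["k3"], "pg2": ["k4"], "pg3": ["k5", "k6"]}
    mapping, per_node, loaded = instantiate_second_cluster(
        first_mapping, items, ["b1", "b2"]
    )
    print(mapping)
    print(loaded)
```

In this sketch the second cluster has fewer nodes than the first. Because placement is driven by partition groups rather than by whole nodes, the two clusters do not need to be the same size.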
Opening claim text (preview).
The invention claimed is:

1. A method comprising: instantiating a second cluster based on a first cluster, the instantiating the second cluster comprising: for at least one node of a second plurality of nodes, generating per node data based on mappings between a plurality of partition groups and a first plurality of nodes, the first plurality of nodes corresponding to the first cluster; identifying data items included in the plurality of partition groups based on the mappings between the plurality of partition groups and the first plurality of nodes, each partition group corresponding to a node of the first plurality of nodes and comprising a subset of data items stored in the node; and loading the data items included in the plurality of partition groups onto the second plurality of nodes, the second plurality of nodes corresponding to the second cluster.
2. The method of claim 1, further comprising recovering each node of the first plurality of nodes by identifying the data items stored in the node based on the mappings between the plurality of partition groups and the first plurality of nodes.
3. The method of claim 2, further comprising recovering each node of the first plurality of nodes by restoring the identified data items onto the node.
4. The method of claim 1, wherein the instantiating the second cluster comprises generating a partition group to node mapping for the second cluster.
5. The method of claim 1, further comprising: while scanning the data items, identifying duplicate data items in the cluster; deduplicating the duplicate data items; repackaging each of the duplicate data items into respective deduplicated data units; and storing the deduplicated data units to a secondary data repository.
6. The method of claim 5, further comprising: determining a degree of duplicates of the data items in the cluster; and comparing the degree of duplicates to a predetermined level of consistency, wherein the deduplication is performed responsive to determining the degree of duplicates is greater than the predetermined level of consistency.
7. The method of claim 5, wherein the partition groups include deduplicated data items.
8. The method of claim 5, wherein storing the deduplicated data units comprises: storing a data version of the data items; and compiling the deduplicated data units into the data version of the data items.
9. The method of claim 1, wherein the data items are stored in a NoSQL data store.
10. A system comprising: at least one processor and executable instructions accessible on a computer-readable medium that, when executed, cause the at least one processor to perform operations comprising: instantiating a second cluster based on a first cluster, the instantiating the second cluster comprising: for at least one node of a second plurality of nodes, generating per node data based on mappings between a plurality of partition groups and a first plurality of nodes, the first plurality of nodes corresponding to the first cluster; identifying data items included in the plurality of partition groups based on the mappings between the plurality of partition groups and the first plurality of nodes, each partition group corresponding to a node of the first plurality of nodes and comprising a subset of data items stored in the node; and loading the data items included in the plurality of partition groups onto the second plurality of nodes, the second plurality of nodes corresponding to the second cluster.
11. The system of claim 10, further comprising recovering each node of the first plurality of nodes by identifying the data items stored in the node based on the mappings between the plurality of partition groups and the first plurality of nodes.
12. The system of claim 11, further comprising recovering each node of the first plurality of nodes by restoring the identified data items onto the node.
13. The system of claim 10, wherein the instantiating the second cluster comprises generating a partition group to node mapping for the second cluster.
14. The system of claim 10, further comprising: while scanning the data items, identifying duplicate data items in the cluster; deduplicating the duplicate data items; repackaging each of the duplicate data items into respective deduplicated data units; and storing the deduplicated data units to a secondary data repository.
15. The system of claim 14, further comprising: determining a degree of duplicates of the data items in the cluster; and comparing the degree of duplicates to a predetermined level of consistency, wherein the deduplication is performed responsive to determining the degree of duplicates is greater than the predetermined level of consistency.
16. The system of claim 14, wherein the partition groups include deduplicated data items.
17. The system of claim 14, wherein storing the deduplicated data units comprises: storing a data version of the data items; and compiling the deduplicated data units into the data version of the data items.
18. The system of claim 10, wherein the data items are stored in a NoSQL data store.
19. A non-transitory machine-readable medium storing a set of instructions that, when executed by a processor, causes a machine to perform operations comprising: instantiating a second cluster based on a first cluster, the instantiating the second cluster comprising: for at least one node of a second plurality of nodes, generating per node data based on mappings between a plurality of partition groups and a first plurality of nodes, the first plurality of nodes corresponding to the first cluster; identifying data items included in the plurality of partition groups based on the mappings between the plurality of partition groups and the first plurality of nodes, each partition group corresponding to a node of the first plurality of nodes and comprising a subset of data items stored in the node; and loading the data items included in the plurality of partition groups onto the second plurality of nodes, the second plurality of nodes corresponding to the second cluster.
20. The non-transitory machine-readable medium of claim 19, further comprising recovering each of the first plurality of nodes by identifying the data items stored in the node based on the mappings between the partition groups and the first plurality of nodes.
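The threshold-gated deduplication recited in claims 5 and 6 can be sketched as follows. This is one hedged reading, not the claimed implementation: the claims do not define how the "degree of duplicates" is computed, so the fraction of scanned items that repeat an earlier item is an assumption, as are the function names `duplicate_degree` and `deduplicate` and the representation of data items as hashable tuples.

```python
def duplicate_degree(items):
    """Fraction of scanned items that repeat an earlier item.

    Assumption: the source does not define the metric; this is one
    plausible choice for illustration.
    """
    if not items:
        return 0.0
    return 1 - len(set(items)) / len(items)


def deduplicate(items, threshold):
    """Return deduplicated items only when the duplicate degree is
    greater than `threshold`; otherwise return the items unchanged.

    `threshold` stands in for the claims' "predetermined level of
    consistency". First-seen order is preserved.
    """
    if duplicate_degree(items) <= threshold:
        return list(items)
    # dict.fromkeys keeps the first occurrence of each item in order.
    return list(dict.fromkeys(items))


if __name__ == "__main__":
    # Replicated (key, value) records as they might appear in a scan.
    scanned = [("k1", "v1"), ("k1", "v1"), ("k2", "v2"), ("k1", "v1")]
    print(duplicate_degree(scanned))
    print(deduplicate(scanned, threshold=0.25))
```

In a replicated data store the same record typically appears on several nodes, so gating deduplication on a threshold avoids the cost of repackaging when few duplicates are present.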
Backup restoration techniques · CPC title
Information retrieval; Database structures therefor; File system structures therefor · CPC title
Hardware arrangements for backup · CPC title
Improving or facilitating administration, e.g. storage management · CPC title
Database-specific techniques · CPC title