Data transfer using snapshot differencing from edge system to core system
US-11392541-B2 · Jul 19, 2022 · US
US11934362B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11934362-B2 |
| Application number | US-202117382344-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 22, 2021 |
| Priority date | Jul 22, 2021 |
| Publication date | Mar 19, 2024 |
| Grant date | Mar 19, 2024 |
Official abstract text for this publication.
Embodiments enable granular migration of data with high efficiency. A defined metadata element, a tag, is assigned to each file, and tag filtering is then used to direct the data to the proper location. Files with different tags can be selected for transfer; such a group of tags is referred to as a tag set. Embodiments can be used with a defined backup system file migration process, such as that present in the Data Domain File System. By using snapshots, ingestion of new data (ingested files) can continue while the migration is in progress, and data consistency is maintained at the same time. This is achieved by performing operations on B+ Tree snapshots in conjunction with tag filtering on keys present in the leaf pages of these structures. This method is efficient because it makes a single-pass walk of a B+ Tree, in contrast with previous methods that look up files one by one via their pathname.
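The single-pass tag filtering described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: leaf pages are modeled as plain lists of `((parent_id, child_id), (tag, name))` entries already in key order, and `walk_and_filter` is a hypothetical helper, not a Data Domain File System API.

```python
from typing import FrozenSet, Iterable, Iterator, List, Tuple

# Hypothetical stand-ins for B+ Tree leaf contents (not the real on-disk layout).
Key = Tuple[int, int]        # (parent_id, child_id)
Payload = Tuple[int, str]    # (tag, file name)
LeafPage = List[Tuple[Key, Payload]]


def walk_and_filter(
    leaf_pages: Iterable[LeafPage], tag_set: FrozenSet[int]
) -> Iterator[Tuple[Key, Payload]]:
    """Visit every leaf entry once, in key order, and yield only the
    entries whose tag belongs to the requested tag set."""
    for page in leaf_pages:
        for key, (tag, name) in page:
            if tag in tag_set:
                yield key, (tag, name)


# Two leaf pages of a toy snapshot; tag 7 marks the files to transfer.
pages = [
    [((1, 2), (7, "a.dat")), ((1, 3), (9, "b.dat"))],
    [((2, 4), (7, "c.dat")), ((2, 5), (0, "d.dat"))],
]
selected = [name for _, (_, name) in walk_and_filter(pages, frozenset({7}))]
```

Because each leaf entry is examined exactly once, the cost grows with the number of keys in the snapshot rather than with the number of per-pathname lookups that a file-by-file approach would need.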
Opening claim text (preview).
What is claimed is:
1. A method of providing granular migration of data between a source and a destination in a network, comprising: tagging files to be migrated within an MTree of the source by attaching a tag comprising a metadata element to a respective file to create a tagged file; storing the tag as a 32-bit tag in a payload of a child key of a key/value pair, wherein the child key is tagged as a ‘migrate’ or ‘no migrate’ file; taking a first snapshot to gather all changes to a current time to an active MTree of the files including files to be transferred; migrating only files tagged as ‘migrate’ in the MTree from the source to the destination; write-locking the active MTree on the source; taking a second snapshot of the MTree; performing an MTree differencing operation to capture any changes that occurred while taking the first snapshot and migrating the MTree; maintaining the tag along with other tags in a tag set comprising a list of file tags and a list length; generating, for each of the first and second snapshot, a granular migration (GMIG) data structure in a umbrella tree (Utree) having a higher level tree structure than the MTree and containing the tag set used in migrating the MTree; deleting the migrated files from the active MTree in the source; and releasing the write lock from the active MTree on the source.
2. The method of claim 1 wherein each of the first snapshot and the second snapshot are consistent, frozen-in-time logical copy of an MTree B+ Tree, and wherein the Utree contains pointers to the MTree and the first snapshot and second snapshot.
3. The method of claim 1 further comprising: using a GMIG Utree attribute for the tag set for a completed migration; and maintaining a transfer checksum comprising the keys and associated payload sent in the migrated files, wherein the migrating moves the migrated files at a certain granularity with respect to dataset size.
4. The method of claim 3 further comprising performing a tag filtering process to move files within the network based on their tags using an MTree walk operation to identify the tags.
5. The method of claim 1 wherein only files tagged as ‘migrate’ are moved from the source to the destination the migration is a symmetric migration in which identical directory structures are used in both the source and the destination.
6. The method of claim 1 wherein the tag is applied by a data backup program executed in the network to files as they are created or modified.
7. The method of claim 6 wherein the tag identifies only those sub-directory granular portions of files that need to be moved in the migrating step as identified by the MTree differencing operation, such that only the files which have changed since a last migration are moved, and wherein the granularity is on the level of sub-directory or sub-sub-directory level.
8. A method migrating data from a source to a destination in a network, comprising assigning a tag to each file in a directory based repository of the source, the tag indicating whether or not each corresponding file is to be migrated or not migrated, the repository organized as an MTree structure, wherein the tag is applied by a data backup program executed in the network to files as they are created or modified and identifies only those sub-directory granular portions of files that need to be moved in the migrating as identified by an MTree differencing operation, such that only the files which have changed since a last migration are moved; attaching the tag to a respective file to create a tagged file as a payload of a child key of a key value pair; walking the MTree to identify the tags; performing a tag filtering process on the identified tags to direct the file data to a proper location of the destination for files tagged to be migrated; taking a series of snapshots of the MTree to allow processing of the file data to continue while any migration is in process to maintain data consistency in the network; maintaining the tag along with other tags in a tag set comprising a list of file tags and a list length; and generating, for each snapshot of the series of snapshots, a granular migration (GMIG) data structure in a umbrella tree (Utree) having a higher level tree structure than the MTree and containing the tag set used in migrating the files.
9. The method of claim 8 wherein the tag having a format of parent_ID:child_ID, and further wherein the Utree contains pointers to the MTree and each snapshot, the method further comprising: using a GMIG Utree attribute for the tag set for a completed migration; and maintaining a transfer checksum comprising the keys and associated payload sent in the migrated files, wherein the migrating moves the migrated files at a certain granularity with respect to dataset size.
10. The method of claim 8 wherein each snapshot of the series of snapshots are consistent, frozen-in-time logical copy of an MTree B+ Tree of the source, and further wherein the migration is a symmetric migration in which identical directory structures are used in both the source and the destination.
11. A system migrating data from a source to a destination in a network, comprising a repository of the source storing data organized in a hierarchy comprising directories and sub-directories including individual files; a hardware-based granular migration processing component assigning a tag to each file in a directory, the tag indicating whether or not each corresponding file is to be migrated or not migrated, and walking the MTree to identify the tags, wherein the tag is applied by a data backup program executed in the network to files as they are created or modified, and wherein the tag identifies only those sub-directory granular portions of files that need to be moved in the migrating as identified by an MTree differencing operation, such that only the files which have changed since a last migration are moved, and further wherein the tag comprises a metadata element attached to a respective file to create a tagged file as a payload of a child key of a key/value pair comprising a parent_ID:child_ID; a tag filtering processing component performing a tag filtering process on the identified tags to direct the file data to a proper location of the destination for files tagged to be migrated; and a backup processing component taking a series of snapshots of the MTree to allow processing of the file data to continue while any migration is in process to maintain data consistency in the network, and maintaining the tag along with other tags in a tag set comprising a list of file tags and a list length, and further generating, for each snapshot of the series of snapshots, a granular migration (GMIG) data structure in a umbrella tree (Utree) having a higher level tree structure than the MTree and containing the tag set used in migrating the files.
12. The system of claim 11 wherein each snapshot of the series of snapshots are consistent, frozen-in-time logical copy of an MTree B+ Tree of the source, and further wherein the Utree contains pointers to the MTree and each snapshot, and further wherein a GMIG Utree attribute is used for the tag set for a completed migration, and a transfer checksum is maintained comprising the keys and associated payload sent in the migrated files, wherein the migrating moves the migrated files at a certain granularity with respect to dataset size.
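The order of operations recited in claim 1 (first snapshot, bulk transfer, write lock, second snapshot, differencing, cleanup, unlock) can be sketched in Python as follows. This is only an illustration under stated assumptions: a dictionary copy stands in for a B+ Tree snapshot, `threading.Lock` stands in for the MTree write lock, and the names `MIGRATE`, `NO_MIGRATE`, `snapshot`, `diff`, `migrate`, and `ingest` are hypothetical. The GMIG/Utree bookkeeping and the transfer checksum are omitted.

```python
import threading
from typing import Callable, Dict, Optional, Set, Tuple

MIGRATE, NO_MIGRATE = 1, 0     # hypothetical tag values
Key = Tuple[int, int]          # (parent_id, child_id)
Entry = Tuple[int, bytes]      # (tag, file data)
Tree = Dict[Key, Entry]


def snapshot(tree: Tree) -> Tree:
    """Frozen-in-time logical copy; a dict copy stands in for a B+ Tree snapshot."""
    return dict(tree)


def diff(old: Tree, new: Tree) -> Tuple[Tree, Set[Key]]:
    """Return (added or changed entries, deleted keys) between two snapshots."""
    changed = {k: v for k, v in new.items() if old.get(k) != v}
    deleted = set(old) - set(new)
    return changed, deleted


def migrate(active: Tree, dest: Tree, lock: threading.Lock,
            ingest: Optional[Callable[[Tree], None]] = None) -> None:
    snap1 = snapshot(active)                      # first snapshot
    for key, (tag, data) in snap1.items():        # bulk transfer of tagged files
        if tag == MIGRATE:
            dest[key] = (tag, data)
    if ingest is not None:
        ingest(active)                            # simulates writes arriving meanwhile
    with lock:                                    # write-lock the active tree
        snap2 = snapshot(active)                  # second snapshot
        changed, deleted = diff(snap1, snap2)     # differencing catches late changes
        for key, (tag, data) in changed.items():
            if tag == MIGRATE:
                dest[key] = (tag, data)
            else:
                dest.pop(key, None)               # retagged as 'no migrate'
        for key in deleted:
            dest.pop(key, None)
        for key, (tag, _) in snap2.items():       # delete migrated files at the source
            if tag == MIGRATE:
                del active[key]
    # leaving the 'with' block releases the write lock


# Example: one file is written while the bulk transfer is under way.
active = {(1, 1): (MIGRATE, b"a"), (1, 2): (NO_MIGRATE, b"b")}
dest: Tree = {}
migrate(active, dest, threading.Lock(),
        ingest=lambda tree: tree.update({(1, 3): (MIGRATE, b"c")}))
```

In this toy run the file written during the transfer is picked up by the differencing step, so the destination ends up holding both ‘migrate’ files while the source keeps only the ‘no migrate’ file.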
Database migration support · CPC title
Trees, e.g. B+trees · CPC title
Change logging, detection, and notification (replication G06F16/27) · CPC title
using data annotations, e.g. user-defined metadata · CPC title
Entity relationship models · CPC title