Storage device efficiency during data replication
US-2018081548-A1 · Mar 22, 2018 · US
US2018285201A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2018285201-A1 |
| Application number | US-201815937796-A |
| Country | US |
| Kind code | A1 |
| Filing date | Mar 27, 2018 |
| Priority date | Mar 28, 2017 |
| Publication date | Oct 4, 2018 |
| Grant date | — |
Official abstract text for this publication.
Systems and methods for performing backup and other secondary copy operations for large databases (e.g., “big data”), such as the Greenplum database, are described. In some cases, the systems and methods may maintain a second instance of a source database (e.g., Greenplum) using live synchronization (e.g., “Live Sync”), which performs incremental replication between a virtual machine containing a large database (e.g., a virtual machine containing a Greenplum database) and a synced copy of the virtual machine.
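The incremental replication mentioned in the abstract can be illustrated with a small block-level comparison. This is only a sketch under stated assumptions: the publication does not say how changed blocks are detected, so the SHA-256 comparison, the tiny `BLOCK_SIZE`, and the helper names `split_blocks`, `changed_blocks`, and `apply_delta` are all hypothetical and chosen for readability, not taken from the patent or from any product.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, deliberately tiny so the example is easy to trace


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(source: bytes, replica: bytes, size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Return {block index: source block} for blocks that differ or are missing on the replica."""
    src = split_blocks(source, size)
    dst = split_blocks(replica, size)
    delta: dict[int, bytes] = {}
    for idx, block in enumerate(src):
        # Assumption: digests are compared instead of raw bytes, as a remote
        # replica would typically exchange checksums rather than full blocks.
        if idx >= len(dst) or hashlib.sha256(block).digest() != hashlib.sha256(dst[idx]).digest():
            delta[idx] = block
    return delta


def apply_delta(replica: bytes, delta: dict[int, bytes], source_len: int,
                size: int = BLOCK_SIZE) -> bytes:
    """Patch the replica with the changed blocks and trim it to the source length."""
    blocks = split_blocks(replica, size)
    for idx, block in sorted(delta.items()):
        if idx < len(blocks):
            blocks[idx] = block      # overwrite a block that differs
        else:
            blocks.append(block)     # extend the replica with a new block
    return b"".join(blocks)[:source_len]


source = b"aaaabbbbccccdd"
replica = b"aaaaXXXXcccc"
delta = changed_blocks(source, replica)
synced = apply_delta(replica, delta, len(source))
```

Only the blocks in `delta` cross the wire in this model, which is the point of incremental replication: unchanged blocks of a large database are never re-sent.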
Opening claim text (preview).
What is claimed:
1. A method for maintaining a secondary copy of a large database at a virtual machine, the method comprising: performing a full backup of a primary copy of the large database, wherein the large database is running at a source virtual machine; identifying, based on metadata associated with the full backup of the primary copy of the large database, objects of the database that have changed since an initial synchronization of the large database between the primary copy at the source virtual machine and a secondary copy running at a destination virtual machine; restoring the identified objects of the large database that have changed since the initial synchronization of the large database using the full backup of the primary copy of the large database; and replicating the restored objects to the secondary copy of the large database contained at the destination virtual machine using live synchronization between the source virtual machine and the destination virtual machine.
2. The method of claim 1, further comprising: before performing the full backup performing a full synchronization between the primary copy of the large database at the source virtual machine and the secondary copy of the large database at the destination virtual machine.
3. The method of claim 1, wherein the large database is a Greenplum database, and wherein identifying objects of the large database that have changed since an initial synchronization of the large database includes identifying append only tables of the Greenplum database that have changed since the initial synchronization.
4. The method of claim 1, further comprising: performing one or more incremental backups after performance of the full backup of the primary copy of the large database; wherein identifying objects of the database that have changed since the initial synchronization of the large database includes identifying, within metadata associated with the one or more incremental backups of the primary copy of the large database, additional objects of the database that have changed since the performance of the full backup.
5. The method of claim 1, wherein replicating the restored objects to the secondary copy of the large database contained at the destination virtual machine using live synchronization includes performing continuous data replication on the restored objects during a running live synchronization.
6. The method of claim 1, wherein replicating the restored objects to the secondary copy of the large database contained at the destination virtual machine using live synchronization includes performing block-level replication on the restored objects during a running live synchronization.
7. The method of claim 1, further comprising: after identifying objects of the large database that have changed since an initial synchronization of the large database, updating entries of a changes index associated with a synchronization system to include information representative of the identified objects.
8. The method of claim 1, wherein performing a full backup of a primary copy of the large database includes performing a backup of a catalog of objects have changed within the large database that is managed by the large database.
9. The method of claim 1, wherein the full backup is performed on a daily schedule, and the replication is performed on a weekly schedule.
10. The method of claim 1, wherein replicating the restored objects to the secondary copy of the large database contained at the destination virtual machine including replication the restored objects includes replicating the restored objects using an enhanced data agent that is specific to the large database and installed at the source virtual machine.
11. A system, comprising: at least one processor; at least one data storage device coupled to the at least one processor and storing instructions for implementing a process to maintain a secondary copy of a large database at a virtual machine, wherein the process comprises: performing a full backup of a primary copy of the large database at a source virtual machine, identifying, within metadata associated with the full backup of the primary copy of the large database, objects of the database that have changed since an initial synchronization of the large database between the primary copy at the source virtual machine and a secondary copy at a destination virtual machine, restoring the identified objects of the large database that have changed since the initial synchronization of the large database using the full backup of the primary copy of the large database; and replicating the restored objects to the secondary copy of the large database at the destination virtual machine using live synchronization between the source virtual machine and the destination virtual machine.
12. The system of claim 11, wherein the process further comprises: before performing the full backup performing a full synchronization between the primary copy of the large database at the source virtual machine and the secondary copy of the large database at the destination virtual machine.
13. The system of claim 11, wherein the large database is a Greenplum database, and wherein identifying objects of the large database that have changed since an initial synchronization of the large database includes identifying append only tables of the Greenplum database that have changed since the initial synchronization.
14. The system of claim 11, wherein replicating the restored objects to the secondary copy of the large database contained at the destination virtual machine using live synchronization includes performing continuous data replication on the restored objects during a running live synchronization.
15. The system of claim 11, wherein replicating the restored objects to the secondary copy of the large database contained at the destination virtual machine using live synchronization includes performing block-level replication on the restored objects during a running live synchronization.
16. The system of claim 11, wherein the process further comprises: after identifying objects of the large database, that have changed since an initial synchronization of the large database, updating entries of a changes index associated with a synchronization system to include information representative of the identified objects.
17. The system of claim 11, wherein performing a full backup of a primary copy of the large database includes performing a backup of a catalog of objects have changed within the large database that is managed by the large database.
18. The system of claim 11, wherein replicating the restored objects to the secondary copy of the large database contained at the destination virtual machine including replication the restored objects includes replicating the restored objects using an enhanced data agent that is specific to the large database and installed at the source virtual machine.
19. A computer readable medium, excluding transitory propagating signals, storing instructions that, when executed by an information management system, cause the information management system to maintain synchronization between a Greenplum database stored at a source virtual machine and an instance of the Greenplum database stored at a destination virtual machine, the method comprising: creating a backup copy of the Greenplum database; identifying from the backup copy one or more objects of the Greenplum database that have changed since an ini
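The four steps recited in claim 1 (full backup, identifying changed objects from backup metadata, restoring them, and replicating them to the secondary copy) can be traced with a toy in-memory model. This is a minimal sketch, not the claimed implementation: the `Backup` class, the per-object version counter used as "metadata", the `synced_versions` dictionary standing in for a changes index, and every function name are hypothetical illustrations. Real databases, virtual machines, and live synchronization are replaced by plain dictionaries.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    # Hypothetical model: object name -> (version counter, payload).
    # The version counter plays the role of the backup metadata.
    objects: dict[str, tuple[int, bytes]]


def full_backup(primary: dict[str, tuple[int, bytes]]) -> Backup:
    """Step 1: copy every object of the primary database into a backup."""
    return Backup(objects=dict(primary))


def identify_changed(backup: Backup, synced_versions: dict[str, int]) -> list[str]:
    """Step 2: use backup metadata to find objects changed since the last sync."""
    return sorted(
        name
        for name, (version, _payload) in backup.objects.items()
        if synced_versions.get(name) != version
    )


def restore(backup: Backup, names: list[str]) -> dict[str, bytes]:
    """Step 3: restore only the identified objects from the backup."""
    return {name: backup.objects[name][1] for name in names}


def replicate(restored: dict[str, bytes], secondary: dict[str, bytes]) -> None:
    """Step 4: push the restored objects to the secondary copy."""
    secondary.update(restored)


def sync_cycle(primary, secondary, synced_versions) -> list[str]:
    """Run the four steps once and record what was synchronized."""
    backup = full_backup(primary)
    changed = identify_changed(backup, synced_versions)
    replicate(restore(backup, changed), secondary)
    for name in changed:
        synced_versions[name] = backup.objects[name][0]
    return changed


primary = {"sales": (2, b"v2"), "users": (1, b"u1"), "logs": (1, b"l1")}
secondary = {"sales": b"v1", "users": b"u1"}   # state after the initial sync
synced_versions = {"sales": 1, "users": 1}     # versions seen at the initial sync

first_run = sync_cycle(primary, secondary, synced_versions)
second_run = sync_cycle(primary, secondary, synced_versions)
```

In this model only `logs` (new) and `sales` (modified) are restored and replicated on the first cycle, and a second cycle with no further changes moves nothing. Objects deleted on the primary are not handled here; the sketch covers only the steps the claim names.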
Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title
Memory management, e.g. access or allocation · CPC title
Physics · mapped topic
Using snapshots, i.e. a logical point-in-time copy of the data · CPC title
Hypervisor-specific management and integration aspects · CPC title