Multiple subscriber data extraction for a change data capture (CDC)
US-12153597-B1 · Nov 26, 2024 · US
US2025371029A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025371029-A1 |
| Application number | US-202418679600-A |
| Country | US |
| Kind code | A1 |
| Filing date | May 31, 2024 |
| Priority date | May 31, 2024 |
| Publication date | Dec 4, 2025 |
| Grant date | — |
Official abstract text for this publication.
A computing system performs real-time replication of database management system (DBMS) transactions. The computing system includes a source DBMS, a replication service, and a data lakehouse. The source DBMS stores at least one source transaction table recording at least one source transaction and generates at least one recovery log indicating at least one modification in the at least one source transaction table. The replication service replicates the at least one source transaction recorded in the at least one source transaction table by generating at least one data file having a first data format. The data lakehouse stores at least one lakehouse table that corresponds to the at least one source transaction table and has a second data format different from the first data format, and it modifies the at least one lakehouse table based on the at least one data file.
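The flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `LogRecord` shape, the JSON-lines data file (standing in for the "first data format"), the in-memory `object_store` dictionary, and every function name are assumptions made for illustration. The lakehouse table is modelled as metadata only, a list of pointers to data files, standing in for the "second data format".

```python
import json
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    """One recovery-log entry (hypothetical shape): a row change in a source table."""
    table: str
    op: str      # "INSERT", "UPDATE", or "DELETE"
    key: int
    row: dict


@dataclass
class LakehouseTable:
    """Lakehouse table modelled as metadata: a list of pointers to data files."""
    name: str
    file_pointers: list = field(default_factory=list)


def build_data_file(records, object_store, path):
    """Replication-service step: write the row changes as one JSON-lines data file."""
    lines = [json.dumps({"op": r.op, "key": r.key, "row": r.row}) for r in records]
    object_store[path] = "\n".join(lines)
    return path


def apply_data_file(table, path):
    """Lakehouse step: modify the table by registering a pointer to the new file."""
    table.file_pointers.append(path)


def read_table(table, object_store):
    """Resolve the pointers and replay the changes in order to get the current rows."""
    rows = {}
    for path in table.file_pointers:
        for line in object_store[path].splitlines():
            change = json.loads(line)
            if change["op"] == "DELETE":
                rows.pop(change["key"], None)
            else:
                rows[change["key"]] = change["row"]
    return rows


# Usage: replicate four logged changes on a hypothetical "orders" table.
object_store = {}
orders = LakehouseTable("orders")
log = [
    LogRecord("orders", "INSERT", 1, {"amount": 10}),
    LogRecord("orders", "INSERT", 2, {"amount": 20}),
    LogRecord("orders", "UPDATE", 1, {"amount": 15}),
    LogRecord("orders", "DELETE", 2, {}),
]
apply_data_file(orders, build_data_file(log, object_store, "orders/file-0001.jsonl"))
current_rows = read_table(orders, object_store)
```

In this sketch the lakehouse table never holds row data itself; it changes only by gaining a pointer, which loosely mirrors the pointer language in the claims.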
Opening claim text (preview).
1. A computing system configured to perform real-time replication of database management system transactions, the computing system comprising: a source database management system (DBMS) configured to store at least one source transaction table recording at least one source transaction and to generate at least one recovery log indicating at least one modification in the at least one source transaction table; a replication service in signal communication with the source DBMS, the replication service including a first controller configured to replicate the at least one source transaction recorded in the at least one source transaction table by generating at least one data file having a first data format; a data lakehouse in signal communication with the replication service, the data lakehouse including a second controller configured to store at least one lakehouse table corresponding to the at least one source transaction table and having a second data format different from the first data format, and to modify the at least one lakehouse table based on the at least one data file, wherein the replication service propagates the at least one source transaction into the data lakehouse based on the at least one data file to generate at least one propagated source transaction; and wherein the data lakehouse generates the at least one lakehouse table including pointer that points to the at least one propagated source transaction propagated in the data lakehouse.
2. The computing system of claim 1, wherein: the replication service comprises an open format builder (OFB) configured to determine metadata corresponding to the at least one source transaction and to generate the at least one data file having an open data format; and the data lakehouse comprises an open table format (OTF) server configured to receive the metadata from the OFB and to modify the at least one lakehouse table based on the metadata.
3. The computing system of claim 2, wherein the replication service further comprises: a Capture module configured to extract the at least one source transaction from the at least one recovery log; an Apply module configured to receive the extracted at least one source transaction from the Capture module and to generate a parallel feed of independent batches of transactions corresponding to the at least one source transaction.
4. The computing system of claim 3, wherein the OFB is configured to determine at least one DBMS row change in the source transaction table associated based on the batches of transactions and to construct the at least one data file having the open data format indicating at least one row change in the source transaction table.
5. The computing system of claim 4, wherein the OFB adds timestamp data into the at least one data file having the open data format.
6. The computing system of claim 5, wherein adding the timestamp data includes creating a new column in the at least one source transaction table and inputting the timestamp data into the new column.
7. The computing system of claim 4, wherein the data lakehouse further comprises: an object storage unit configured to store the at least one data file having the open data format; and a metadata store configured to store the at least one lakehouse table.
8. The computing system of claim 7, wherein the metadata is stored in the at least one lakehouse table as listing pointers that point to the at least one data file stored in the object storage unit.
9. A computer-implemented method of performing real-time replication of database management system transactions into a data lakehouse, the method comprising: storing in a source database management system (DBMS) at least one source transaction table recording at least one source transaction; generating at least one recovery log indicating at least one modification in the at least one source transaction table; generating at least one data file having a first data format to replicate the at least one source transaction recorded in the at least one source transaction table; propagating the at least one source transaction into the data lakehouse based on the at least one data file to generate at least one propagated source transaction; generating at least one lakehouse table including pointer that points to the at least one propagated source transaction propagated in the data lakehouse, the at least one lakehouse table corresponding to the at least one replicated source transaction table and having a second data format different from the first data format, and storing at least one lakehouse table in the data lakehouse; and modifying the at least one lakehouse table based on the at least one data file.
10. The computer-implemented method of claim 9, further comprising: determining by an open format builder (OFB) metadata corresponding to the at least one source transaction and to generate the at least one data file having an open data format; receiving by an open table format (OTF) server the metadata from the OFB; and modifying the at least one lakehouse table based on the metadata.
11. The computer-implemented method of claim 10, further comprising: extracting by a Capture module the at least one source transaction from the at least one recovery log; receiving by an Apply module the extracted at least one source transaction from the Capture module; and generating by the Apply module a parallel feed of independent batches of transactions corresponding to the at least one source transaction.
12. The computer-implemented method of claim 11, further comprising: determining by the OFB at least one DBMS row change in the source transaction table associated based on the batches of transactions; and constructing the at least one data file having the open data format indicating at least one row change in the source transaction table.
13. The computer-implemented method of claim 12, further comprising adding timestamp data into the at least one data file having the open data format.
14. The computer-implemented method of claim 13, wherein adding the timestamp data includes creating a new column in the at least one source transaction table and inputting the timestamp data into the new column.
15. The computer-implemented method of claim 12, further comprising: storing the at least one data file having the open data format in an object storage unit; and storing the at least one lakehouse table in a metadata store.
16. The computer-implemented method of claim 15, wherein the metadata is stored in the at least one lakehouse table as listing pointers that point to the at least one data file stored in the object storage unit.
17. A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform operations comprising: storing in a source database management system (DBMS) at least one source transaction table recording at least one source transaction; generating at least one recovery log indicating at least one modification in the at least one source transaction table; generating at least one data file having a first data format to replicate the at least one source transaction recorded in the at least one source transaction table; propagating the at least one source transaction into the data lakehouse based on the at least one data file to generate at least one propagated source transaction; generating at least one lakehouse table including pointer that points to t
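The Apply-module and timestamp steps recited in the dependent claims (a parallel feed of independent batches, and timestamp data added to the data file) might look roughly like the sketch below. The claims do not say what makes batches independent; here the assumption is that transactions on different tables do not depend on each other. The dictionary shape of a transaction, the `replicated_at` column name, and all function names are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(transactions):
    """Group transactions by table, preserving log order inside each batch.

    Assumption: changes to different tables are independent of each other.
    """
    batches = defaultdict(list)
    for txn in transactions:
        batches[txn["table"]].append(txn)
    return dict(batches)


def build_rows(batch, captured_at):
    """Hypothetical OFB step: one output row per change, plus a timestamp column."""
    return [
        {**txn["row"], "op": txn["op"], "replicated_at": captured_at}
        for txn in batch
    ]


def process_in_parallel(transactions, captured_at):
    """Feed each independent batch to its own worker and collect rows per table."""
    batches = split_into_batches(transactions)
    with ThreadPoolExecutor() as pool:
        futures = {
            table: pool.submit(build_rows, batch, captured_at)
            for table, batch in batches.items()
        }
        return {table: future.result() for table, future in futures.items()}


# Usage: three captured changes across two hypothetical tables.
captured = [
    {"table": "orders", "op": "INSERT", "row": {"id": 1}},
    {"table": "customers", "op": "INSERT", "row": {"id": 7}},
    {"table": "orders", "op": "UPDATE", "row": {"id": 1}},
]
rows_by_table = process_in_parallel(captured, "2024-05-31T00:00:00Z")
```

Order is preserved only within a batch, so under the stated assumption an update never overtakes the insert it depends on, while unrelated tables proceed concurrently.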
Updates performed during online database operations; commit processing · CPC title
Asynchronous replication or reconciliation · CPC title
Synchronous replication · CPC title
Data format conversion from or to a database · CPC title
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title