Dynamically switching between data sources
US-10360231-B2 · Jul 23, 2019 · US
US11036752B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11036752-B2 |
| Application number | US-201615156992-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 17, 2016 |
| Priority date | Jul 6, 2015 |
| Publication date | Jun 15, 2021 |
| Grant date | Jun 15, 2021 |
Official abstract text for this publication.
In various embodiments, a data integration system is disclosed which enables incremental loads into a data warehouse by developing a data partitioning plan and selectively disabling and enabling indexes to facilitate incremental loads into fact tables.
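The flow the abstract describes (work out which partitions the incoming data touches, disable indexes on those partitions only, load, then re-enable) can be illustrated with a small in-memory sketch. This is not the patented implementation: the `Partition` class, the `partition_key` function, the month-based partitioning on a `sale_date` attribute, and the `disable_count` counter are all assumptions made for illustration, and a real warehouse would issue index-maintenance statements rather than flip a boolean.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical model of one fact-table partition with a local index."""
    name: str
    rows: list = field(default_factory=list)
    index_enabled: bool = True
    disable_count: int = 0  # how many times the index was disabled


def partition_key(record: dict) -> str:
    # Assumed partitioning strategy: one partition per month of 'sale_date'.
    return record["sale_date"][:7]  # 'YYYY-MM'


def incremental_load(staging: list, fact: dict) -> list:
    """Load staged rows into `fact`, toggling indexes only where needed.

    Returns the sorted names of the affected partitions.
    """
    # 1. Before loading, group staged rows by the partition they will land in.
    by_partition = defaultdict(list)
    for record in staging:
        by_partition[partition_key(record)].append(record)

    # Create any partition that does not exist yet.
    for name in by_partition:
        fact.setdefault(name, Partition(name))

    # 2. Disable indexes on the affected partitions only.
    for name in by_partition:
        fact[name].index_enabled = False
        fact[name].disable_count += 1

    try:
        # 3. Load the rows while the affected indexes are disabled.
        for name, records in by_partition.items():
            fact[name].rows.extend(records)
    finally:
        # 4. Re-enable the indexes, even if the load failed part-way.
        for name in by_partition:
            fact[name].index_enabled = True

    return sorted(by_partition)
```

In this sketch, partitions that receive no staged rows are never touched, which is the point of deciding the affected set ahead of the load.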
Opening claim text (preview).
What is claimed is:

1. A method, comprising: receiving, by a computing system, a data dictionary that specifies a partitioning strategy and defines a structure of one or more staging tables; receiving, from a source of a plurality of sources, data to be incrementally loaded into a data warehouse; determining which of the plurality of sources is to be the source of the data to be incrementally loaded; in accordance with a determination that the source is to be a local data store: switching, at runtime, from a first transport mode to a second mode, the switching to the second transport mode comprising switching from the source being a server to the source being a local data store; storing the received data in the one or more staging tables; determining, by the computing system, a plurality of partitions using the partitioning strategy specified by the data dictionary; enabling one or more indexes of the plurality of partitions; interpreting, by the computing system, the data dictionary to identify the data to be incrementally loaded into the data warehouse; executing a query to load one or more working tables from the one or more staging tables specified by the data dictionary; determining, by the computing system and in advance of incrementally loading the data, which partitions of the plurality of partitions are to be affected by the data to be incrementally loaded into the data warehouse based at least in part on analyzing an attribute of the data stored in the one or more staging tables, by at least determining, in advance of incrementally loading the data, a target partitioning strategy, the target partitioning strategy specifying: a record of the one or more staging tables to be refreshed into a target table; and a first data partition to contain the record after refreshing based at least in part on a match of the attribute with the affected partitions of the target table, the attribute associated with the record of the one or more staging tables to be refreshed into the target table; selectively disabling the one or more indexes on each affected partition of the plurality of partitions in order to load the data into the data warehouse by at least applying the target partitioning strategy to disable an index of the first data partition while loading the record into the target table; and re-enabling the one or more indexes on each affected partition of the plurality of partitions after the data is loaded into the data warehouse.

2. The method of claim 1, wherein determining the plurality of partitions using the partitioning strategy specified by the data dictionary comprises determining one or more sub-partitions of a partition of the plurality of partitions.

3. The method of claim 1, wherein determining which partitions of the plurality of partitions are affected by the data comprises determining which partitions comprise new data.

4. The method of claim 1, wherein determining which partitions of the plurality of partitions are affected by the data comprises determining which partitions have changed data.

5. The method of claim 1, wherein selectively disabling the one or more indexes on each affected partition of the plurality of partitions in order to load the data into the data warehouse comprises disabling bitmap indexes on affected partitions.

6. The method of claim 1, wherein the one or more working tables are loaded from the one or more staging tables specified by the data dictionary prior to determining which partitions are affected by the data.

7. The method of claim 6, wherein the affected partitions are stored in a fact table, and further comprising merging the one or more working tables into the fact table.

8. A non-transitory computer-readable medium storing program code that when executed by a processor of a computing system causes the processor to perform operations comprising: receiving a data dictionary that specifies a partitioning strategy and defines a structure of one or more staging tables; receiving, from a source of a plurality of sources, data to be incrementally loaded into a data warehouse; determining which of the plurality of sources is to be the source of the data to be incrementally loaded; in accordance with a determination that the source is to be a local data store: switching, at runtime, from a first transport mode to a second mode, the switching to the second transport mode comprising switching from the source being a server to the source being a local data store; storing the received data in the one or more staging tables; determining a plurality of partitions using the partitioning strategy specified by the data dictionary; enabling one or more indexes of the plurality of partitions; interpreting the data dictionary to identify the data to be incrementally loaded into the data warehouse; executing a query to load one or more working tables from the one or more staging tables specified by the data dictionary; determining, in advance of incrementally loading the data, which partitions of the plurality of partitions are to be affected by the data to be incrementally loaded into the data warehouse based at least in part on analyzing an attribute of the data stored in the one or more staging tables, by at least determining, in advance of incrementally loading the data, a target partitioning strategy, the target partitioning strategy specifying: a record of the one or more staging tables to be refreshed into a target table; and a first data partition to contain the record after refreshing based at least in part on a match of the attribute with the affected partitions of the target table, the attribute associated with the record of the one or more staging tables to be refreshed into the target table; selectively disabling the one or more indexes on each affected partition of the plurality of partitions in order to load the data into the data warehouse by at least applying the target partitioning strategy to disable an index of the first data partition while loading the record into the target table; and re-enabling the one or more indexes on each affected partition of the plurality of partitions after the data is loaded into the data warehouse.

9. The non-transitory computer-readable medium of claim 8, wherein determining the plurality of partitions using the partitioning strategy specified by the data dictionary comprises determining one or more sub-partitions of a partition of the plurality of partitions.

10. The non-transitory computer-readable medium of claim 8, wherein determining which partitions of the plurality of partitions are affected by the data comprises determining which partitions comprise new data.

11. The non-transitory computer-readable medium of claim 8, wherein determining which partitions of the plurality of partitions are affected by the data comprises determining which partitions have changed data.

12. The non-transitory computer-readable medium of claim 8, wherein determining which partitions of the plurality of partitions are affected comprises: scanning the staging table to identify incoming changes; and determining local indexes corresponding to the affected partitions.

13. The computer-readable medium of claim 12, wherein the staging table is scanned by a loading knowledge module (LKM) to identify incoming changes, wherein the LKM implements a reusable loading strategy, a reusable transformation strategy, and a reusable extract, load, and transform (ELT) strategy, and wherein the reusable loading strategy can be developed for a first fact table and reused for different fact tables.

14. A system, comprising: a memory configured
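The source-selection step recited in the independent claims (switching at runtime from a transport mode that reads from a server to one that reads from a local data store, then storing the received data in staging tables) might look roughly like the sketch below. The claims do not name the transport modes or any API, so the mode labels `"server"` and `"local"`, the `Extractor` class, and both reader functions are hypothetical stand-ins.

```python
from typing import Callable, Dict, List

Record = dict
Reader = Callable[[], List[Record]]


def read_from_server() -> List[Record]:
    # Stand-in for a remote extraction; a real reader would query a server.
    return [{"id": 1, "origin": "server"}]


def read_from_local_store() -> List[Record]:
    # Stand-in for reading a locally stored extract of the same data.
    return [{"id": 1, "origin": "local"}]


class Extractor:
    """Holds one reader per transport mode and can switch between them at runtime."""

    def __init__(self, readers: Dict[str, Reader], mode: str) -> None:
        if mode not in readers:
            raise ValueError(f"unknown transport mode: {mode}")
        self._readers = readers
        self.mode = mode

    def switch_mode(self, mode: str) -> None:
        # Changing the mode changes which source the next extraction uses.
        if mode not in self._readers:
            raise ValueError(f"unknown transport mode: {mode}")
        self.mode = mode

    def extract_to_staging(self, staging: List[Record]) -> None:
        # Append whatever the currently selected source returns to staging.
        staging.extend(self._readers[self.mode]())


readers = {"server": read_from_server, "local": read_from_local_store}
extractor = Extractor(readers, mode="server")

# A determination that the source should be the local data store
# triggers the switch before any data is pulled.
extractor.switch_mode("local")

staging_table: List[Record] = []
extractor.extract_to_staging(staging_table)
```

Keeping the readers behind a single mode lookup means the downstream staging and partitioning steps do not need to know which source supplied the rows.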
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title
Ensuring data consistency and integrity · CPC title
Schema design and management · CPC title