Data processing method, data processing apparatus, and non-transitory computer-readable storage medium
US-2024320235-A1 · Sep 26, 2024 · US
US9529875B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9529875-B2 |
| Application number | US-201414153904-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 13, 2014 |
| Priority date | Jan 13, 2014 |
| Publication date | Dec 27, 2016 |
| Grant date | Dec 27, 2016 |
Official abstract text for this publication.
A system for transforming time series data into data that is accessible by a data warehouse identifies a data table comprising the time series data. The system creates a virtual view of the data table where the time series data is represented as at least one standard relational table in the virtual view, where the virtual view is presented as a virtual table. The system partitions the virtual table into a plurality of virtual partitions according to a time interval. The virtual table is partitioned across a data time range, where the data time range comprises at least one time interval, and where each of the plurality of virtual partitions has a respective partition time range that spans the time interval. The virtual partitions are created to optimize loading of the data into the data warehouse by incrementally refreshing the data according to the respective partition time range.
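The flow described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical source layout in which each row holds a growing array of `(timestamp, value)` readings; the names `SOURCE`, `virtual_view`, `partition_start`, and `virtual_partitions` are illustrative and do not come from the patent. The view is a lazy generator, so rows are presented in relational form without being copied into a table, and partitions are keyed by the start of the time interval that contains each reading.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical source: one row per sensor, readings kept in an array that grows over time.
SOURCE = [
    {"sensor": "a", "readings": [(datetime(2014, 1, 1, 0, 30), 1.0),
                                  (datetime(2014, 1, 1, 1, 15), 2.0)]},
    {"sensor": "b", "readings": [(datetime(2014, 1, 1, 0, 45), 3.0)]},
]


def virtual_view(source):
    """Lazily yield (sensor, timestamp, value) rows; nothing is materialized."""
    for row in source:
        for ts, value in row["readings"]:
            yield (row["sensor"], ts, value)


def partition_start(ts, interval, epoch=datetime(1970, 1, 1)):
    """Floor a timestamp to the start of the interval that contains it."""
    return epoch + ((ts - epoch) // interval) * interval


def virtual_partitions(source, interval):
    """Group the rows of the virtual view by the start of their time interval."""
    parts = defaultdict(list)
    for sensor, ts, value in virtual_view(source):
        parts[partition_start(ts, interval)].append((sensor, ts, value))
    return dict(parts)


hourly = virtual_partitions(SOURCE, timedelta(hours=1))
```

With a one-hour interval, the readings at 00:30 and 00:45 fall into the partition starting at 00:00 and the reading at 01:15 falls into the partition starting at 01:00. A refresh could then touch only the partitions whose time range contains new data, which is the kind of incremental loading the abstract describes.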
Opening claim text (preview).
What is claimed is:

1. A method for transforming time series data into data that is accessible by a data warehouse, implemented by a computing processor, the method comprising:
identifying, by the computing processor, a time series database, wherein the time series database comprises a column representing data collected from at least one source, wherein the column contains at least one array for storing the collected data, wherein the at least one array in a row grows larger in the time series database as new data is collected instead of adding a new row as the new data is collected, wherein the collected data is time series data;
creating, by the computing processor, a virtual view of the time series data table by transforming the collected data that is inaccessible by the data warehouse into time series data that is accessible by the data warehouse by representing the collected data from the time series database as at least one standard relational table in the virtual view, the virtual view presented as a virtual table, the virtual view stored the at least one standard relational table as an in-memory storage structure without any intermediate storage, wherein the time series data is not stored in the at least one standard relational table;
grouping, by the computing processor, the time series data in the virtual view into a plurality of virtual partitions according to time intervals, wherein the time intervals are defined by a user, wherein each of the plurality of virtual partitions is across a data time range, wherein the data time range comprises at least one time interval, wherein each of the plurality of virtual partitions has a respective partition time range that spans the at least one time interval, the virtual partitions created to optimize loading of the collected new data into the data warehouse by incrementally refreshing the collected new data according to a respective partition time range associated with each of the plurality of virtual partitions, wherein the virtual partitions allow the time series data in the virtual view to be incrementally refreshed according to the at least one time interval whereas the time series data in the virtual view otherwise could not be incrementally refreshed; and
displaying time series data that corresponds to a corresponding time interval by displaying a subset of the plurality of virtual partitions that span the corresponding time interval.

2. The method of claim 1 comprising:
providing, by the computing processor, the plurality of virtual partitions to the data warehouse for analysis of the data via a data accelerator, wherein selection of the plurality of virtual partitions, based on the data time range, optimizes analysis of the data.

3. The method of claim 1 wherein grouping, by the computing processor, the time series data in the virtual view into the plurality of virtual partitions according to the time interval comprises:
incrementally refreshing the data by extracting new time series data from the data table, according to the time interval;
creating a new virtual partition that spans the time interval, the new virtual partition having the respective partition time range; and
adding the new virtual partition to the plurality of virtual partitions.

4. The method of claim 1 wherein grouping, by the computing processor, the time series data in the virtual view into the plurality of virtual partitions according to the time interval comprises:
incrementally refreshing the data by identifying the data time range as representing a chosen view of the data, wherein the chosen view comprises a subset of the plurality of virtual partitions, wherein each of the subset of the plurality of virtual partitions has a respective partition time range that is within the data time range.

5. The method of claim 4 comprising:
detecting at least one virtual partition, within the subset of the plurality of virtual partitions, having the respective partition time range outside of the data time range; and
removing the at least one virtual partition from the subset of the plurality of virtual partitions, wherein the at least one virtual partition is no longer represented within the chosen view.

6. The method of claim 4 comprising:
detecting at least one virtual partition having the respective partition time range within the data time range, wherein the at least one virtual partition is not in the subset of the plurality of virtual partitions; and
adding the at least one virtual partition to the subset of the plurality of virtual partitions, wherein the at least one virtual partition is now represented within the chosen view.

7. The method of claim 1 wherein grouping, by the computing processor, the time series data in the virtual view into the plurality of virtual partitions according to the time interval comprises:
creating a partitioning calendar;
associating the time interval to the partitioning calendar;
assigning the partitioning calendar to the virtual table; and
partitioning the virtual table, using the partitioning calendar, wherein each partition spans the time interval.

8. The method of claim 1 wherein grouping, by the computing processor, the time series data in the virtual view into the plurality of virtual partitions according to the time interval comprises:
defining a time window selected to optimize an amount of relevant data that is loaded into the data warehouse;
associating the time interval to the time window; and
applying the time window to the virtual table to partition the virtual table into the plurality of virtual partitions according to the time interval.

9. A computer program product for transforming time series data into data that is accessible by a data warehouse, the computer program product comprising:
a computer readable storage medium having computer readable program code embodied therewith, the program code executable by a processor to:
identify, by the computing processor, a time series database, wherein the time series database comprises a column representing data collected from at least one source, wherein the column contains at least one array for storing the collected data, wherein the at least one array in a row grows larger in the time series database as new data is collected instead of adding a new row as the new data is collected, wherein the collected data is time series data;
create, by the computing processor, a virtual view of the time series data table by transforming the collected data that is inaccessible by the data warehouse into time series data that is accessible by the data warehouse, by representing the collected data from the time series database as at least one standard relational table in the virtual view, the virtual view presented as a virtual table, the virtual view stored as an in-memory storage structure without any intermediate storage, wherein the time series data is not stored in the at least one standard relational table;
group, by the computing processor, the time series data in the virtual view into a plurality of virtual partitions according to time intervals, wherein the time intervals are defined by a user, wherein each of the plurality of virtual partitions is across a data time range, wherein the data time range comprises at least one time interval, wherein each of the plurality of virtual partitions has a respective partition time range that spans the at least one time interval, the virtual partitions created to optimize loading of the collected new data into the data warehouse by incrementally refreshing the collected new data according to a respective partition time range associated with each of the plurality of virtual partitions, wherein the virtual partitions allow the time series data in the virtual
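Claims 4 to 6 describe keeping a "chosen view" in step with a data time range: partitions whose range falls outside the data time range are removed, and partitions inside it that are missing are added. The sketch below is one possible reading of that bookkeeping, not the patented implementation. It assumes each partition is identified by its start time and covers the half-open range `[start, start + interval)`; the function names are hypothetical.

```python
from datetime import datetime, timedelta


def partitions_in_range(partition_starts, interval, range_start, range_end):
    """Starts of partitions whose [start, start + interval) lies within the data time range."""
    return {s for s in partition_starts
            if s >= range_start and s + interval <= range_end}


def update_chosen_view(current, partition_starts, interval, range_start, range_end):
    """Return the new chosen view plus the partitions added and removed.

    `current` is the set of partition starts now in the chosen view.
    """
    wanted = partitions_in_range(partition_starts, interval, range_start, range_end)
    added = wanted - current      # within the range but not yet in the view
    removed = current - wanted    # in the view but now outside the range
    return wanted, added, removed


day = timedelta(days=1)
starts = [datetime(2014, 1, d) for d in range(1, 6)]  # daily partitions, Jan 1 to Jan 5

# Initial view covers Jan 1 up to (not including) Jan 4.
view = partitions_in_range(starts, day, datetime(2014, 1, 1), datetime(2014, 1, 4))

# Slide the data time range forward by one day.
view, added, removed = update_chosen_view(
    view, starts, day, datetime(2014, 1, 2), datetime(2014, 1, 5))
```

Sliding the range forward by one day drops the Jan 1 partition from the view and adds the Jan 4 partition, while the Jan 2 and Jan 3 partitions are left untouched. Only the changed partitions would need to be loaded or discarded.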
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title
Indexing; Data structures therefor; Storage structures · CPC title
Query optimisation · CPC title
Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP · CPC title
Unary operations; Data partitioning operations · CPC title