Transforming timeseries and non-relational data to relational for complex and analytical query processing

US9529875B2 · US · B2

Patent metadata
Publication number: US-9529875-B2
Application number: US-201414153904-A
Country: US
Kind code: B2
Filing date: Jan 13, 2014
Priority date: Jan 13, 2014
Publication date: Dec 27, 2016
Grant date: Dec 27, 2016

Abstract

Official abstract text for this publication.

A system for transforming time series data into data that is accessible by a data warehouse identifies a data table comprising the time series data. The system creates a virtual view of the data table where the time series data is represented as at least one standard relational table in the virtual view, where the virtual view is presented as a virtual table. The system partitions the virtual table into a plurality of virtual partitions according to a time interval. The virtual table is partitioned across a data time range, where the data time range comprises at least one time interval, and where each of the plurality of virtual partitions has a respective partition time range that spans the time interval. The virtual partitions are created to optimize loading of the data into the data warehouse by incrementally refreshing the data according to the respective partition time range.
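The abstract's two core ideas, a relational virtual view over array-valued time series rows and time-interval partitioning of that view, can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `SeriesRow`, `virtual_view`, `partition_start`, and `virtual_partitions` are hypothetical, and the sketch assumes each stored row holds parallel arrays of timestamps and values for one source.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterator, List, Tuple

Record = Tuple[str, datetime, float]  # (source, timestamp, value)


@dataclass
class SeriesRow:
    """Hypothetical time series row: one source whose arrays grow as data arrives."""
    source: str
    timestamps: List[datetime]
    values: List[float]


def virtual_view(rows: List[SeriesRow]) -> Iterator[Record]:
    """Lazily expose the arrays as relational rows; nothing is written to storage."""
    for row in rows:
        for ts, val in zip(row.timestamps, row.values):
            yield (row.source, ts, val)


def partition_start(ts: datetime, origin: datetime, interval: timedelta) -> datetime:
    """Start of the interval-aligned partition that contains ts."""
    n = (ts - origin) // interval  # timedelta // timedelta yields an int
    return origin + n * interval


def virtual_partitions(
    rows: List[SeriesRow], origin: datetime, interval: timedelta
) -> Dict[datetime, List[Record]]:
    """Group the view's rows in memory by partition start time (illustrative only)."""
    parts: Dict[datetime, List[Record]] = {}
    for rec in virtual_view(rows):
        parts.setdefault(partition_start(rec[1], origin, interval), []).append(rec)
    return parts
```

With an hourly interval, readings at 00:00 and 00:30 would fall in one partition and a reading at 01:15 in the next, so a loader could refresh a single hour without rescanning the whole series.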

First claim

Opening claim text (preview).

What is claimed is:

1. A method for transforming time series data into data that is accessible by a data warehouse, implemented by a computing processor, the method comprising:
    identifying, by the computing processor, a time series database, wherein the time series database comprises a column representing data collected from at least one source, wherein the column contains at least one array for storing the collected data, wherein the at least one array in a row grows larger in the time series database as new data is collected instead of adding a new row as the new data is collected, wherein the collected data is time series data;
    creating, by the computing processor, a virtual view of the time series data table by transforming the collected data that is inaccessible by the data warehouse into time series data that is accessible by the data warehouse by representing the collected data from the time series database as at least one standard relational table in the virtual view, the virtual view presented as a virtual table, the virtual view stored the at least one standard relational table as an in-memory storage structure without any intermediate storage, wherein the time series data is not stored in the at least one standard relational table;
    grouping, by the computing processor, the time series data in the virtual view into a plurality of virtual partitions according to time intervals, wherein the time intervals are defined by a user, wherein each of the plurality of virtual partitions is across a data time range, wherein the data time range comprises at least one time interval, wherein each of the plurality of virtual partitions has a respective partition time range that spans the at least one time interval, the virtual partitions created to optimize loading of the collected new data into the data warehouse by incrementally refreshing the collected new data according to a respective partition time range associated with each of the plurality of virtual partitions, wherein the virtual partitions allow the time series data in the virtual view to be incrementally refreshed according to the at least one time interval whereas the time series data in the virtual view otherwise could not be incrementally refreshed; and
    displaying time series data that corresponds to a corresponding time interval by displaying a subset of the plurality of virtual partitions that span the corresponding time interval.

2. The method of claim 1 comprising:
    providing, by the computing processor, the plurality of virtual partitions to the data warehouse for analysis of the data via a data accelerator, wherein selection of the plurality of virtual partitions, based on the data time range, optimizes analysis of the data.

3. The method of claim 1 wherein grouping, by the computing processor, the time series data in the virtual view into the plurality of virtual partitions according to the time interval comprises:
    incrementally refreshing the data by extracting new time series data from the data table, according to the time interval;
    creating a new virtual partition that spans the time interval, the new virtual partition having the respective partition time range; and
    adding the new virtual partition to the plurality of virtual partitions.

4. The method of claim 1 wherein grouping, by the computing processor, the time series data in the virtual view into the plurality of virtual partitions according to the time interval comprises:
    incrementally refreshing the data by identifying the data time range as representing a chosen view of the data, wherein the chosen view comprises a subset of the plurality of virtual partitions, wherein each of the subset of the plurality of virtual partitions has a respective partition time range that is within the data time range.

5. The method of claim 4 comprising:
    detecting at least one virtual partition, within the subset of the plurality of virtual partitions, having the respective partition time range outside of the data time range; and
    removing the at least one virtual partition from the subset of the plurality of virtual partitions, wherein the at least one virtual partition is no longer represented within the chosen view.

6. The method of claim 4 comprising:
    detecting at least one virtual partition having the respective partition time range within the data time range, wherein the at least one virtual partition is not in the subset of the plurality of virtual partitions; and
    adding the at least one virtual partition to the subset of the plurality of virtual partitions, wherein the at least one virtual partition is now represented within the chosen view.

7. The method of claim 1 wherein grouping, by the computing processor, the time series data in the virtual view into the plurality of virtual partitions according to the time interval comprises:
    creating a partitioning calendar;
    associating the time interval to the partitioning calendar;
    assigning the partitioning calendar to the virtual table; and
    partitioning the virtual table, using the partitioning calendar, wherein each partition spans the time interval.

8. The method of claim 1 wherein grouping, by the computing processor, the time series data in the virtual view into the plurality of virtual partitions according to the time interval comprises:
    defining a time window selected to optimize an amount of relevant data that is loaded into the data warehouse;
    associating the time interval to the time window; and
    applying the time window to the virtual table to partition the virtual table into the plurality of virtual partitions according to the time interval.

9. A computer program product for transforming time series data into data that is accessible by a data warehouse, the computer program product comprising:
    a computer readable storage medium having computer readable program code embodied therewith, the program code executable by a processor to:
    identify, by the computing processor, a time series database, wherein the time series database comprises a column representing data collected from at least one source, wherein the column contains at least one array for storing the collected data, wherein the at least one array in a row grows larger in the time series database as new data is collected instead of adding a new row as the new data is collected, wherein the collected data is time series data;
    create, by the computing processor, a virtual view of the time series data table by transforming the collected data that is inaccessible by the data warehouse into time series data that is accessible by the data warehouse, by representing the collected data from the time series database as at least one standard relational table in the virtual view, the virtual view presented as a virtual table, the virtual view stored as an in-memory storage structure without any intermediate storage, wherein the time series data is not stored in the at least one standard relational table;
    group, by the computing processor, the time series data in the virtual view into a plurality of virtual partitions according to time intervals, wherein the time intervals are defined by a user, wherein each of the plurality of virtual partitions is across a data time range, wherein the data time range comprises at least one time interval, wherein each of the plurality of virtual partitions has a respective partition time range that spans the at least one time interval, the virtual partitions created to optimize loading of the collected new data into the data warehouse by incrementally refreshing the collected new data according to a respective partition time range associated with each of the plurality of virtual partitions, wherein the virtual partitions allow the time series data in the virtual
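The dependent method claims describe three mechanics that a short sketch can make concrete: a partitioning calendar that assigns each timestamp to a partition time range (claim 7), an incremental refresh that loads only new data and creates partitions as needed (claim 3), and a chosen view that keeps only the partitions inside a data time range (claims 4 through 6). The Python below is a hedged illustration under stated assumptions, not the claimed implementation. `PartitionCalendar`, `PartitionedView`, `refresh`, and `chosen_view` are hypothetical names, and the watermark logic assumes records arrive in timestamp order across refreshes.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Optional, Set, Tuple

Record = Tuple[str, datetime, float]   # (source, timestamp, value)
Range = Tuple[datetime, datetime]      # half-open [start, end)


class PartitionCalendar:
    """Hypothetical calendar: fixed-width partition ranges counted from an origin."""

    def __init__(self, origin: datetime, interval: timedelta) -> None:
        self.origin = origin
        self.interval = interval

    def range_for(self, ts: datetime) -> Range:
        n = (ts - self.origin) // self.interval
        start = self.origin + n * self.interval
        return (start, start + self.interval)


class PartitionedView:
    """In-memory partitions keyed by time range, refreshed incrementally."""

    def __init__(self, calendar: PartitionCalendar) -> None:
        self.calendar = calendar
        self.partitions: Dict[Range, List[Record]] = {}
        self.watermark: Optional[datetime] = None  # newest timestamp already loaded

    def refresh(self, records: Iterable[Record]) -> Set[Range]:
        """Load only records newer than the watermark; return the ranges touched."""
        cutoff = self.watermark  # fixed for the whole batch (simplifying assumption)
        touched: Set[Range] = set()
        for rec in records:
            ts = rec[1]
            if cutoff is not None and ts <= cutoff:
                continue  # already loaded by an earlier refresh
            rng = self.calendar.range_for(ts)
            self.partitions.setdefault(rng, []).append(rec)  # creates new partitions
            touched.add(rng)
            if self.watermark is None or ts > self.watermark:
                self.watermark = ts
        return touched

    def chosen_view(self, start: datetime, end: datetime) -> List[Range]:
        """Partition ranges lying entirely within the data time range [start, end)."""
        return sorted(r for r in self.partitions if r[0] >= start and r[1] <= end)
```

Calling `chosen_view` again with a shifted data time range drops partitions that have fallen outside the range and picks up partitions that now fall inside it, which is one simple way to read the add and remove steps of claims 5 and 6.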

Assignees

IBM

Inventors

Classifications

  • G06F16/254 (Primary): Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

  • Indexing; Data structures therefor; Storage structures

  • Query optimisation

  • Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

  • Unary operations; Data partitioning operations

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9529875B2 cover?
A system for transforming time series data into data that is accessible by a data warehouse identifies a data table comprising the time series data. The system creates a virtual view of the data table where the time series data is represented as at least one standard relational table in the virtual view, where the virtual view is presented as a virtual table. The system partitions the virtual t…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/254. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 27, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).