Who is the assignee on this patent?

Bhide Manish A, Mittapalli Srinivas Kiran, Padmanabhan Sriram, and 1 more

What technology area does this patent fall under?

Primary CPC classification G06F16/254. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Apr 26 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Star and snowflake schemas in extract, transform, load processes

US9323815B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9323815-B2
Application number	US-201213618282-A
Country	US
Kind code	B2
Filing date	Sep 14, 2012
Priority date	Nov 9, 2011
Publication date	Apr 26, 2016
Grant date	Apr 26, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method, computer program product and a system for supporting star and snowflake data schemas for use with an Extract, Transform, Load (ETL) process, comprising selecting a data source comprising dimensional data, where the dimensional data comprises at least one source table comprising at least one source column, importing a data model for the dimensional data into a data integration system, analyzing the imported data model to select a star or snowflake target data schema comprising target dimensions and target facts, generating a meta-model representation by mapping at least one source table or source column to each target fact and target dimension, automatically converting the meta-model representation into one or more ETL jobs, and executing the ETL jobs to extract the dimensional data from the data source and loading the dimensional data into the selected target data schema in a target data system.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method of supporting star and snowflake data schemas for use with an Extract, Transform, Load (ETL) process, comprising: selecting a data source comprising dimensional data, wherein the dimensional data comprises a plurality of dimensions and a plurality of source tables each comprising at least one source column; importing a data model for the dimensional data into a data integration system, wherein the data model includes information pertaining to dimension tables and fact tables; analyzing the imported data model to determine a target data schema for individual dimensions of the dimensional data, wherein the target data schema is a star data schema or a snowflake data schema, and wherein the target data schema comprises target dimensions and target facts, and wherein analyzing the imported data model includes: analyzing the imported data model and determining a quantity of dimension tables for the individual dimensions of the dimensional data; determining the star data schema as the target data schema and mapping a dimension of the dimensional data to the star data schema in response to the analyzing indicating the dimension contains a single dimension table; and determining the snowflake data schema as the target data schema and mapping a dimension of the dimensional data to the snowflake data schema in response to the analyzing indicating the dimension contains a plurality of dimension tables; wherein at least one dimension of the dimensional data is mapped to the star data schema and at least one other dimension of the dimensional data is mapped to the snowflake data schema; generating a meta-model representation by, for each target fact, mapping at least one source table or source column to the target fact, and for each target dimension, mapping at least one source table or source column to the target dimension; automatically converting the meta-model representation into one or more ETL jobs; and executing the one or more ETL jobs to extract the dimensional data from the data source and load the dimensional data into the determined target data schema for corresponding dimensions in a target data system. 2. The method of claim 1 , further comprising: identifying a major dimension for the dimensional data, and a hierarchy for the major dimension. 3. The method of claim 2 , wherein the identifying the major dimension comprises: comparing the dimensional data to each template in a set of predefined dimension templates using a similarity identification algorithm that generates a similarity score; and selecting the template that has the highest similarity score as the major dimension. 4. The method of claim 3 , wherein the similarity score is determined using a Dynamic Time Warping (DTW) method. 5. The method of claim 1 , further comprising: mapping the target dimensions and target facts to predefined dimension templates using a similarity score. 6. The method of claim 1 , wherein the generating the meta-model representation further comprises, for each target fact and each target dimension: determining a similarity score measuring the similarity between the target fact or target dimension and each source table; determining a similarity score measuring the similarity between the target fact or target dimension and each source column; and selecting the source table or source column with the highest similarity score as the source table or source column that is mapped to the target fact or target dimension. 7. The method of claim 1 , wherein the generating the meta-model representation further comprises, for each target fact and each target dimension: determining a similarity score measuring the similarity between the target fact or target dimension and each source table; determining a similarity score measuring the similarity between the target fact or target dimension and each source column; presenting all of the similarity scores to a user; and receiving a user indication of a selected source table or source column as the source table or source column that is mapped to the target fact or target dimension.

Assignees

Inventors

Classifications

G06F16/254Primary
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title
G06F16/282
Hierarchical databases, e.g. IMS, LDAP data stores or Lotus Notes · CPC title
G06F16/283
Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP · CPC title
G06F17/30592
Physics · mapped topic
G06F17/30589
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 48224413

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9323815B2 cover?: A computer-implemented method, computer program product and a system for supporting star and snowflake data schemas for use with an Extract, Transform, Load (ETL) process, comprising selecting a data source comprising dimensional data, where the dimensional data comprises at least one source table comprising at least one source column, importing a data model for the dimensional data into a data…
Who is the assignee on this patent?: Bhide Manish A, Mittapalli Srinivas Kiran, Padmanabhan Sriram, and 1 more
What technology area does this patent fall under?: Primary CPC classification G06F16/254. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Apr 26 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).