NoSQL relational database (RDB) data movement

US9607063B1 · US · B1

Patent metadata
Publication number: US-9607063-B1
Application number: US-201514965480-A
Country: US
Kind code: B1
Filing date: Dec 10, 2015
Priority date: Dec 10, 2015
Publication date: Mar 28, 2017
Grant date: Mar 28, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Automatically moving NoSQL data store to a relational database system. Based on discovered data structure schema of a NoSQL file and query plans, attribute usage and association relationships may be determined. Trunk tables may be defined based on trunk part of the data structure schema determined based on the attribute usage. Trunk tables are validated and relational database tables are generated that correspond to the trunk tables. NoSQL trunk template is generated based on the trunk tables. The relational database tables are loaded with data filtered from the NoSQL file according to the NoSQL trunk template.
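The trunk selection and filtering steps summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes each query plan has already been reduced to a list of the attribute names it touches, that records are flat dictionaries, and that every record carries a key attribute; the names `attribute_usage`, `trunk_attributes`, `split_record`, and `key_field` are hypothetical, and the threshold value is chosen only for illustration. Routing non-trunk attributes to key-value pairs follows the overflow table described in the claims.

```python
from collections import Counter


def attribute_usage(query_plans):
    """Count how often each attribute name occurs across the query plans.

    Assumption: each plan is a list of attribute names it references.
    """
    usage = Counter()
    for plan in query_plans:
        usage.update(plan)
    return usage


def trunk_attributes(usage, threshold):
    """Attributes that occur at least `threshold` times form the trunk part."""
    return {attr for attr, count in usage.items() if count >= threshold}


def split_record(record, trunk, key_field="id"):
    """Split one record into a trunk row and overflow key-value rows.

    The trunk row keeps the key plus every trunk attribute; every other
    attribute becomes a (key, attribute name, value) overflow row.
    """
    trunk_row = {k: v for k, v in record.items() if k == key_field or k in trunk}
    overflow_rows = [
        (record.get(key_field), k, v)
        for k, v in record.items()
        if k != key_field and k not in trunk
    ]
    return trunk_row, overflow_rows
```

With plans `[["name", "age"], ["name"], ["name", "city"], ["age"]]` and a threshold of 2, the trunk is `name` and `age`; a record's `city` value would land in the overflow rows.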

First claim

Opening claim text (preview).

We claim:

1. A computer-implemented method of automatically moving NoSQL data store to a relational database system, the method performed on one or more processors, comprising:

    receiving NoSQL file stored in a storage device;

    discovering data structure schema of the NoSQL file, wherein the data structure schema of the NoSQL file that is discovered comprises objects and arrays, and objects and arrays embedded in one another;

    receiving one or more query plans associated with one or more records in the NoSQL file;

    determining attribute usage associated with the data structure schema and association relationships in the data structure schema based on the one or more query plans;

    determining trunk part of the data structure schema based on the attribute usage, the trunk part comprising attributes that occur in the query plan at least a defined threshold number of times;

    defining one or more trunk tables corresponding to the trunk part based on a rule-based table generation; wherein the rule-based table generation for an embedded object, comprises unfolding each attribute individually, adding a predefined character between names at each level, for the arrays, the rule-based table generation comprises, extracting the arrays and storing the arrays vertically so that each value is in its own row and for an embedded object in an array, the rule-based table generation comprises extracting arrays of complex objects and storing the arrays vertically so that each object is flattened;

    validating the one or more trunk tables according to constraints of the relational database system, the constraints comprising satisfying size requirements comprising a page size, length of table name, a number of columns and column name length associated with the relational database system, wherein the one or more trunk tables that exceed the size requirements for at least one of the page size, the number of columns, length of table name and column name length associated with the relational database system are split into additional trunk tables;

    generating one or more corresponding relational database system tables corresponding to the one or more trunk tables;

    generating a NoSQL trunk template based on the one or more trunk tables;

    filtering the NoSQL file to extract data corresponding to the NoSQL trunk template;

    loading the data filtered from the NoSQL file into the one or more corresponding relational database system table; and

    loading the data that is not filtered in the NoSQL file into an overflow table with respective key value pairs.

2. The method of claim 1, wherein the validating comprises:

    for each of the one or more trunk tables, determining whether a trunk table meets the constraints of the relational database system;

    responsive to determining that the trunk table meets the constraints, creating a relational database system table corresponding to the trunk table;

    responsive to determining that the trunk table does not meet the constraints, splitting the trunk table into a plurality of trunk tables based on the association relationships; and

    iterating the determining, creating and splitting until all of the plurality of trunk tables meet the constraints.

3. The method of claim 1, further comprising storing the one or more corresponding relational database system tables in a data warehouse.

4. A non-transitory computer readable storage medium storing a program of instructions executable by a machine to perform a method of automatically moving NoSQL data store to a relational database system, the method comprising:

    receiving NoSQL file stored in a storage device;

    discovering data structure schema of the NoSQL file, wherein the data structure schema of the NoSQL file that is discovered comprises objects and arrays, and objects and arrays embedded in one another;

    receiving one or more query plans associated with one or more records in the NoSQL file;

    determining attribute usage associated with the data structure schema and association relationships in the data structure schema based on the one or more query plans;

    determining trunk part of the data structure schema based on the attribute usage, the trunk part comprising attributes that occur in the query plan at least a defined threshold number of times;

    defining one or more trunk tables corresponding to the trunk part based on a rule-based table generation, wherein the rule-based table generation for an embedded object, comprises unfolding each attribute individually, adding a predefined character between names at each level, for the arrays, the rule-based table generation comprises, extracting the arrays and storing the arrays vertically so that each value is in its own row and for an embedded object in an array, the rule-based table generation comprises extracting arrays of complex objects and storing the arrays vertically so that each object is flattened;

    validating the one or more trunk tables according to constraints of the relational database system, the constraints comprising satisfying size requirements comprising a page size, length of table name, a number of columns and column name length associated with the relational database system, wherein the one or more trunk tables that exceed the size requirements for at least one of the page size, the number of columns, length of table name and column name length associated with the relational database system are split into additional trunk tables;

    generating one or more corresponding relational database system tables corresponding to the one or more trunk tables;

    generating a NoSQL trunk template based on the one or more trunk tables;

    filtering the NoSQL file to extract data corresponding to the NoSQL trunk template;

    loading the data filtered from the NoSQL file into the one or more corresponding relational database system table; and

    loading the data that is not filtered in the NoSQL file into an overflow table with respective key value pairs; and

    storing the one or more corresponding relational database system tables on a storage device.

5. The non-transitory computer readable storage medium of claim 4, further comprising storing the one or more corresponding relational database system table loaded with filtered data in a data warehouse.

6. The non-transitory computer readable storage medium of claim 4, wherein the validating comprises:

    for each of the one or more trunk tables, determining whether a trunk table meets the constraints of the relational database system;

    responsive to determining that the trunk table meets the constraints, creating a relational database system table corresponding to the trunk table;

    responsive to determining that the trunk table does not meet the constraints, splitting the trunk table into a plurality of trunk tables based on the association relationships; and

    iterating the determining, creating and splitting until all of the plurality of trunk tables meet the constraints.

7. A system of moving NoSQL data store to a relational database system tables, comprising:

    one or more hardware processors;

    one or more of the hardware processors operable to receive NoSQL file stored in a storage device, one or more of the hardware processors further operable to discover data structure schema of the NoSQL file, one or more of the hardware processors further operable to receive one or more query plans associated with one or more records in the NoSQL file, one or more of the hardware processors further operable to determine attribute usage associated with the data structure schema and association relationships in the data structure schema based on the one or more query plans, wherein the data structure schema of the NoSQL file comprises objects and arrays, and objects and arrays embedded in one another;

    one or more of the hardware
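The rule-based table generation recited in the independent claims (unfolding embedded objects with a predefined character between name levels, and storing arrays vertically) might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the claimed method itself: the underscore separator, the `id` key attribute, the `position` and `value` column names, and the function names `unfold` and `to_tables` are all hypothetical. Arrays nested inside array items are ignored here for brevity, and array item attributes named `id` or `position` would overwrite the bookkeeping columns.

```python
def unfold(obj, sep="_", prefix=""):
    """Unfold embedded objects into flat column names joined by `sep`.

    Returns (columns, arrays): scalar columns keyed by their unfolded name,
    and arrays keyed by their unfolded path for separate vertical storage.
    """
    columns, arrays = {}, {}
    for name, value in obj.items():
        path = f"{prefix}{sep}{name}" if prefix else name
        if isinstance(value, dict):
            sub_columns, sub_arrays = unfold(value, sep, path)
            columns.update(sub_columns)
            arrays.update(sub_arrays)
        elif isinstance(value, list):
            arrays[path] = value
        else:
            columns[path] = value
    return columns, arrays


def to_tables(record, key_field="id", sep="_"):
    """Turn one nested record into rows for a main table and one table per array."""
    columns, arrays = unfold(record, sep)
    key = columns.get(key_field)
    tables = {"main": [columns]}
    for path, items in arrays.items():
        rows = []
        for position, item in enumerate(items):
            # Each array element gets its own row, linked back by the key.
            row = {key_field: key, "position": position}
            if isinstance(item, dict):
                # Arrays of complex objects: flatten each object into columns.
                item_columns, _nested_arrays = unfold(item, sep)
                row.update(item_columns)
            else:
                row["value"] = item
            rows.append(row)
        tables[path] = rows
    return tables
```

For a record such as `{"id": 1, "name": {"first": "Ann"}, "tags": ["a", "b"]}`, the main table row carries the column `name_first`, and the `tags` table holds one row per array value.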

Assignees

IBM

Inventors

Classifications

  • G06F16/211 (Primary)

    Schema design and management · CPC title

  • Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title

  • Query rewriting; Transformation · CPC title

  • Relational databases · CPC title

  • Applying rules; Deductive queries · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9607063B1 cover?
Automatically moving NoSQL data store to a relational database system. Based on discovered data structure schema of a NoSQL file and query plans, attribute usage and association relationships may be determined. Trunk tables may be defined based on trunk part of the data structure schema determined based on the attribute usage. Trunk tables are validated and relational database tables are genera…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/211. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 28, 2017 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).