NoSQL relational database (RDB) data movement

US9904694B2 · US · B2

Patent metadata
Publication number: US-9904694-B2
Application number: US-201715404264-A
Country: US
Kind code: B2
Filing date: Jan 12, 2017
Priority date: Dec 10, 2015
Publication date: Feb 27, 2018
Grant date: Feb 27, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Automatically moving NoSQL data store to a relational database system. Based on discovered data structure schema of a NoSQL file and query plans, attribute usage and association relationships may be determined. Trunk tables may be defined based on trunk part of the data structure schema determined based on the attribute usage. Trunk tables may be validated and relational database tables are generated that correspond to the trunk tables. NoSQL trunk template is generated based on the trunk tables. The relational database tables are loaded with data filtered from the NoSQL file according to the NoSQL trunk template.
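
To make the trunk idea concrete, here is a minimal sketch, not the patented implementation. It assumes query plans can be reduced to lists of referenced attribute names, that an attribute belongs to the trunk when it is referenced by at least `min_uses` plans, and that the trunk template is simply the list of trunk attribute names. The names `attribute_usage`, `trunk_attributes`, `split_record`, and the threshold rule are all hypothetical; the abstract only says the trunk is determined "based on the attribute usage". The overflow rows follow the first claim, which recites an overflow table holding the remaining data as key-value pairs.

```python
from collections import Counter


def attribute_usage(query_plans):
    """Count how many query plans reference each attribute.

    Assumption: each plan is modeled as a list of attribute names.
    """
    usage = Counter()
    for plan in query_plans:
        usage.update(set(plan))  # count an attribute once per plan
    return usage


def trunk_attributes(usage, min_uses):
    """Pick frequently used attributes as the trunk (hypothetical threshold rule)."""
    return sorted(name for name, count in usage.items() if count >= min_uses)


def split_record(record, template, key_field):
    """Filter one record by the trunk template.

    Returns the row for the relational trunk table and the leftover
    attributes as (record key, attribute name, value) overflow rows.
    """
    trunk_row = {name: record.get(name) for name in template}
    overflow_rows = [
        (record[key_field], name, value)
        for name, value in record.items()
        if name not in template
    ]
    return trunk_row, overflow_rows


if __name__ == "__main__":
    plans = [["id", "name"], ["id", "name", "city"], ["id"]]
    template = trunk_attributes(attribute_usage(plans), min_uses=2)
    record = {"id": 7, "name": "Ann", "city": "Oslo", "nick": "A"}
    print(template)
    print(split_record(record, template, key_field="id"))
```

In this sketch, rarely queried attributes such as `city` and `nick` stay out of the relational table and remain retrievable from the overflow rows.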

First claim

Opening claim text (preview).

We claim:

1. A computer-implemented method of automatically moving NoSQL data store to a relational database system, the method performed on one or more processors, comprising: receiving NoSQL file stored in a storage device; discovering data structure schema of the NoSQL file, wherein the data structure schema of the NoSQL file that is discovered comprises an object, an array, and a complex object comprising an embedded array of objects; receiving one or more query plans associated with one or more records in the NoSQL file; determining attribute usage associated with the data structure schema and association relationships in the data structure schema based on the one or more query plans; determining trunk part of the data structure schema based on the attribute usage; defining one or more trunk tables corresponding to the trunk part based on a rule-based table generation, wherein the rule-based table generation for the object comprises unfolding each attribute individually and adding a predefined character between names at each level, wherein the rule-based table generation for the array comprises extracting the array and storing the array vertically so that each value is in its own row, wherein the rule-based table generation for the complex object comprises extracting the embedded array of objects and storing the array of objects vertically so that each object is flattened; generating one or more corresponding relational database system tables corresponding to the one or more trunk tables; generating a NoSQL trunk template based on the one or more trunk tables; filtering the NoSQL file to extract data corresponding to the NoSQL trunk template; loading the data filtered from the NoSQL file into the one or more corresponding relational database system table; and loading the data that is not filtered in the NoSQL file into an overflow table with respective key value pairs.

2. The method of claim 1, wherein the validating comprises: for each of the one or more trunk tables, determining whether a trunk table meets the constraints of the relational database system; responsive to determining that the trunk table meets the constraints, creating a relational database system table corresponding to the trunk table; responsive to determining that the trunk table does not meet the constraints, splitting the trunk table into a plurality of trunk tables based on the association relationships; and iterating the determining, creating and splitting until all of the plurality of trunk tables meet the constraints.

3. The method of claim 1, further comprising storing the one or more corresponding relational database system tables in a data warehouse.

4. The method of claim 1, wherein the data structure schema of the NoSQL file that is discovered comprises objects and arrays.

5. The method of claim 4, wherein the data structure schema of the NoSQL file that is discovered comprises objects and arrays embedded in one another.

6. The method of claim 5, wherein for an embedded object, the rule-based table generation comprises unfolding each attribute individually, adding a predefined character between names at each level.

7. The method of claim 5, wherein for an embedded object in an array, the rule-based table generation comprises extracting arrays of complex objects and storing the arrays vertically so that each object is flattened.

8. The method of claim 5, wherein for the rule-based table generation comprises, for the arrays, extracting the arrays and storing the arrays vertically so that each value is in its own row.

9. A computer readable storage medium storing a program of instructions executable by a machine to perform a method of automatically moving NoSQL data store to a relational database system, the method comprising: receiving NoSQL file stored in a storage device; discovering data structure schema of the NoSQL file, wherein the data structure schema of the NoSQL file that is discovered comprises an object, an array, and a complex object comprising an embedded array of objects; receiving one or more query plans associated with one or more records in the NoSQL file; determining attribute usage associated with the data structure schema and association relationships in the data structure schema based on the one or more query plans; determining trunk part of the data structure schema based on the attribute usage; defining one or more trunk tables corresponding to the trunk part based on a rule-based table generation, wherein the rule-based table generation for the object comprises unfolding each attribute individually and adding a predefined character between names at each level, wherein the rule-based table generation for the array comprises extracting the array and storing the array vertically so that each value is in its own row, wherein the rule-based table generation for the complex object comprises extracting the embedded array of objects and storing the array of objects vertically so that each object is flattened; generating one or more corresponding relational database system tables corresponding to the one or more trunk tables; and storing the one or more corresponding relational database system tables on a storage device.

10. The computer readable storage medium of claim 9, further comprising: generating a NoSQL trunk template based on the one or more trunk tables; filtering the NoSQL file to extract data corresponding to the NoSQL trunk template; loading the data filtered from the NoSQL file into the one or more corresponding relational database system table; and loading the data that is not filtered in the NoSQL file into an overflow table with respective key value pairs.

11. The computer readable storage medium of claim 10, further comprising storing the one or more corresponding relational database system table loaded with filtered data in a data warehouse.

12. The computer readable storage medium of claim 10, wherein the validating comprises: for each of the one or more trunk tables, determining whether a trunk table meets the constraints of the relational database system; responsive to determining that the trunk table meets the constraints, creating a relational database system table corresponding to the trunk table; responsive to determining that the trunk table does not meet the constraints, splitting the trunk table into a plurality of trunk tables based on the association relationships; and iterating the determining, creating and splitting until all of the plurality of trunk tables meet the constraints.

13. The computer readable storage medium of claim 9, wherein the data structure schema of the NoSQL file that is discovered comprises objects and arrays.

14. The computer readable storage medium of claim 13, wherein the data structure schema of the NoSQL file that is discovered comprises objects and arrays embedded in one another.

15. The computer readable storage medium of claim 13, wherein for an embedded object, the rule-based table generation comprises unfolding each attribute individually, adding a predefined character between names at each level.

16. A system of moving NoSQL data store to a relational database system tables, comprising: one or more processors; one or more of the processors operable to receive NoSQL file stored in a storage device, one or more of the processors further operable to discover data structure schema of the NoSQL file, wherein the data structure schema of the NoSQL file that is discovered comprises an object, an array, and a complex object comprising an embedded array of objects, one or more o
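
The three rule-based table generation rules recited in claim 1 (unfold objects, store arrays vertically, flatten embedded arrays of objects) can be illustrated with a short sketch. This is one possible reading, not the patented implementation. It assumes JSON-like records held as Python dicts and lists, uses an underscore as the "predefined character", and adds `parent_key` and `position` columns to link child rows back to the parent record. Those column names and every function name below are hypothetical and do not come from the patent.

```python
def unfold_object(obj, sep="_", prefix=""):
    """Object rule: unfold each attribute, joining nested names with `sep`."""
    flat = {}
    for name, value in obj.items():
        column = f"{prefix}{sep}{name}" if prefix else name
        if isinstance(value, dict):
            flat.update(unfold_object(value, sep, column))
        else:
            flat[column] = value
    return flat


def array_rows(parent_key, values):
    """Array rule: store the array vertically, one row per value."""
    return [
        {"parent_key": parent_key, "position": i, "value": v}
        for i, v in enumerate(values)
    ]


def object_array_rows(parent_key, objects, sep="_"):
    """Complex-object rule: one row per embedded object, each object flattened."""
    return [
        {"parent_key": parent_key, "position": i, **unfold_object(o, sep)}
        for i, o in enumerate(objects)
    ]


def to_tables(record, key_field, sep="_"):
    """Split one record into a main row plus child rows keyed by attribute name."""
    key = record[key_field]
    main, children = {}, {}
    for name, value in record.items():
        if isinstance(value, list):
            if value and all(isinstance(v, dict) for v in value):
                children[name] = object_array_rows(key, value, sep)
            else:
                children[name] = array_rows(key, value)
        elif isinstance(value, dict):
            main.update(unfold_object(value, sep, name))
        else:
            main[name] = value
    return main, children


if __name__ == "__main__":
    record = {
        "id": 1,
        "address": {"city": "Oslo", "geo": {"lat": 59.9}},
        "tags": ["a", "b"],
        "orders": [{"sku": "x", "price": {"amount": 5}}],
    }
    main_row, child_tables = to_tables(record, key_field="id")
    print(main_row)
    print(child_tables)
```

For brevity the sketch only handles arrays at the top level of a record; arrays nested inside embedded objects would need the same rules applied recursively.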

Assignees

Inventors

Classifications

  • G06F16/211 · Primary

    Schema design and management · CPC title

  • Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title

  • Query rewriting; Transformation · CPC title

  • Relational databases · CPC title

  • Applying rules; Deductive queries · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9904694B2 cover?
Automatically moving NoSQL data store to a relational database system. Based on discovered data structure schema of a NoSQL file and query plans, attribute usage and association relationships may be determined. Trunk tables may be defined based on trunk part of the data structure schema determined based on the attribute usage. Trunk tables may be validated and relational database tables are gen…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/211. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 27, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).