Bootstrapping the data lake and glossaries with ‘dataset joins’ metadata from existing application patterns

US9959324B2 · US · B2

Patent metadata
Publication number: US-9959324-B2
Application number: US-201514669096-A
Country: US
Kind code: B2
Filing date: Mar 26, 2015
Priority date: Mar 26, 2015
Publication date: May 1, 2018
Grant date: May 1, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method to search for at least one relationship pattern in a plurality of runtime artifacts is provided. The method may include detecting at least one data manipulation statement in the plurality of runtime artifacts. The method may also include extracting at least one relationship clause from the detected at least one data manipulation statement. The method may further include parsing the extracted at least one relationship clause. The method may include generating at least one normalized syntax tree based on the parsed at least one relationship clause. The method may also include performing a classification and a snippet discovery on the generated at least one normalized syntax tree.
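
The first four steps of the abstract (detect, extract, parse, normalize) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the patented implementation: the regular expressions, the `EqNode` type, the function names, and the sample SQL are all hypothetical, and the "syntax tree" is reduced to a single equality node. The classification and snippet discovery step is left out because this page does not define it.

```python
import re
from collections import Counter
from typing import NamedTuple

# Hypothetical heuristics; the patent text does not prescribe these patterns.
DML_RE = re.compile(r"\b(?:SELECT|INSERT|UPDATE|DELETE|MERGE)\b[^;]*", re.IGNORECASE)
ON_RE = re.compile(r"\bON\s+([\w.]+)\s*=\s*([\w.]+)", re.IGNORECASE)


class EqNode(NamedTuple):
    """One-level stand-in for a normalized syntax tree: an equality predicate."""
    op: str
    left: str
    right: str


def detect_statements(artifact: str) -> list[str]:
    """Step 1: find data manipulation statements in raw artifact text."""
    return [m.group(0).strip() for m in DML_RE.finditer(artifact)]


def extract_clauses(statement: str) -> list[tuple[str, str]]:
    """Step 2: pull the operands of each `ON a = b` join condition."""
    return ON_RE.findall(statement)


def normalize(left: str, right: str) -> EqNode:
    """Steps 3-4: lowercase and order the operands so `a = b` and `b = a` match."""
    lo, hi = sorted((left.lower(), right.lower()))
    return EqNode("=", lo, hi)


# Sample artifact (invented): the same join written two different ways.
artifact = """
SELECT * FROM orders JOIN customers ON orders.cust_id = customers.id;
SELECT name FROM customers JOIN orders ON CUSTOMERS.ID = ORDERS.CUST_ID;
"""

# Count how often each normalized relationship occurs across the artifact.
trees = Counter(
    normalize(left, right)
    for stmt in detect_statements(artifact)
    for left, right in extract_clauses(stmt)
)
for tree, count in trees.items():
    print(count, tree)
```

Because both join conditions normalize to the same node, the two statements collapse into a single relationship with a count of 2. A real parser would also need to resolve table aliases and handle compound predicates, which this sketch ignores.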

First claim

Opening claim text (preview).

What is claimed is:

1. A method to search for at least one relationship pattern in a plurality of runtime artifacts, the method comprising: detecting at least one data manipulation statement in the plurality of artifacts; extracting at least one relationship clause from the detected at least one data manipulation statement; parsing the extracted at least one relationship clause; generating at least one normalized syntax tree based on the parsed at least one relationship clause; and performing a classification and a snippet discovery on the generated at least one normalized syntax tree.

2. The method of claim 1, further comprising: clustering a plurality of common relationship clauses according to a plurality of linkages that link a plurality of data sources together; and storing the plurality of common relationship clauses and the plurality of linkages.

3. The method of claim 2, wherein the plurality of common relationship clauses and the plurality of linkages are stored according to at least one of a class, a cluster or a syntax tree.

4. The method of claim 2, further comprising: proposing a list comprising the clustered common relationship clauses to a user for validation; and providing a search capability to retrieve at least one probable relationship between at least one of a plurality of data elements, a plurality of business terms, and a plurality of data elements and a plurality of business terms.

5. The method of claim 1, wherein the plurality of runtime artifacts is associated with a data source comprising at least one of an ETL, a database view, a database SQL procedure, a batch file, a plurality of reporting tool metadata, a metadata server, a program, and a script.

6. The method of claim 1, wherein performing a classification comprises classifying a plurality of common relationship clauses according to at least one of a uni-directional classification, a bi-directional classification, and a mapping classification.

7. The method of claim 2, wherein the clustering the plurality of common relationship clauses is performed using a plurality of analytic algorithms.

8. A computer system to search for at least one relationship pattern in a plurality of runtime artifacts, the computer system comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage devices, and program instructions stored on at least one of the one or more storage devices for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising: detecting at least one data manipulation statement in the plurality of runtime artifacts; extracting at least one relationship clause from the detected at least one data manipulation statement; parsing the extracted at least one relationship clause; generating at least one normalized syntax tree based on the parsed at least one relationship clause; and performing a classification and a snippet discovery on the generated at least one normalized syntax tree.

9. The computer system of claim 8, further comprising: clustering a plurality of common relationship clauses according to a plurality of linkages that link a plurality of data sources together; and storing the plurality of common relationship clauses and the plurality of linkages.

10. The computer system of claim 9, wherein the plurality of common relationship clauses and the plurality of linkages are stored according to at least one of a class, a cluster or a syntax tree.

11. The computer system of claim 9, further comprising: proposing a list comprising the clustered common relationship clauses to a user for validation; and providing a search capability to retrieve at least one probable relationship between at least one of a plurality of data elements, a plurality of business terms, and a plurality of data elements and a plurality of business terms.

12. The computer system of claim 8, wherein the plurality of runtimes is associated with a data source comprising at least one of an ETL, a database view, a database SQL procedure, a batch file, a plurality of reporting tool metadata, a metadata server, a program, and a script.

13. The computer system of claim 8, wherein performing a classification comprises classifying a plurality of common relationship clauses according to at least one of a uni-directional classification, a bi-directional classification, and a mapping classification.

14. The computer system of claim 9, wherein the clustering the plurality of common relationship clauses is performed using a plurality of analytic algorithms.

15. A computer program product, to search for at least one relationship pattern in a plurality of runtime artifacts, the computer program product comprising: one or more computer-readable storage devices and program instructions stored on at least one of the one or more tangible storage devices, the program instructions executable by a processor, the program instructions comprising: program instructions to detect at least one data manipulation statement in the plurality of runtime artifacts; program instructions to extract at least one relationship clause from the detected at least one data manipulation statement; program instructions to parse the extracted at least one relationship clause; program instructions to generate at least one normalized syntax tree based on the parsed at least one relationship clause; and program instructions to perform a classification and a snippet discovery on the generated at least one normalized syntax tree.

16. The computer program product of claim 15, further comprising: program instructions to cluster a plurality of common relationship clauses according to a plurality of linkages that link a plurality of data sources together; and program instructions to store the plurality of common relationship clauses and the plurality of linkages.

17. The computer program product of claim 16, wherein the plurality of common relationship clauses and the plurality of linkages are stored according to at least one of a class, a cluster or a syntax tree.

18. The computer program product of claim 16, further comprising: program instructions to propose a list comprising the clustered common relationship clauses to a user for validation; and program instructions to provide a search capability to retrieve at least one probable relationship between at least one of a plurality of data elements, a plurality of business terms, and a plurality of data elements and a plurality of business terms.

19. The computer program product of claim 15, wherein the plurality of runtime artifacts is associated with a data source comprising at least one of an ETL, a database view, a database SQL procedure, a batch file, a plurality of reporting tool metadata, a metadata server, a program, and a script.

20. The computer program product of claim 15, wherein performing a classification comprises classifying a plurality of common relationship clauses according to at least one of a uni-directional classification, a bi-directional classification, and a mapping classification.
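
The clustering and search steps recited in the dependent claims (clustering clauses by the linkages between data sources, then retrieving probable relationships for a data element) might look roughly like the sketch below. The claims mention "a plurality of analytic algorithms" without naming any, so plain grouping by table pair and frequency counting are swapped in here as stand-ins. The input data, `table_of`, `cluster_by_linkage`, and `search` are all hypothetical names invented for this example.

```python
from collections import Counter, defaultdict

# Hypothetical input: normalized join predicates as (left_column, right_column).
clauses = [
    ("customers.id", "orders.cust_id"),
    ("customers.id", "orders.cust_id"),
    ("orders.id", "shipments.order_id"),
    ("customers.id", "invoices.cust_id"),
]


def table_of(column: str) -> str:
    """Return the table part of a `table.column` reference."""
    return column.split(".", 1)[0]


def cluster_by_linkage(clauses):
    """Group clauses by the pair of tables they link, counting occurrences."""
    clusters = defaultdict(Counter)
    for left, right in clauses:
        linkage = tuple(sorted((table_of(left), table_of(right))))
        clusters[linkage][(left, right)] += 1
    return clusters


def search(clusters, element: str):
    """Return (clause, count) pairs that mention `element`, most frequent first."""
    hits = Counter()
    for counter in clusters.values():
        for clause, count in counter.items():
            if element in clause:
                hits[clause] += count
    return hits.most_common()


clusters = cluster_by_linkage(clauses)
for linkage, counter in clusters.items():
    print(linkage, dict(counter))
print(search(clusters, "customers.id"))
```

Ranking by frequency is one simple way to order the list proposed to a user for validation: a join that recurs across many artifacts is a more probable relationship than one seen once. Linking data elements to business terms would need a glossary mapping, which is not modeled here.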

Assignees

Inventors

Classifications

  • Query processing support for facilitating data mining operations in structured databases · CPC title

  • of sub-queries or views · CPC title

  • of query operations · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9959324B2 cover?
A method to search for at least one relationship pattern in a plurality of runtime artifacts is provided. The method may include detecting at least one data manipulation statement in the plurality of runtime artifacts. The method may also include extracting at least one relationship clause from the detected at least one data manipulation statement. The method may further include parsing the ext…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/2465. Mapped technology areas include Physics.
When was this patent published?
Publication date May 1, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).