Generation and graphical display of data transform provenance metadata
US-2020026790-A1 · Jan 23, 2020 · US
US12093279B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12093279-B2 |
| Application number | US-202318465089-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 11, 2023 |
| Priority date | Jun 22, 2017 |
| Publication date | Sep 17, 2024 |
| Grant date | Sep 17, 2024 |
Official abstract text for this publication.
A method comprises creating metadata identifying columns of tables and column operations of one or more data transforms of the columns in a data pipeline and including links to code segments in human-readable form corresponding to the one or more data transforms; executing a build job that effects the one or more data transforms on one or more datasets to generate one or more derived datasets; causing, after the executing, a presentation of a graphical user interface (GUI) including a graphical representation of the one or more data transforms based on the metadata, wherein the method is performed by one or more processors.
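The three steps in the abstract (create column-level metadata with links to code, run a build job, present the transforms from the metadata) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names, field names, the `repo/transforms.py#L10-L12` link, and the text-based `describe` function standing in for the GUI are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ColumnOperation:
    """Metadata for one column operation (all field names are hypothetical)."""
    op_id: str
    version: str
    inputs: tuple[str, ...]  # qualified input columns, "table.column"
    output: str              # qualified output column, "table.column"
    code_link: str           # link to the human-readable code segment


@dataclass(frozen=True)
class Transform:
    """A column operation paired with the function that carries it out."""
    metadata: ColumnOperation
    fn: Callable[[dict], object]  # computes the output value from one input row


def run_build(transforms: list[Transform], rows: list[dict]) -> list[dict]:
    """Stand-in for a build job: apply every transform to every input row."""
    derived = []
    for row in rows:
        out = {}
        for t in transforms:
            column_name = t.metadata.output.split(".", 1)[1]
            out[column_name] = t.fn(row)
        derived.append(out)
    return derived


def describe(transforms: list[Transform]) -> list[str]:
    """Text stand-in for the GUI: one line per column operation."""
    return [
        f"{', '.join(t.metadata.inputs)} -> {t.metadata.output} [{t.metadata.code_link}]"
        for t in transforms
    ]


# Hypothetical example: derive order_totals.total from two columns of orders.
transforms = [
    Transform(
        ColumnOperation(
            op_id="op-1",
            version="v1",
            inputs=("orders.price", "orders.qty"),
            output="order_totals.total",
            code_link="repo/transforms.py#L10-L12",  # assumed link format
        ),
        fn=lambda row: row["price"] * row["qty"],
    )
]
derived = run_build(transforms, [{"price": 2.0, "qty": 3}])
lines = describe(transforms)
```

Keeping the metadata separate from the function that performs the operation means the presentation step reads only the metadata, which matches the abstract's "based on the metadata" wording.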
Opening claim text (preview).
What is claimed is:
1. A method of presenting column provenance graphically, comprising: creating metadata identifying columns of tables and column operations of one or more data transforms of the columns in a data pipeline and including links to code segments in human-readable form corresponding to the one or more data transforms; executing a build job that effects the one or more data transforms on one or more datasets to generate one or more derived datasets; causing, after the executing, a presentation of a graphical user interface (GUI) including a graphical representation of the one or more data transforms based on the metadata, wherein the method is performed by one or more processors.
2. The method of claim 1, the metadata specifying identifiers, versions, inputs, and outputs of the column operations.
3. The method of claim 1, the graphical representation being a provenance graph including one or more nodes representing one or more columns of the columns and one or more edges representing one or more column relationships.
4. The method of claim 3, the provenance graph further including a node representing a join operation and edges connecting a node of the one or more nodes representing the one or more columns to the node representing the join operation.
5. The method of claim 1, further comprising causing a concurrent display of identifiers the tables and code snippets from the code segments.
6. The method of claim 1, the GUI allowing a user to select a table of the tables and view forward or backward relationships to other tables in the graphical representation.
7. The method of claim 1, the GUI allowing a user to select a column in the graphical representation and updating the graphical representation in terms of position or granularity in response to selecting the column.
8. The method of claim 7, further comprising retrieving source code corresponding to generation of the column via the metadata and causing the GUI to include the source code.
9. The method of claim 8, the GUI allowing the user to modify or delete the source code.
10. The method of claim 1, the GUI allowing a user to select a forward or backward control and view provenance of the columns before or after a particular transformation or a set of transformations.
11. One or more computer-readable non-transitory storage media storing instructions which, when executed by one or more processors, cause execution of a method of presenting column provenance graphically, the method comprising: creating metadata identifying columns of tables and column operations of one or more data transforms of the columns in a data pipeline and including links to code segments in human-readable form corresponding to the one or more data transforms; executing a build job that effects the one or more data transforms on one or more datasets to generate one or more derived datasets; causing, after the executing, a presentation of a graphical user interface (GUI) including a graphical representation of the one or more data transforms based on the metadata.
12. The one or more computer-readable non-transitory storage media of claim 11, the metadata specifying identifiers, versions, inputs, and outputs of the column operations.
13. The one or more computer-readable non-transitory storage media of claim 11, the graphical representation being a provenance graph including one or more nodes representing one or more columns of the columns and one or more edges representing one or more column relationships.
14. The one or more computer-readable non-transitory storage media of claim 13, the provenance graph further including a node representing a join operation and edges connecting a node of the one or more nodes representing the one or more columns to the node representing the join operation.
15. The one or more computer-readable non-transitory storage media of claim 11, further comprising causing a concurrent display of identifiers the tables and code snippets from the code segments.
16. The one or more computer-readable non-transitory storage media of claim 11, the GUI allowing a user to select a table of the tables and view forward or backward relationships to other tables in the graphical representation.
17. The one or more computer-readable non-transitory storage media of claim 11, the GUI allowing a user to select a column in the graphical representation and updating the graphical representation in terms of position or granularity in response to selecting the column.
18. The one or more computer-readable non-transitory storage media of claim 17, the method further comprising retrieving source code corresponding to generation of the column via the metadata and causing the GUI to include the source code.
19. The one or more computer-readable non-transitory storage media of claim 18, the GUI allowing the user to modify or delete the source code.
20. The one or more computer-readable non-transitory storage media of claim 11, the GUI allowing a user to select a forward or backward control and view provenance of the columns before or after a particular transformation or a set of transformations.
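The provenance graph of claims 3 and 4 and the forward or backward views of claims 6 and 10 might be modeled along these lines. This is only a sketch under assumptions: the `ProvenanceGraph` class, the `join:` prefix for join-operation nodes, and the table and column names are invented for illustration and do not come from the patent.

```python
from collections import defaultdict, deque


class ProvenanceGraph:
    """Directed graph whose nodes are columns or join operations (hypothetical model)."""

    def __init__(self):
        self.forward = defaultdict(set)   # node -> nodes derived from it
        self.backward = defaultdict(set)  # node -> nodes it was derived from

    def add_edge(self, src, dst):
        """Record that dst depends on src."""
        self.forward[src].add(dst)
        self.backward[dst].add(src)

    def reachable(self, start, direction="backward"):
        """Breadth-first walk of everything upstream or downstream of start."""
        adjacency = self.backward if direction == "backward" else self.forward
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            # .get avoids creating empty entries in the defaultdict while reading
            for neighbour in adjacency.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        return seen


graph = ProvenanceGraph()
# Two key columns feed a join node, which feeds a column of a derived table.
graph.add_edge("orders.customer_id", "join:orders_customers")
graph.add_edge("customers.id", "join:orders_customers")
graph.add_edge("join:orders_customers", "report.customer_id")
# A direct column-to-column relationship with no join involved.
graph.add_edge("orders.price", "report.total")

upstream = graph.reachable("report.customer_id", direction="backward")
downstream = graph.reachable("orders.price", direction="forward")
```

Storing both adjacency maps makes the backward view as cheap as the forward one, at the cost of recording each edge twice.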
Data format conversion from or to a database · CPC title
with details for data modelling support · CPC title
Tablespace storage structures; Management thereof · CPC title
Column-oriented storage; Management thereof · CPC title
Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title