Generation and graphical display of data transform provenance metadata

US12093279B2 · US · B2

Patent metadata
Publication number: US-12093279-B2
Application number: US-202318465089-A
Country: US
Kind code: B2
Filing date: Sep 11, 2023
Priority date: Jun 22, 2017
Publication date: Sep 17, 2024
Grant date: Sep 17, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method comprises creating metadata identifying columns of tables and column operations of one or more data transforms of the columns in a data pipeline and including links to code segments in human-readable form corresponding to the one or more data transforms; executing a build job that effects the one or more data transforms on one or more datasets to generate one or more derived datasets; causing, after the executing, a presentation of a graphical user interface (GUI) including a graphical representation of the one or more data transforms based on the metadata, wherein the method is performed by one or more processors.
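The abstract's first two steps (create column-level metadata with links to code, then run a build job that produces derived datasets) can be illustrated with a minimal sketch. This is not the patented implementation: the names `ColumnOperation`, `Transform`, `run_build`, and `add_total`, the table and column names, and the `code_link` value are all hypothetical, chosen only to show one plausible shape for such metadata.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ColumnOperation:
    """Hypothetical metadata record for one column operation."""
    op_id: str
    version: str
    inputs: tuple    # (table, column) pairs the operation reads
    outputs: tuple   # (table, column) pairs the operation writes
    code_link: str   # assumed link to the human-readable code segment


@dataclass(frozen=True)
class Transform:
    """A transform plus the metadata declared for it before the build."""
    input_table: str
    output_table: str
    fn: Callable
    operations: tuple


def add_total(rows):
    # The transform itself: derive a "total" column from "price" and "qty".
    return [{**r, "total": r["price"] * r["qty"]} for r in rows]


TOTAL_OP = ColumnOperation(
    op_id="compute_total",
    version="1",
    inputs=(("orders", "price"), ("orders", "qty")),
    outputs=(("orders_enriched", "total"),),
    code_link="transforms/orders.py#add_total",  # illustrative only
)


def run_build(datasets, transforms):
    """Apply each transform in order and collect its declared metadata."""
    derived = dict(datasets)
    metadata = []
    for t in transforms:
        derived[t.output_table] = t.fn(derived[t.input_table])
        metadata.extend(t.operations)
    return derived, metadata


datasets = {"orders": [{"price": 2.0, "qty": 3}]}
pipeline = [Transform("orders", "orders_enriched", add_total, (TOTAL_OP,))]
derived, metadata = run_build(datasets, pipeline)
```

After the build, `metadata` holds the records a user interface could draw from to render the transforms graphically, which is the third step the abstract describes.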

First claim

Opening claim text (preview).

What is claimed is:

1. A method of presenting column provenance graphically, comprising: creating metadata identifying columns of tables and column operations of one or more data transforms of the columns in a data pipeline and including links to code segments in human-readable form corresponding to the one or more data transforms; executing a build job that effects the one or more data transforms on one or more datasets to generate one or more derived datasets; causing, after the executing, a presentation of a graphical user interface (GUI) including a graphical representation of the one or more data transforms based on the metadata, wherein the method is performed by one or more processors.

2. The method of claim 1, the metadata specifying identifiers, versions, inputs, and outputs of the column operations.

3. The method of claim 1, the graphical representation being a provenance graph including one or more nodes representing one or more columns of the columns and one or more edges representing one or more column relationships.

4. The method of claim 3, the provenance graph further including a node representing a join operation and edges connecting a node of the one or more nodes representing the one or more columns to the node representing the join operation.

5. The method of claim 1, further comprising causing a concurrent display of identifiers the tables and code snippets from the code segments.

6. The method of claim 1, the GUI allowing a user to select a table of the tables and view forward or backward relationships to other tables in the graphical representation.

7. The method of claim 1, the GUI allowing a user to select a column in the graphical representation and updating the graphical representation in terms of position or granularity in response to selecting the column.

8. The method of claim 7, further comprising retrieving source code corresponding to generation of the column via the metadata and causing the GUI to include the source code.

9. The method of claim 8, the GUI allowing the user to modify or delete the source code.

10. The method of claim 1, the GUI allowing a user to select a forward or backward control and view provenance of the columns before or after a particular transformation or a set of transformations.

11. One or more computer-readable non-transitory storage media storing instructions which, when executed by one or more processors, cause execution of a method of presenting column provenance graphically, the method comprising: creating metadata identifying columns of tables and column operations of one or more data transforms of the columns in a data pipeline and including links to code segments in human-readable form corresponding to the one or more data transforms; executing a build job that effects the one or more data transforms on one or more datasets to generate one or more derived datasets; causing, after the executing, a presentation of a graphical user interface (GUI) including a graphical representation of the one or more data transforms based on the metadata.

12. The one or more computer-readable non-transitory storage media of claim 11, the metadata specifying identifiers, versions, inputs, and outputs of the column operations.

13. The one or more computer-readable non-transitory storage media of claim 11, the graphical representation being a provenance graph including one or more nodes representing one or more columns of the columns and one or more edges representing one or more column relationships.

14. The one or more computer-readable non-transitory storage media of claim 13, the provenance graph further including a node representing a join operation and edges connecting a node of the one or more nodes representing the one or more columns to the node representing the join operation.

15. The one or more computer-readable non-transitory storage media of claim 11, further comprising causing a concurrent display of identifiers the tables and code snippets from the code segments.

16. The one or more computer-readable non-transitory storage media of claim 11, the GUI allowing a user to select a table of the tables and view forward or backward relationships to other tables in the graphical representation.

17. The one or more computer-readable non-transitory storage media of claim 11, the GUI allowing a user to select a column in the graphical representation and updating the graphical representation in terms of position or granularity in response to selecting the column.

18. The one or more computer-readable non-transitory storage media of claim 17, the method further comprising retrieving source code corresponding to generation of the column via the metadata and causing the GUI to include the source code.

19. The one or more computer-readable non-transitory storage media of claim 18, the GUI allowing the user to modify or delete the source code.

20. The one or more computer-readable non-transitory storage media of claim 11, the GUI allowing a user to select a forward or backward control and view provenance of the columns before or after a particular transformation or a set of transformations.
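The provenance graph recited in the claims (column nodes, relationship edges, a node for a join operation, and forward or backward viewing) can be sketched as a small directed graph. This is an illustration under stated assumptions, not the method the patent discloses: the class `ProvenanceGraph`, the string node labels, the `join:` prefix, and the breadth-first traversal are all hypothetical choices.

```python
from collections import defaultdict, deque


class ProvenanceGraph:
    """Directed graph whose nodes are columns or operations (hypothetical)."""

    def __init__(self):
        self.forward = defaultdict(set)   # node -> nodes derived from it
        self.backward = defaultdict(set)  # node -> nodes it is derived from

    def add_edge(self, src, dst):
        self.forward[src].add(dst)
        self.backward[dst].add(src)

    def reachable(self, start, direction="backward"):
        """Breadth-first walk upstream ("backward") or downstream ("forward")."""
        adjacency = self.backward if direction == "backward" else self.forward
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return seen


graph = ProvenanceGraph()
# Two key columns feed a join node; the join node feeds a derived column.
graph.add_edge("orders.customer_id", "join:orders_customers")
graph.add_edge("customers.id", "join:orders_customers")
graph.add_edge("join:orders_customers", "report.customer_id")
# A plain column-to-column relationship with no join involved.
graph.add_edge("customers.name", "report.customer_name")

upstream = graph.reachable("report.customer_id", "backward")
downstream = graph.reachable("customers.id", "forward")
```

Here `upstream` contains the join node and both key columns, while `downstream` contains the join node and `report.customer_id`. Keeping separate forward and backward adjacency maps makes either viewing direction a single traversal, at the cost of storing each edge twice.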

Assignees

Palantir Technologies Inc

Inventors

Classifications

  • Data format conversion from or to a database

  • with details for data modelling support

  • Tablespace storage structures; Management thereof

  • Column-oriented storage; Management thereof

  • Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12093279B2 cover?
A method comprises creating metadata identifying columns of tables and column operations of one or more data transforms of the columns in a data pipeline and including links to code segments in human-readable form corresponding to the one or more data transforms; executing a build job that effects the one or more data transforms on one or more datasets to generate one or more derived datasets; …
Who is the assignee on this patent?
Palantir Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/26. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 17, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).