Platform, system, process for distributed graph databases and computing
US-2017364534-A1 · Dec 21, 2017 · US
US12386802B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12386802-B2 |
| Application number | US-202117225883-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 8, 2021 |
| Priority date | Oct 4, 2019 |
| Publication date | Aug 12, 2025 |
| Grant date | Aug 12, 2025 |
Official abstract text for this publication.
A resource dependency system may track data dependencies and data transformations for individual columns of the data sets over the span of the data pipeline (referred to as a provenance or lineage of a column). Column provenance/lineage can be logged using metadata or graph-like data structures, which the resource dependency system can generate, store, manage, and access. Column provenance/lineage can be used to generate user interfaces displaying visual node graphs with columns as nodes and the data dependencies and data transformations associated with the columns as edges between the nodes.
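The graph-like structure described in the abstract can be illustrated with a minimal sketch. The abstract does not specify a storage layout, so every name here (`Column`, `LineageGraph`, `add_dependency`, `to_node_graph`, and the sample datasets) is a hypothetical stand-in, not the patented implementation. Columns are nodes, and each edge records the transformation linking a source column to a target column.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    """A node in the lineage graph: one column of one dataset (hypothetical type)."""
    dataset: str
    name: str


@dataclass
class LineageGraph:
    """Columns as nodes; each edge carries a label for the transformation applied."""
    # Maps a source column to a list of (target column, transformation label).
    edges: dict = field(default_factory=dict)

    def add_dependency(self, source, target, transformation):
        """Record that `target` was derived from `source` by `transformation`."""
        self.edges.setdefault(source, []).append((target, transformation))

    def to_node_graph(self):
        """Return (nodes, edges) in a form a rendering layer could draw."""
        nodes = set()
        edge_list = []
        for source, targets in self.edges.items():
            nodes.add(source)
            for target, transformation in targets:
                nodes.add(target)
                edge_list.append((source, target, transformation))
        return sorted(nodes, key=lambda c: (c.dataset, c.name)), edge_list


# Illustrative data only: a two-step pipeline over made-up datasets.
graph = LineageGraph()
raw_price = Column("raw", "price")
clean_price = Column("clean", "price_usd")
report_tier = Column("report", "tier")
graph.add_dependency(raw_price, clean_price, "multiply by fx_rate")
graph.add_dependency(clean_price, report_tier, "bucket into tiers")
nodes, edges = graph.to_node_graph()
```

A user interface would then draw one node per entry in `nodes` and one arrow per entry in `edges`, labelled with the transformation.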
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method of processing and managing data set column lineage, comprising: by one or more processors executing program instructions: generating user interface data useable for rendering a first portion of a graphical user interface comprising representations of one or more columns corresponding to one or more datasets; receiving via the first portion of the graphical user interface a user selection of a representation of a selected column from among the representations of the one or more columns; in response to the user selection, determining one or more first target columns from one or more first target datasets based on accessing column metadata associated with the one or more first target columns, wherein the one or more first target columns are dependent on the selected column according to a data dependency of a column lineage, the data dependency being associated with a data transformation applied to the selected column, wherein the column metadata of the one or more first target columns indicates the data transformation was applied to the selected column to transform the selected column into the one or more first target columns, wherein the column metadata includes an indication of transformation code defining a set of instructions to apply the data transformation to the selected column, wherein the transformation code is versioned and stored in data storage, wherein a version of the transformation code references a version of the selected column to which the data transformation is applied to link the version of the transformation code with the version of the selected column; and updating the first portion of the graphical user interface to further comprise: representations of the one or more first target datasets; representations of the one or more first target columns shown in relation to associated first target datasets of the one or more first target datasets, wherein each representation of the one or more first target columns is different from each representation of the one or more first target datasets; an arrow or edge from the representation of the selected column to the representations of the one or more first target columns, the arrow or edge indicating the data dependency associated with the data transformation applied to the selected column; and the transformation code comprising software code in one or more programming languages, the transformation code being retrieved based on the indication included in the column metadata.

2. The computer-implemented method of claim 1, further comprising: determining one or more second target columns from one or more second target datasets, wherein the one or more second target columns are indirectly dependent on the selected column, wherein the first portion of the graphical user interface further comprises: representations of target datasets of the one or more second target datasets, wherein the representations of the target datasets of the one or more second target datasets appear on a first side of the selected column; representations of target columns of the one or more second target columns shown in relation to associated second target datasets; and for each representation of a target column of the one or more second target columns, an arrow or edge from the respective representation of the target column to a representation of a column from which the respective target column directly depends.

3. The computer-implemented method of claim 1, wherein the representations of the target datasets of the one or more first target datasets appear on a first side of the selected column.

4. The computer-implemented method of claim 3, further comprising: determining one or more first source columns from one or more first source datasets, wherein the selected column is dependent on the one or more first source columns, wherein the first portion of the graphical user interface further comprises: representations of source datasets of the one or more first source datasets, wherein the representations of source datasets of the one or more first source datasets appear on a second side of the selected column; representations of source columns of the one or more first source columns, wherein each representation of a source column appears within a corresponding representation of a source dataset of the one or more first source datasets that the respective source column is from; and an arrow or edge from the representations of the source columns of the one or more first source columns to the representation of the selected column.

5. The computer-implemented method of claim 4, further comprising: determining one or more second source columns from one or more second source datasets, wherein the selected column is indirectly dependent on the one or more second source columns, wherein the first portion of the graphical user interface further comprises: representations of source datasets of the one or more second source datasets, wherein the representations of the source datasets of the one or more second source datasets appear on the second side of the selected column; representations of source columns of the one or more second source columns, wherein each representation of each source column of the one or more second source columns appears within a corresponding representation of a source dataset that the respective source column is from; and for each representation of a source column of the one or more second source columns, an arrow or edge from the respective representation of the source column to a representation of a column that is directly dependent on the respective source column.

6. The computer-implemented method of claim 4, wherein determining the one or more first source columns involves accessing column metadata associated with the selected column.

7. A computing system configured for processing and managing data set column lineage, comprising: a computer readable storage medium having program instructions embodied therewith; and one or more processors configured to execute the program instructions to cause the computing system to: generate user interface data useable for rendering a first portion of a graphical user interface comprising representations of one or more columns corresponding to one or more datasets; receive via the first portion of the graphical user interface a user selection of a representation of a selected column from among the representations of the one or more columns; in response to the user selection, determine one or more first target columns from one or more first target datasets based on accessing column metadata associated with the one or more first target columns, wherein the one or more first target columns are dependent on the selected column according to a data dependency of a column lineage, the data dependency being associated with a data transformation applied to the selected column, wherein the column metadata of the one or more first target columns indicates the data transformation was applied to the selected column to transform the selected column into the one or more first target columns, wherein the column metadata includes an indication of transformation code defining a set of instructions to apply the data transformation to the selected column, wherein the transformation code is versioned and stored in data storage, wherein a version of the transformation code references a version of the selected column to which the data transformation is applied to link the version of the transformation code with the version of the selected column; and update the first portion of the graphical user interface to further comprise: representations of the one or more first target datasets; representations of the one or more first t
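
The lookups recited in claims 1, 2, and 6 can be sketched as follows. This is only an illustration under stated assumptions: the claims do not define a schema, so `ColumnRef`, `ColumnMetadata`, `CODE_STORE`, the function names, and the sample records are all hypothetical. Each metadata record belongs to a target column and names the version of the source column that was transformed, plus a versioned pointer to the transformation code.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRef:
    """A specific version of a column in a dataset (hypothetical type)."""
    dataset: str
    column: str
    version: int


@dataclass(frozen=True)
class ColumnMetadata:
    """Metadata attached to a target column (hypothetical layout)."""
    column: ColumnRef    # the target column this record describes
    source: ColumnRef    # the column version the transformation was applied to
    code_id: str         # indication of the transformation code
    code_version: int    # version of that code


# Hypothetical versioned code store keyed by (code_id, code_version).
CODE_STORE = {
    ("normalize_price", 2): "df['price_usd'] = df['price'] * df['fx_rate']",
    ("bucket_price", 1): "df['tier'] = pd.cut(df['price_usd'], bins=3)",
}


def direct_targets(selected, metadata):
    """Metadata records of columns derived directly from `selected`."""
    return [m for m in metadata if m.source == selected]


def indirect_targets(selected, metadata):
    """Columns reachable from `selected` in two or more steps (breadth-first)."""
    direct = [m.column for m in direct_targets(selected, metadata)]
    seen = set(direct) | {selected}
    queue = deque(direct)
    indirect = []
    while queue:
        current = queue.popleft()
        for m in direct_targets(current, metadata):
            if m.column not in seen:
                seen.add(m.column)
                indirect.append(m.column)
                queue.append(m.column)
    return indirect


def direct_sources(selected, metadata):
    """Source columns read from the metadata of `selected` itself."""
    return [m.source for m in metadata if m.column == selected]


def transformation_code(meta):
    """Retrieve the code text using the indication stored in the metadata."""
    return CODE_STORE[(meta.code_id, meta.code_version)]


# Illustrative records only.
price = ColumnRef("orders", "price", 3)
price_usd = ColumnRef("orders_clean", "price_usd", 5)
tier = ColumnRef("orders_report", "tier", 1)
METADATA = [
    ColumnMetadata(price_usd, price, "normalize_price", 2),
    ColumnMetadata(tier, price_usd, "bucket_price", 1),
]
```

In this sketch, selecting `price` yields `price_usd` as a first (direct) target and `tier` as a second (indirect) target, and each metadata record resolves to one specific version of the transformation code.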
Needs-based resource requirements planning or analysis · CPC title
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Browsing; Visualisation therefor (for navigating the web G06F16/954; browsing optimisation for the web G06F16/957) · CPC title
Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title
using geographical or spatial information, e.g. location (spatiotemporally dependent retrieval from the web G06F16/9537) · CPC title