Optimizing execution of data transformation flows

US10242079B2 · US · B2

Patent metadata
Publication number: US-10242079-B2
Application number: US-201715701381-A
Country: US
Kind code: B2
Filing date: Sep 11, 2017
Priority date: Nov 7, 2016
Publication date: Mar 26, 2019
Grant date: Mar 26, 2019

Abstract

Official abstract text for this publication.

A computer system transforms data. The system displays a user interface including a data flow pane. A user builds a flow diagram in the data flow pane. Each node in the flow diagram specifies an operation: to retrieve data, to transform data, or to create an output dataset. The flow diagram includes a subtree having a data source node and transformation operation nodes. When the user initiates execution and the nodes in the subtree are configured to execute imperatively, the system performs the operations in the subtree sequentially as specified, retrieving data from the data source, transforming the data, and forming an intermediate dataset. When the user initiates execution and the nodes in the subtree are configured to execute declaratively, the system constructs a database query that is logically equivalent to the operations specified in the subtree and transmits the query to the data source to retrieve the intermediate dataset.
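The two execution modes described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the node classes (`Source`, `Filter`, `Project`), the helper names, and the `orders` table are hypothetical, SQLite stands in for the data source, and the subtree is assumed to be a simple chain in which filters come before the projection.

```python
import sqlite3
from dataclasses import dataclass


# Hypothetical node types for a linear subtree: one source, then transformations.
@dataclass
class Source:
    table: str


@dataclass
class Filter:
    column: str
    min_value: float


@dataclass
class Project:
    columns: list


def run_imperative(conn, nodes):
    """Perform each node's operation in order on the client side."""
    source, *transforms = nodes
    cur = conn.execute(f"SELECT * FROM {source.table}")  # retrieve all rows
    names = [d[0] for d in cur.description]
    rows = [dict(zip(names, r)) for r in cur.fetchall()]
    for node in transforms:
        if isinstance(node, Filter):
            rows = [r for r in rows if r[node.column] >= node.min_value]
        elif isinstance(node, Project):
            rows = [{c: r[c] for c in node.columns} for r in rows]
    return [tuple(r.values()) for r in rows]


def build_query(nodes):
    """Fold the same nodes into one SQL statement (assumes filters precede projection)."""
    source, *transforms = nodes
    columns, where, params = "*", [], []
    for node in transforms:
        if isinstance(node, Filter):
            where.append(f"{node.column} >= ?")
            params.append(node.min_value)
        elif isinstance(node, Project):
            columns = ", ".join(node.columns)
    sql = f"SELECT {columns} FROM {source.table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


def run_declarative(conn, nodes):
    """Send the equivalent query to the data source and return its result."""
    sql, params = build_query(nodes)
    return conn.execute(sql, params).fetchall()


def demo_connection():
    """Build a small in-memory data source for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "east", 50.0), (2, "west", 150.0), (3, "east", 300.0)],
    )
    return conn
```

With `flow = [Source("orders"), Filter("amount", 100.0), Project(["id", "region"])]`, both `run_imperative` and `run_declarative` return the same rows. In the imperative case every row crosses to the client before filtering; in the declarative case the data source applies the filter and projection itself.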

First claim

Opening claim text (preview).

What is claimed is:

1. A computer system for transforming data, comprising: one or more processors; memory; and one or more programs stored in the memory and configured for execution by the one or more processors, the one or more programs comprising instructions for: displaying a user interface that includes a data flow pane and a data pane; receiving first user input to build a node/link data transformation flow diagram in the data flow pane, wherein each node in the flow diagram specifies a respective operation to retrieve data from a respective data source, specifies a respective operation to transform data, or specifies a respective operation to create a respective output data set, and wherein the flow diagram includes a subtree having one or more data source nodes that retrieve data from a first data source and one or more transformation operation nodes; receiving a second user input to execute at least the subtree; in accordance with the second user input and a determination that the nodes in the subtree are configured to execute imperatively: performing the operations of the nodes in the subtree sequentially as specified by links in the subtree, thereby retrieving data from the first data source, transforming the retrieved data, and forming a first intermediate data set; and displaying the first intermediate data set in the data pane; receiving third user input to configure the nodes in the subtree to execute declaratively; receiving a fourth user input to execute at least the subtree; and in accordance with the fourth user input and a determination that the nodes in the subtree are configured to execute declaratively: constructing a database query that is logically equivalent to the operations specified by the nodes in the subtree; transmitting the database query to the first data source to retrieve a second intermediate data set from the first data source according to the database query; and displaying the second intermediate data set in the data pane.

2. The computer system of claim 1, wherein the one or more programs further comprise instructions for storing the first intermediate data set and the second intermediate data set.

3. The computer system of claim 1, wherein the subtree is the entire flow diagram.

4. The computer system of claim 1, wherein: one of the transformation operation nodes specifies a filter transformation operation to filter rows of data received by the one node; performing the operations of the nodes in the subtree sequentially includes performing the filter transformation operation at the computer system to filter out received rows of data; and retrieving the second intermediate data set from the first data source according to the database query includes applying the filter operation at a remote server hosting the first data source.

5. The computer system of claim 1, wherein: one of the transformation operation nodes specifies a join transformation operation that joins two sets of data from the first data source; performing the operations of the nodes in the subtree sequentially includes performing the join transformation operation to combine the two sets of data at the computer system; and retrieving the second intermediate data set from the first data source according to the database query includes applying the join operation at a remote server hosting the first data source.

6. The computer system of claim 1, wherein the database query is written in SQL.

7. The computer system of claim 1, wherein: the flow diagram includes a portion not included in the subtree; the portion is configured to execute imperatively; the fourth user input specifies execution of the entire flow diagram; and executing the flow diagram includes performing the operations of the nodes in the portion sequentially as specified by links in the portion, thereby accessing the second intermediate data set, transforming the second intermediate data set, and forming a final data set.

8. A method of transforming data, comprising: at a computer system having a display, one or more processors, and memory storing one or more programs configured for execution by the one or more processors: displaying a user interface that includes a data flow pane and a data pane; receiving first user input to build a node/link data transformation flow diagram in the data flow pane, wherein each node in the flow diagram specifies a respective operation to retrieve data from a respective data source, specifies a respective operation to transform data, or specifies a respective operation to create a respective output data set, and wherein the flow diagram includes a subtree having one or more data source nodes that retrieve data from a first data source and one or more transformation operation nodes; receiving a second user input to execute at least the subtree; in accordance with the second user input and a determination that the nodes in the subtree are configured to execute imperatively: performing the operations of the nodes in the subtree sequentially as specified by links in the subtree, thereby retrieving data from the first data source, transforming the retrieved data, and forming a first intermediate data set; and displaying the first intermediate data set in the data pane; receiving third user input to configure the nodes in the subtree to execute declaratively; receiving a fourth user input to execute at least the subtree; and in accordance with the fourth user input and a determination that the nodes in the subtree are configured to execute declaratively: constructing a database query that is logically equivalent to the operations specified by the nodes in the subtree; transmitting the database query to the first data source to retrieve a second intermediate data set from the first data source according to the database query; and displaying the second intermediate data set in the data pane.

9. The method of claim 8, further comprising storing the first intermediate data set and the second intermediate data set.

10. The method of claim 8, wherein the subtree is the entire flow diagram.

11. The method of claim 8, wherein: one of the transformation operation nodes specifies a filter transformation operation to filter rows of data received by the one node; performing the operations of the nodes in the subtree sequentially includes performing the filter transformation operation at the computer system to filter out received rows of data; and retrieving the second intermediate data set from the first data source according to the database query includes applying the filter operation at a remote server hosting the first data source.

12. The method of claim 8, wherein: one of the transformation operation nodes specifies a join transformation operation that joins two sets of data from the first data source; performing the operations of the nodes in the subtree sequentially includes performing the join transformation operation to combine the two sets of data at the computer system; and retrieving the second intermediate data set from the first data source according to the database query includes applying the join operation at a remote server hosting the first data source.

13. The method of claim 8, wherein the database query is written in SQL.

14. The method of claim 8, wherein: the flow diagram includes a portion not included in the subtree; the portion is configured to execute imperatively; the fourth user input specifies execution of the entire flow diagram; and executing the flow diagram includes performing the operations of the nodes in the portion sequentially as specif
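The join case (claims 5 and 12) and the mixed-mode case (claims 7 and 14) can be sketched together. This is an illustrative example only, not the claimed system: the `customers` and `orders` tables, the function names, and the use of SQLite as the data source are all assumptions. The subtree performs a join, either on the client or at the data source, and a downstream imperative portion then turns the intermediate data set into a final data set.

```python
import sqlite3

# Hypothetical query that is logically equivalent to the join subtree.
JOIN_SQL = """
    SELECT c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
"""


def join_imperative(conn):
    """Retrieve both data sets, then combine them at the computer system."""
    customers = conn.execute("SELECT id, name FROM customers").fetchall()
    orders = conn.execute("SELECT customer_id, amount FROM orders").fetchall()
    name_by_id = {cid: name for cid, name in customers}
    # Inner join: orders without a matching customer are dropped.
    return [(name_by_id[cid], amount) for cid, amount in orders if cid in name_by_id]


def join_declarative(conn):
    """Let the data source apply the join and return the intermediate data set."""
    return conn.execute(JOIN_SQL).fetchall()


def total_per_customer(intermediate):
    """Downstream imperative portion: aggregate the intermediate data set locally."""
    totals = {}
    for name, amount in intermediate:
        totals[name] = totals.get(name, 0.0) + amount
    return totals


def demo_source():
    """Build a small in-memory data source for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Bo"), (3, "Cy")]
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10.0), (1, 20.0), (2, 5.0), (9, 7.0)],
    )
    return conn
```

Calling `total_per_customer(join_declarative(demo_source()))` runs the subtree at the data source and the remaining portion on the client. Swapping in `join_imperative` yields the same final data set, with the difference that both tables are transferred in full before the join.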

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10242079B2 cover?
A computer system transforms data. The system displays a user interface including a data flow pane. A user builds a flow diagram in the data flow pane. Each node in the flow diagram specifies an operation: to retrieve data, to transform data, or to create an output dataset. The flow diagram includes a subtree having a data source node and transformation operation nodes. When the user initiates …
Who is the assignee on this patent?
Tableau Software Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/258. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 26, 2019 (B2). Legal status and post-grant events are not shown on this page.