Optimized menu planning
US-2015371164-A1 · Dec 24, 2015 · US
US9235652B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9235652-B1 |
| Application number | US-201313790763-A |
| Country | US |
| Kind code | B1 |
| Filing date | Mar 8, 2013 |
| Priority date | Jun 27, 2012 |
| Publication date | Jan 12, 2016 |
| Grant date | Jan 12, 2016 |
Official abstract text for this publication.
Embodiments of the present invention provide systems, methods and computer readable media for optimizing a data integration process. In embodiments, a system can be configured to represent the processing of a data record that includes attributes, and to use that representation to determine an optimal processing of that data record. In embodiments, the system represents the processing of a data record as an operator graph comprising nodes and edges, where each node is an operator node that represents an operator for implementing at least one logical operation on at least one of the attributes and each edge between a pair of nodes represents the movement of data between the nodes. In embodiments, each operator node includes one or more operator metrics (e.g. operator cost metrics and operator quality metrics). In embodiments, the system determines optimal processing of the data record by determining a best path within the operator graph.
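The operator graph described in the abstract can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the patented implementation: the `Operator` class, the operator names, the metric values, and the choice to score a path by summing operator quality metrics are all hypothetical, since the abstract does not say how a "best path" is scored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """An operator node carrying its operator metrics (hypothetical layout)."""
    name: str
    cost: float     # operator cost metric
    quality: float  # operator quality metric


# Hypothetical operators; names and metric values are invented for illustration.
operators = {
    "ingest": Operator("ingest", cost=1.0, quality=0.0),
    "clean_fast": Operator("clean_fast", cost=1.0, quality=0.4),
    "clean_manual": Operator("clean_manual", cost=5.0, quality=0.9),
    "publish": Operator("publish", cost=1.0, quality=0.0),
}

# Each edge represents data moving from one operator node to another.
edges = {
    "ingest": ["clean_fast", "clean_manual"],
    "clean_fast": ["publish"],
    "clean_manual": ["publish"],
}


def all_paths(edges, source, sink):
    """Enumerate every source-to-sink path in a directed acyclic graph."""
    if source == sink:
        return [[sink]]
    paths = []
    for successor in edges.get(source, []):
        for tail in all_paths(edges, successor, sink):
            paths.append([source] + tail)
    return paths


def path_quality(path):
    """Assumed scoring rule: sum of operator quality metrics along the path."""
    return sum(operators[name].quality for name in path)


candidate_paths = all_paths(edges, "ingest", "publish")
best_path = max(candidate_paths, key=path_quality)
print(best_path)
```

Exhaustive enumeration is workable only for small acyclic graphs; it is used here because it keeps the notion of "a subset of operator nodes between source and sink" explicit.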
Opening claim text (preview).
That which is claimed:

1. A computer-implemented method, comprising:
receiving a data record comprising attributes;
generating, for the data record, an operator graph comprising nodes and edges, wherein each node is an operator node that represents an operator for implementing at least one logical operation being performed on at least one of the attributes, wherein each operator node comprises operator metrics that include at least one of an operator cost metric and an operator quality metric, and wherein each edge between a pair of operator nodes corresponds to data movement between the operator nodes;
generating an optimized operator graph for the data record based on the operator graph, wherein the optimized operator graph comprises a source node that represents a first operator node that receives the data record as input to the optimized operator graph and a sink node that represents a second operator node that produces an output from the optimized operator graph; and
determining a best path from the source node to the sink node within the optimized operator graph, wherein the best path includes a subset of the operator nodes within the optimized operator graph, and wherein the determining the best path is based in part on at least one of the operator metrics associated with each respective operator node; and
in an instance in which the determining the best path further comprises using a cost budget associated with the data record,
receiving the cost budget associated with the data record;
receiving a mapping between each cost budget of a set of cost budgets and a best path respectively associated with each cost budget; and
determining the best path by selecting from the mapping the best path associated with the received cost budget; and
in an instance in which the received mapping is a hash table, generating the hash table by:
receiving the set of cost budgets;
determining a set of possible paths between the source node and the sink node within the optimized operator graph, wherein each possible path P includes a respective subset of the operator nodes;
calculating a respective quality of the path output Q(P) and a respective path cost C(P) for each possible path P of the set of possible paths, wherein the path cost C(P) is calculated using operator cost metrics respectively associated with each of the subset of the operator nodes; and
for each cost budget in the set of cost budgets,
selecting a subset of the possible paths between the source node and the sink node, wherein each possible path P in the selected subset has a calculated C(P) that is less than or equal to the cost budget;
selecting a best path from the subset of possible paths, wherein the best path has the maximum calculated Q(P) of the subset of possible paths; and
creating a hashtable entry associating the cost budget with the selected best path.

2. The method of claim 1, further comprising: orchestrating a data integration task for the data record based on the determined best path.

3. The method of claim 1, wherein the optimized operator graph is a directed line graph.

4. The method of claim 1, wherein the optimized operator graph is a directed acyclic graph.

5. The method of claim 1, wherein the mapping is a two-dimensional array of elements, wherein each element represents a cost budget and a respective best path calculated for the cost budget, and wherein generating the array comprises:
receiving the set of cost budgets;
for each cost budget in the set of cost budgets, calculating a best path P from the source node to the sink node by determining a subset of operator nodes for which the quality of the path output Q(P) is maximized and the path cost C(P) is less than or equal to the cost budget; and
assigning the cost budget and the best path P to a respective element in the array.

6. The method of claim 1, wherein an operator cost metric associated with an operator node is calculated based on consensus.

7. The method of claim 6, wherein the consensus is crowd sourcing.

8. The method of claim 1, wherein an attribute associated with the data record is a quality metric that is calculated based in part on the uniqueness of the data record.

9. The method of claim 1, wherein the quality metric attribute associated with the data record is calculated based in part on a quality metric respectively associated with at least one of the attributes.

10. The method of claim 9, wherein the quality metric respectively associated with the attribute is weighted.

11. A computer program product, stored on a non-transitory computer readable medium, comprising instructions that when executed on one or more computers cause the one or more computers to perform operations comprising:
receiving a data record comprising attributes;
generating, for the data record, an operator graph comprising nodes and edges, wherein each node is an operator node that represents an operator for implementing at least one logical operation being performed on at least one of the attributes, wherein each operator node comprises operator metrics that include at least one of an operator cost metric and an operator quality metric, and wherein each edge between a pair of operator nodes corresponds to data movement between the operator nodes;
generating an optimized operator graph for the data record based on the operator graph, wherein the optimized operator graph comprises a source node that represents a first operator node that receives the data record as input to the optimized operator graph and a sink node that represents a second operator node that produces an output from the optimized operator graph; and
determining a best path from the source node to the sink node within the optimized operator graph, wherein the best path includes a subset of the operator nodes within the optimized operator graph, and wherein the determining the best path is based in part on at least one of the operator metrics associated with each respective operator node; and
in an instance in which the determining the best path further comprises using a cost budget associated with the data record,
receiving the cost budget associated with the data record;
receiving a mapping between each cost budget of a set of cost budgets and a best path respectively associated with each cost budget; and
determining the best path by selecting from the mapping the best path associated with the received cost budget; and
in an instance in which the received mapping is a hash table, generating the hash table by:
receiving the set of cost budgets;
determining a set of possible paths between the source node and the sink node within the optimized operator graph, wherein each possible path P includes a respective subset of the operator nodes;
calculating a respective quality of the path output Q(P) and a respective path cost C(P) for each possible path P of the set of possible paths, wherein the path cost C(P) is calculated using operator cost metrics respectively associated with each of the subset of the operator nodes; and
for each cost budget in the set of cost budgets,
selecting a subset of the possible paths between the source node and the sink node, wherein each possible path P in the selected subset has a calculated C(P) that is less than or equal to the cost budget;
selecting a best path from the subset of possible paths, wherein the best path has the maximum calculated Q(P) of the subset of possible paths; and
creating a hashtable entry associating the cost budget with the selected best path.

12. The computer program product of claim 11, wherein the operations further comprise: orchestrating a data integration task for the data record
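
The hash table generation recited in claims 1 and 11 can be sketched roughly as follows. This is a minimal illustration, not the patentee's implementation. Several details are assumptions the claims leave open: C(P) and Q(P) are computed here as plain sums of per-operator metrics, a budget that no path satisfies maps to `None`, and all operator names, metric values, and the function name `build_budget_table` are hypothetical.

```python
# Hypothetical per-operator metrics (values invented for illustration).
operator_cost = {"ingest": 1, "clean_fast": 1, "clean_manual": 5, "publish": 1}
operator_quality = {"ingest": 0.0, "clean_fast": 0.4, "clean_manual": 0.9, "publish": 0.0}

# The set of possible source-to-sink paths, assumed to be already enumerated.
possible_paths = [
    ("ingest", "clean_fast", "publish"),
    ("ingest", "clean_manual", "publish"),
]


def path_cost(path):
    """C(P): assumed here to be the sum of operator cost metrics on the path."""
    return sum(operator_cost[name] for name in path)


def path_quality(path):
    """Q(P): assumed here to be the sum of operator quality metrics on the path."""
    return sum(operator_quality[name] for name in path)


def build_budget_table(budgets, paths):
    """Map each cost budget to the highest-quality path that fits within it."""
    # Score every possible path once, before looping over budgets.
    scored = [(path_quality(p), path_cost(p), p) for p in paths]
    table = {}
    for budget in budgets:
        # Keep only the paths whose cost C(P) does not exceed this budget.
        affordable = [entry for entry in scored if entry[1] <= budget]
        if affordable:
            # Among affordable paths, pick the one with the maximum Q(P).
            table[budget] = max(affordable, key=lambda entry: entry[0])[2]
        else:
            # Assumption: the claims do not say what to store in this case.
            table[budget] = None
    return table


budget_table = build_budget_table([2, 3, 7, 10], possible_paths)

# Lookup at run time: select the precomputed best path for a received budget.
received_budget = 7
print(budget_table[received_budget])
```

Precomputing the table moves the path search out of the per-record step, so handling a record with a known budget reduces to a single dictionary lookup.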
Physics · mapped topic
Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses · CPC title