Finding optimal query plans

US9785673B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9785673-B2
Application numberUS-201615196237-A
CountryUS
Kind codeB2
Filing dateJun 29, 2016
Priority dateNov 25, 2013
Publication dateOct 10, 2017
Grant dateOct 10, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for optimizing a query, and more particularly, systems and methods for finding optimal plans for graph queries by casting the task of finding the optimal plan as an integer programming (ILP) problem. A method for optimizing a query, comprises building a data structure for a query, the data structure including a plurality of components, wherein each of the plurality of components corresponds to at least one graph pattern, determining a plurality of flows of query variables between the plurality of components, and determining a combination of the plurality of flows between the plurality of components that results in a minimum cost to execute the query.

First claim

Opening claim text (preview).

What is claimed is: 1. A system for optimizing a query, comprising: a memory, and at least one processor operatively coupled to the memory; a construction module executed via the at least one processor and capable of building a data structure for a query, the data structure including a plurality of components, wherein each of the plurality of components corresponds to at least one graph pattern; a flow module executed via the at least one processor and capable of determining a plurality of flows of query variables between the plurality of components; a constraint module executed via the at least one processor and capable of generating one or more constraints to dynamically eliminate invalid flows from the plurality of flows of query variables, wherein a flow is determined to be invalid if the flow would violate semantics of one or more control statements in the query; wherein the one or more constraints are expressed as a function of decision variables and comprise one or more of at least one component constraint enforcing semantics of an external view of the plurality of components, at least one graph constraint enforcing semantics of the plurality of flows of query variables, and at least one predecessor constraint enforcing semantics of one or more potential predecessors; and a cost determination and function module executed via the at least one processor and capable of: formulating a cost function associated with the plurality of flows; and outputting a query plan based on the cost function, wherein outputting the query plan comprises determining a combination of valid flows that results in a minimum cost under the one or more constraints. 2. The system of claim 1 , wherein the query is composed of a set of hierarchically nested graph patterns, and wherein the query is flattened to eliminate unnecessary syntactic nesting. 3. The system of claim 1 , wherein the query plan is output as a solution to an optimization problem, and the system further comprises a solver to determine the solution to the optimization problem. 4. The system of claim 3 , wherein the optimization problem is a linear optimization problem, and wherein the solver is an integral linear programming (ILP) solver. 5. The system of claim 1 , wherein the graph pattern comprises at least one of a triple pattern and an operation on more than one triple pattern. 6. The system of claim 1 , wherein a first component of the plurality of components comprises at least one proxy component representing a second component of the plurality of components that is connected to a query variable input of the first component. 7. The system of claim 1 , wherein the plurality of components represent respective access methods for the graph pattern, and wherein each access method is associated with a respective cost. 8. A method for optimizing a query, comprising: building a data structure for a query, the data structure including a plurality of components, wherein each of the plurality of components corresponds to at least one graph pattern; determining a plurality of flows of query variables between the plurality of components; generating one or more constraints to dynamically eliminate invalid flows from the plurality of flows of query variables, wherein a flow is determined to be invalid if the flow would violate semantics of one or more control statements in the query; wherein the one or more constraints are expressed as a function of decision variables and comprise one or more of at least one component constraint enforcing semantics of an external view of the plurality of components, at least one graph constraint enforcing semantics of the plurality of flows of query variables, and at least one predecessor constraint enforcing semantics of one or more potential predecessors; formulating a cost function associated with the plurality of flows; and outputting a query plan based on the cost function, wherein outputting the query plan comprises determining a combination of the plurality of flows that results in a minimum cost under the one or more constraints. 9. The method of claim 8 , wherein the query is composed of a set of hierarchically nested graph patterns, and wherein the query is flattened to eliminate unnecessary syntactic nesting. 10. The method of claim 8 , wherein the query plan is output as a solution to an optimization problem, and the method further comprises using a solver to determine the solution to the optimization problem. 11. The method of claim 10 , wherein the optimization problem is a linear optimization problem, and wherein the solver is an integral linear programming (ILP) solver. 12. The method of claim 8 , wherein the graph pattern comprises at least one of a triple pattern and an operation on more than one triple pattern. 13. The method of claim 8 , wherein a component of the plurality of components represents an access method for the graph pattern. 14. The method of claim 8 , wherein a first component of the plurality of components comprises at least one proxy component representing a second component of the plurality of components that is connected to a query variable input of the first component. 15. The method of claim 8 , wherein the plurality of components represent respective access methods for the graph pattern, and wherein each access method is associated with a respective cost. 16. The method of claim 8 , wherein determining the combination of the plurality of flows that results in the minimum cost is performed within a predetermined time limit. 17. The method of claim 8 , wherein the one or more constraints further comprise at least one output pin constraint controlling an activation of output pins of the plurality of components. 18. The method of claim 8 , wherein the one or more constraints further comprise at least one component-type specific constraint applicable to components of a specific type. 19. An article of manufacture comprising a non-transitory computer readable storage medium comprising program code tangibly embodied thereon, which when executed by a computer, performs method steps for optimizing a query, the method steps comprising: building a data structure for a query, the data structure including a plurality of components, wherein each of the plurality of components corresponds to at least one graph pattern; determining a plurality of flows of query variables between the plurality of components; generating one or more constraints to dynamically eliminate invalid flows from the plurality of flows of query variables, wherein a flow is determined to be invalid if the flow would violate semantics of one or more control statements in the query; wherein the one or more constraints are expressed as a function of decision variables and comprise one or more of at least one component constraint enforcing semantics of an external view of the plurality of components, at least one graph constraint enforcing semantics of the plurality of flows of query variables, and at least one predecessor constraint enforcing semantics of one or more potential predecessors; formulating a cost function associated with the plurality of flows; and outputting a query plan based on the cost function, wherein outputting the query plan comprises determining comprises determining a combination of the plurality of flows that results in a minimum cost under the one or more constraints. 20. The article of manufacture of claim 19 , wherein the query is composed of a set of hierarchically nested graph patterns, and wherein the query is f

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9785673B2 cover?
Systems and methods for optimizing a query, and more particularly, systems and methods for finding optimal plans for graph queries by casting the task of finding the optimal plan as an integer programming (ILP) problem. A method for optimizing a query, comprises building a data structure for a query, the data structure including a plurality of components, wherein each of the plurality of compon…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F17/30469. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Oct 10 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).