Answering relational database queries using graph exploration
US-2015120775-A1 · Apr 30, 2015 · US
US10061841B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10061841-B2 |
| Application number | US-201514919183-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 21, 2015 |
| Priority date | Oct 21, 2015 |
| Publication date | Aug 28, 2018 |
| Grant date | Aug 28, 2018 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A first plurality of relational tables is obtained from a relational database. Each table of the first plurality of relational tables stores connectivity information for a graph that comprises a plurality of nodes and a plurality of edges connecting the nodes, and each of the nodes is assigned an initial identifier. The nodes are clustered into a plurality of clusters. Each cluster contains a subset of the nodes, and all nodes in each subset are close to each other according to a metric. Each node is assigned a new identifier. The new identifier comprises a concatenation of an identifier associated with the cluster to which the node belongs and an identifier associated with the node. A second plurality of relational tables is constructed and stores connectivity information for the graph. The node is identified in the second plurality of relational tables by the new identifier.
Opening claim text (preview).
What is claimed is: 1. A computer-implemented method, comprising: obtaining a first plurality of relational tables from a relational database, wherein each table of the first plurality of relational tables stores connectivity information for a graph that comprises a plurality of nodes and a plurality of edges connecting the plurality of nodes, wherein each node of the plurality of nodes represents a data item stored in the relational database, each edge represents a relationship between two data items represented by two nodes of the plurality of nodes, and wherein each node of the plurality of nodes is assigned an initial identifier; clustering the plurality of nodes into a plurality of clusters, wherein each cluster of the plurality of clusters contains a subset of the plurality of nodes, and wherein all nodes in each subset of the plurality of nodes are close to each other according to a metric; assigning to each node in the plurality of nodes a new identifier, wherein the new identifier comprises a concatenation of an identifier associated with one of the plurality of clusters to which the each node belongs and an identifier associated with the each node; and constructing a second plurality of relational tables, wherein each table of the second plurality of relational tables stores connectivity information for the graph, and wherein the each node is identified in the second plurality of relational tables by the new identifier, such that a number of data blocks that must be loaded to a computer memory to traverse a path through the graph is minimized by reading index records from a relational table of the second plurality of relational tables. 2. The computer-implemented method of claim 1 , wherein each row of each relational table of the second plurality of relational tables comprises: a new identifier associated with a predecessor node from which an edge of the plurality of edges originates, wherein the a new identifier associated with the predecessor node comprises a concatenation of an identifier associated with one of the plurality of clusters to which the predecessor node belongs and an identifier associated with the predecessor node; and a new identifier associated with a successor node at which the edge of the plurality of edges terminates, wherein the a new identifier associated with the successor node comprises a concatenation of an identifier associated with one of the plurality of clusters to which the successor node belongs and an identifier associated with the successor node. 3. The computer-implemented method of claim 2 , further comprising: creating an index for each relational table of the second plurality of relational tables, wherein the index comprises a concatenation of the new identifier associated with the predecessor node and the new identifier associated with the successor node. 4. The computer-implemented method of claim 3 , wherein the index is organized so that at least some data blocks of index records contain empty space reserved for additions to the index records as a result of updates to the relational database. 5. The computer-implemented method of claim 3 , wherein the relational database and the index are organized so that index records corresponding to those of the plurality of nodes that are within a threshold distance of each other are stored in a common data block in the relational database or in neighboring data blocks in the relational database. 6. The computer-implemented method of claim 3 , wherein the index is sorted such that a given data block in memory contains entries for which associated predecessor nodes belong to a common cluster of the plurality of clusters. 7. The computer-implemented method of claim 2 , wherein the subset of the plurality of nodes includes nodes that are separated from each other by no more than a threshold number of the plurality of edges. 8. The computer-implemented method of claim 2 , wherein each row of each relational table of the second plurality of relational tables further comprises: an edge weight associated with the edge of the plurality of edges. 9. The computer-implemented method of claim 8 , further comprising: creating an index for the relational table of the second plurality of relational tables, wherein each index comprises a concatenation of the new identifier associated with the predecessor node, the new identifier associated with the successor node, and the edge weight. 10. The computer-implemented method of claim 9 , wherein the index is sorted such that a given data block in the relational database contains entries for which associated predecessor nodes belong to a common cluster of the plurality of clusters. 11. The computer-implemented method of claim 8 , wherein the subset of the plurality of nodes includes nodes that are separated from each other by no more than a threshold sum of edge weights. 12. The computer-implemented method of claim 8 , wherein the weight reflects a length of the edge of the plurality of edges. 13. The computer-implemented method of claim 1 , wherein the identifier associated with the each node is unique across all of the plurality of nodes in the relational database. 14. The computer-implemented method of claim 1 , wherein the identifier associated with the each node is unique across all of the subset of the plurality of nodes contained in the one of the plurality of clusters to which the each node belongs. 15. A machine-readable storage medium encoded with instructions executable by a processor, wherein the instructions cause the processor to perform operations comprising: obtaining a first plurality of relational tables from a relational database, wherein each table of the first plurality of relational tables stores connectivity information for a graph that comprises a plurality of nodes and a plurality of edges connecting the plurality of nodes, wherein each node of the plurality of nodes represents a data item stored in the relational database, each edge represents a relationship between two data items represented by two nodes of the plurality of nodes, and wherein each node of the plurality of nodes is assigned an initial identifier; clustering the plurality of nodes into a plurality of clusters, wherein each cluster of the plurality of clusters contains a subset of the plurality of nodes, and wherein all nodes in each subset of the plurality of nodes are close to each other according to a metric; assigning to each node in the plurality of nodes a new identifier, wherein the new identifier comprises a concatenation of an identifier associated with one of the plurality of clusters to which the each node belongs and an identifier associated with the each node; and constructing a second plurality of relational tables, wherein each table of the second plurality of relational tables stores connectivity information for the graph, and wherein the each node is identified in the second plurality of relational tables by the new identifier, such that a number of data blocks that must be loaded to a computer memory to traverse a path through the graph is minimized by reading index records from a relational table of the second plurality of relational tables. 16. A computer-implemented method, comprising: obtaining a first set of nodes of a graph, wherein each node in the first set of nodes represents a data item stored in a relational database, each edge of the graph represents a relationship between two data items represented by two nodes of the first set in nodes, and wherein each node in the first set of nodes is identified by an initial identifier; for each node in the first se
Physics · mapped topic
Physics · mapped topic
Physics · mapped topic
Indexing structures · CPC title
Clustering or classification · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.