Methods and systems for transforming distributed database structure for reduced compute load
US-2024330289-A1 · Oct 3, 2024 · US
US10242052B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10242052-B2 |
| Application number | US-201213658034-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 23, 2012 |
| Priority date | Jul 24, 2012 |
| Publication date | Mar 26, 2019 |
| Grant date | Mar 26, 2019 |
Official abstract text for this publication.
Methods and systems for processing a database query are disclosed. An example method includes receiving a SQL database query at a database query handling server, and parsing the SQL database query to identify a database and one or more tables and columns identified by the SQL database query. The method also includes determining a query plan based on the parsed database query. At a database engine, and based on the query plan and the identified database, tables and columns, the method further includes identifying a set of data nodes implicated by the identified database, tables and columns, determining a set of map-reduce operations and levels at which each of the set of map-reduce operations are to execute, and passing the query plan, the set of data nodes, and the map-reduce operations to a map-reduce query execution framework. The map-reduce query framework returns records as query results to the client system.
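The flow described in the abstract can be sketched as a toy in-memory simulation. This is a minimal illustration under stated assumptions, not the patented implementation: the `CATALOG` layout, the IP addresses, the function names, and the sample rows are all hypothetical, and the SQL parsing and query planning steps are replaced by hard-coded arguments.

```python
from collections import defaultdict

# Hypothetical catalog: table name -> data node IP address -> rows held there.
CATALOG = {
    "orders": {
        "10.0.0.1": [
            {"region": "east", "amount": 10},
            {"region": "west", "amount": 5},
        ],
        "10.0.0.2": [
            {"region": "east", "amount": 7},
        ],
    },
}


def identify_data_nodes(table):
    """Return the addresses of the data nodes implicated by the table."""
    return sorted(CATALOG[table])


def map_on_node(node_ip, table, key_col, value_col):
    """Node-level step: compute partial sums per key over that node's rows."""
    partial = defaultdict(int)
    for row in CATALOG[table][node_ip]:
        partial[row[key_col]] += row[value_col]
    return dict(partial)


def final_reduce(partials):
    """Server-level step: merge the per-node partial sums into one result."""
    merged = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            merged[key] += value
    return dict(merged)


def run_query(table, key_col, value_col):
    """Stand-in for: SELECT key_col, SUM(value_col) FROM table GROUP BY key_col."""
    nodes = identify_data_nodes(table)
    # In a real system these calls would run on the nodes in parallel;
    # here they run sequentially in one process.
    partials = [map_on_node(ip, table, key_col, value_col) for ip in nodes]
    return final_reduce(partials)


result = run_query("orders", "region", "amount")
print(result)
```

The split between `map_on_node` and `final_reduce` mirrors the idea of assigning operations to different levels: per-node work happens close to the data, and only the small partial results travel to the server for the final merge.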
Opening claim text (preview).
The invention claimed is:
1. A method of processing a database query, the method comprising: receiving a SQL database query at a database query handling server managing access to a database; parsing, by the database query handling server, the SQL database query to identify one or more tables and columns identified by the SQL database query; determining, by the database query handling server, a query plan based on the parsed database query; and at a database engine running on the database query handling server, based on the query plan, and the identified tables and columns: identifying, by the database query handling server, a set of data nodes implicated by the database and the identified one or more tables and columns, determining, by the database query handling server, based on the identifying, a set of map-reduce operations and levels at which each of the set of map-reduce operations are to execute; and passing, by the database query handling server, the query plan, the set of data nodes, and the map-reduce operations to a map-reduce query execution framework running on the database query handling server, wherein the set of map-reduce operations correspond to an atomic set of operations that are performed at a data block level, wherein the map-reduce query execution framework is configured to distribute each of the map-reduced operations of the parsed query to one or more data nodes communicatively connected to the database query handling server by referencing IP addresses of the one or more data nodes having relevant data, and to receive data from the one or more data nodes in response to at least one of the map-reduced operations, wherein during map-reduced operations each data node of the one or more data nodes access different blocks of the data without sitting idle permitting each of the one or more data nodes to execute at a same time, wherein the database engine and the map-reduce query execution framework are part of one component running on the database query handling server; and wherein the one or more data nodes comprise a plurality of data nodes having a plurality of tables and indices distributed thereamong.
2. The method of claim 1, wherein identifying the set of data nodes includes determining, based on the database and the identified tables and columns, one or more data nodes containing data responsive to the SQL database query.
3. The method of claim 2, further comprising maintaining at the database engine a tree model for the database, the tree model associating one or more data nodes with each block of data stored in the database, one or more blocks of data with each table included in the database, and one or more tables with the database.
4. The method of claim 1, wherein the query plan comprises a set of operations and an execution sequence of the set of operations used to perform the SQL database query.
5. The method of claim 1, further comprising, at the map-reduce query execution framework, executing one or more of the set of map-reduce operations on the data received from the one or more data nodes.
6. The method of claim 1, further comprising, at the map-reduce query execution framework, distributing one or more of the set of map-reduce operations to a data node for execution prior to receiving the data from that data node.
7. The method of claim 6, wherein, upon determining that the one or more reduce operations is to be performed on responsive data from a plurality of data nodes, transferring the responsive data to the data node prior to performing a final reduce operation.
8. The method of claim 7, wherein the database engine designates the data node on which the final reduce operation is to take place.
9. The method of claim 1, wherein the one or more data nodes includes a first data node assigned a first map-reduce operation and a second data node assigned a second map-reduce operation, and wherein the first and second map-reduce operations are executed in parallel.
10. The method of claim 1, further comprising opening the database prior to passing the query plan, the set of data nodes, and the map-reduce operations to a map-reduce query execution framework.
11. The method of claim 1, further comprising: at the map-reduce query execution framework: distributing each of the map-reduce operations to a plurality of data nodes communicatively connected to the database query handling server; performing a first map-reduce operation at a first data node of the plurality of data nodes, thereby receiving a first result set including one or more records; and performing a second map-reduce operation at a second data node, wherein the one or more records are applied as keys at the second data node, and wherein the second map-reduce operation is directed by the keys.
12. The method of claim 11, further comprising receiving data from the second data node representing an intersection of records responsive to the query at the first and second data nodes.
13. A computer storage medium comprising computer-executable instructions which, when executed on a computing system, cause the computing system to perform a method of processing a data query, the method comprising: receiving, by a server, a SQL database query; parsing, by the server, the SQL database query to identify a database and one or more tables and columns identified by the SQL database query; determining by the server, a query plan based on the parsed database query; and based on the query plan and the identified database, tables and columns: identifying, by the server, a set of data nodes implicated by the identified database, tables and columns; determining, by the server, based on the identifying, a set of map-reduce operations and levels at which each of the set of map-reduce operations are to execute; and passing, by the server, the query plan, the set of data nodes, and the map-reduce operations to a map-reduce query execution framework running on the server, wherein the set of map-reduce operations correspond to a set of atomic set operations that are performed at a data block level, wherein map-reduced query execution framework is configured to distribute each of the map-reduced operations of the parsed query to one or more data nodes communicatively connected to the server by referencing IP addresses of the one or more data nodes having relevant data, and to receive data from the one or more data nodes in response to at least one of the map-reduce operations, wherein during map-reduced operations each data node of the one or more data nodes access different blocks of the data without sitting idle permitting each of the one or more data nodes to execute at a same time, wherein the identifying, the determining the set of map-reduce operations and levels at which each of the set of map-reduce operations are to execute, and the passing are performed by one component running on the server, wherein the one component comprises the map-reduce query execution framework; and wherein the one or more data nodes comprise a plurality of data nodes having a plurality of tables and indices distributed thereamong.
14. The computer storage medium of claim 13, wherein the method further comprises maintaining a tree model for the database, the tree model associating one or more data nodes with each block of data stored in the database, one or more blocks of data with each table included in the database, and one or more tables with the database.
15. A database query handling system comprising: a plurality of data nodes; a database query handling server communicatively connected to each of the plurality of data nodes,
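Two ideas from the claims lend themselves to a short illustration: the tree model of claims 3 and 14 (database to tables to blocks to data nodes) and the key-directed second operation of claims 11 and 12. The sketch below is only one possible reading. The `TREE` layout, the node addresses, the `customer_id` join column, and the sample rows are hypothetical and do not come from the patent text.

```python
# Hypothetical tree model: database -> table -> block -> data node addresses.
TREE = {
    "sales_db": {
        "customers": {"block-0": ["10.0.0.1"], "block-1": ["10.0.0.2"]},
        "orders": {"block-0": ["10.0.0.2", "10.0.0.3"]},
    }
}


def nodes_for_tables(tree, database, tables):
    """Walk the tree to collect every data node holding a block of the tables."""
    nodes = set()
    for table in tables:
        for block_nodes in tree[database][table].values():
            nodes.update(block_nodes)
    return sorted(nodes)


def first_operation(rows, predicate):
    """First operation: select matching records and emit their keys."""
    return {row["customer_id"] for row in rows if predicate(row)}


def second_operation(rows, keys):
    """Second operation: directed by the keys, keep only rows whose key matches."""
    return [row for row in rows if row["customer_id"] in keys]


# Rows assumed to live on the first and second data nodes respectively.
customers_on_first_node = [
    {"customer_id": 1, "state": "MN"},
    {"customer_id": 2, "state": "PA"},
    {"customer_id": 3, "state": "MN"},
]
orders_on_second_node = [
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 2},
    {"order_id": 102, "customer_id": 4},
]

implicated = nodes_for_tables(TREE, "sales_db", ["customers", "orders"])
keys = first_operation(customers_on_first_node, lambda r: r["state"] == "MN")
matched = second_operation(orders_on_second_node, keys)
print(implicated)
print(matched)
```

Passing only the keys to the second node keeps the transferred data small, and the rows that come back represent the intersection of the records responsive at both nodes.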
Plan optimisation · CPC title
Physics · mapped topic