Method and system for distributed processing in a messaging platform
US-9858130-B2 · Jan 2, 2018 · US
US10037376B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10037376-B2 |
| Application number | US-201615096067-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 11, 2016 |
| Priority date | Mar 11, 2016 |
| Publication date | Jul 31, 2018 |
| Grant date | Jul 31, 2018 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
The disclosed embodiments provide a system for processing data. During operation, the system determines a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters. Next, the system uses the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters. The system then selects a number of clusters in the multiple clusters for fanning out of a query based on the expected QPS and one or more throughput limits for the one or more components. Finally, the system transmits the query to one or more of the storage nodes in the selected number of clusters.
Opening claim text (preview).
What is claimed is: 1. A method, comprising: determining a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters; using the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters; and when a query of the graph database is received, processing the query on a computer system by: selecting a number of clusters in the multiple clusters for fanning out of the query, based on the expected QPS and one or more throughput limits for the one or more components; and transmitting the query to one or more storage nodes in the selected number of clusters. 2. The method of claim 1 , wherein the one or more components comprise: a client node that receives incoming queries to the graph database; and a storage node in the set of storage nodes. 3. The method of claim 2 , wherein using the current incoming QPS to estimate the expected QPS for the client node comprises: multiplying the current incoming QPS to the client node by a candidate number of clusters for fanning out of the query and a number of storage nodes in each of the clusters to obtain an expected outgoing QPS from the client node. 4. The method of claim 3 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: selecting the number of clusters to not exceed a product of a throughput limit of the client node and the candidate number of clusters divided by the expected outgoing QPS from the client node. 5. The method of claim 2 , wherein using the current incoming QPS to estimate the expected QPS for the storage node comprises: multiplying the current incoming QPS to the client node by a number of instances of the client node to obtain an expected incoming QPS to the storage node. 6. The method of claim 5 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: selecting the number of clusters to not exceed a product of a throughput limit of the storage node and a candidate number of clusters for fanning out of the query divided by a product of the number of instances of the client node and the current incoming QPS to the client node. 7. The method of claim 2 , wherein the client node comprises at least one of: a query processor; and a caching service. 8. The method of claim 1 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: limiting the selected number of clusters to a value that does not cause the expected QPS to exceed the one or more throughput limits. 9. The method of claim 1 , further comprising: selecting the number of clusters for fanning out of the query based on a query type of the query. 10. The method of claim 1 , wherein the graph database comprises a set of partitions in a first distribution across the storage nodes in a first cluster and a second distribution that is different from the first distribution across the storage nodes in a second cluster. 11. The method of claim 1 , wherein transmitting the query to the one or more storage nodes in the selected number of clusters comprises: randomly selecting the determined number of clusters as a subset of the multiple clusters; identifying a partition of the graph database storing data associated with a key in the query; selecting a cluster from the subset of clusters; identifying a storage node containing the partition in the selected cluster; and transmitting a portion of the query comprising the key to the identified storage node. 12. An apparatus, comprising: one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the apparatus to: determine a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters; use the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters; select a number of clusters in the multiple clusters for fanning out of a query based on the expected QPS and one or more throughput limits for the one or more components; and transmit the query to one or more of the storage nodes in the selected number of clusters. 13. The apparatus of claim 12 , wherein the one or more components comprise: a client node that receives incoming queries to the graph database; and a storage node in the set of storage nodes. 14. The apparatus of claim 13 , wherein using the current incoming QPS to estimate the expected QPS for the client node comprises: multiplying the current incoming QPS to the client node by a candidate number of clusters for fanning out of the query and a number of storage nodes in each of the clusters to obtain an expected outgoing QPS from the client node. 15. The apparatus of claim 14 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: selecting the number of clusters to not exceed a product of a throughput limit of the client node and the candidate number of clusters divided by the expected outgoing QPS from the client node. 16. The apparatus of claim 13 , wherein using the current incoming QPS to estimate the expected QPS for the storage node comprises: multiplying the current incoming QPS to the client node by a number of instances of the client node to obtain an expected incoming QPS to the storage node. 17. The apparatus of claim 16 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: selecting the number of clusters to not exceed a product of a throughput limit of the storage node and a candidate number of clusters for fanning out of the query divided by a product of the number of instances of the client node and the current incoming QPS to the client node. 18. The apparatus of claim 12 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: limiting the selected number of clusters to a value that does not cause the expected QPS to exceed the one or more throughput limits. 19. A system, comprising: a measurement mechanism comprising a non-transitory computer-readable medium comprising instructions that, when executed, cause the system to determine a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters; and a client node comprising a non-transitory computer-readable medium comprising instructions that, when executed, cause the system to: use the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters; select a number of clusters in the multiple clusters for fanning out of a query based on the expected QPS and one or more throughput
Cluster building · CPC title
Physics · mapped topic
Physics · mapped topic
Physics · mapped topic
Physics · mapped topic
Related publications grouped by family.
Answers are generated from the same data shown on this page.