Who is the assignee on this patent?

Linkedln Corp, Microsoft Technology Licensing Llc

What technology area does this patent fall under?

Primary CPC classification G06F17/30598. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Jul 31 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Throughput-based fan-out control in scalable distributed data stores

US10037376B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10037376-B2
Application number	US-201615096067-A
Country	US
Kind code	B2
Filing date	Apr 11, 2016
Priority date	Mar 11, 2016
Publication date	Jul 31, 2018
Grant date	Jul 31, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed embodiments provide a system for processing data. During operation, the system determines a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters. Next, the system uses the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters. The system then selects a number of clusters in the multiple clusters for fanning out of a query based on the expected QPS and one or more throughput limits for the one or more components. Finally, the system transmits the query to one or more of the storage nodes in the selected number of clusters.

First claim

Opening claim text (preview).

What is claimed is: 1. A method, comprising: determining a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters; using the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters; and when a query of the graph database is received, processing the query on a computer system by: selecting a number of clusters in the multiple clusters for fanning out of the query, based on the expected QPS and one or more throughput limits for the one or more components; and transmitting the query to one or more storage nodes in the selected number of clusters. 2. The method of claim 1 , wherein the one or more components comprise: a client node that receives incoming queries to the graph database; and a storage node in the set of storage nodes. 3. The method of claim 2 , wherein using the current incoming QPS to estimate the expected QPS for the client node comprises: multiplying the current incoming QPS to the client node by a candidate number of clusters for fanning out of the query and a number of storage nodes in each of the clusters to obtain an expected outgoing QPS from the client node. 4. The method of claim 3 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: selecting the number of clusters to not exceed a product of a throughput limit of the client node and the candidate number of clusters divided by the expected outgoing QPS from the client node. 5. The method of claim 2 , wherein using the current incoming QPS to estimate the expected QPS for the storage node comprises: multiplying the current incoming QPS to the client node by a number of instances of the client node to obtain an expected incoming QPS to the storage node. 6. The method of claim 5 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: selecting the number of clusters to not exceed a product of a throughput limit of the storage node and a candidate number of clusters for fanning out of the query divided by a product of the number of instances of the client node and the current incoming QPS to the client node. 7. The method of claim 2 , wherein the client node comprises at least one of: a query processor; and a caching service. 8. The method of claim 1 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: limiting the selected number of clusters to a value that does not cause the expected QPS to exceed the one or more throughput limits. 9. The method of claim 1 , further comprising: selecting the number of clusters for fanning out of the query based on a query type of the query. 10. The method of claim 1 , wherein the graph database comprises a set of partitions in a first distribution across the storage nodes in a first cluster and a second distribution that is different from the first distribution across the storage nodes in a second cluster. 11. The method of claim 1 , wherein transmitting the query to the one or more storage nodes in the selected number of clusters comprises: randomly selecting the determined number of clusters as a subset of the multiple clusters; identifying a partition of the graph database storing data associated with a key in the query; selecting a cluster from the subset of clusters; identifying a storage node containing the partition in the selected cluster; and transmitting a portion of the query comprising the key to the identified storage node. 12. An apparatus, comprising: one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the apparatus to: determine a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters; use the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters; select a number of clusters in the multiple clusters for fanning out of a query based on the expected QPS and one or more throughput limits for the one or more components; and transmit the query to one or more of the storage nodes in the selected number of clusters. 13. The apparatus of claim 12 , wherein the one or more components comprise: a client node that receives incoming queries to the graph database; and a storage node in the set of storage nodes. 14. The apparatus of claim 13 , wherein using the current incoming QPS to estimate the expected QPS for the client node comprises: multiplying the current incoming QPS to the client node by a candidate number of clusters for fanning out of the query and a number of storage nodes in each of the clusters to obtain an expected outgoing QPS from the client node. 15. The apparatus of claim 14 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: selecting the number of clusters to not exceed a product of a throughput limit of the client node and the candidate number of clusters divided by the expected outgoing QPS from the client node. 16. The apparatus of claim 13 , wherein using the current incoming QPS to estimate the expected QPS for the storage node comprises: multiplying the current incoming QPS to the client node by a number of instances of the client node to obtain an expected incoming QPS to the storage node. 17. The apparatus of claim 16 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: selecting the number of clusters to not exceed a product of a throughput limit of the storage node and a candidate number of clusters for fanning out of the query divided by a product of the number of instances of the client node and the current incoming QPS to the client node. 18. The apparatus of claim 12 , wherein selecting the number of clusters for fanning out of the query based on the expected QPS and the one or more throughput limits comprises: limiting the selected number of clusters to a value that does not cause the expected QPS to exceed the one or more throughput limits. 19. A system, comprising: a measurement mechanism comprising a non-transitory computer-readable medium comprising instructions that, when executed, cause the system to determine a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters; and a client node comprising a non-transitory computer-readable medium comprising instructions that, when executed, cause the system to: use the current incoming QPS to estimate, for the one or more components, an expected QPS associated with fanning out of the queries to the clusters; select a number of clusters in the multiple clusters for fanning out of a query based on the expected QPS and one or more throughput

Assignees

Inventors

Classifications

H04L45/46
Cluster building · CPC title
G06F17/30584
Physics · mapped topic
G06F17/30867
Physics · mapped topic
G06F17/30575
Physics · mapped topic
G06F17/30598Primary
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 59786755

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10037376B2 cover?: The disclosed embodiments provide a system for processing data. During operation, the system determines a current incoming queries per second (QPS) to one or more components for processing queries of a graph database, wherein the graph database is replicated across multiple clusters and distributed among a set of storage nodes in each of the clusters. Next, the system uses the current incoming …
Who is the assignee on this patent?: Linkedln Corp, Microsoft Technology Licensing Llc
What technology area does this patent fall under?: Primary CPC classification G06F17/30598. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Jul 31 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

How to read this patent

Abstract

First claim

Assignees

Inventors

Classifications

Patent family

External sources

Related patents

Method and system for distributed processing in a messaging platform

System and method for searching a distributed node-sharded graph

Query Construction on Online Social Networks

Systems and methods for mapping and routing based on clustering

Frequently asked questions