Asynchronous message passing for large graph clustering

US9852230B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9852230-B2
Application numberUS-201314145127-A
CountryUS
Kind codeB2
Filing dateDec 31, 2013
Priority dateJun 29, 2013
Publication dateDec 26, 2017
Grant dateDec 26, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for sending asynchronous messages include receiving, using at least one processor, at a node in a distributed graph, a message with a first value and determining, at the node, that the first value replaces a current value for the node. In response to determining that the first value replaces the current value, the method also includes setting a status of the node to active and sending messages including the first value to neighboring nodes. The method may also include receiving the messages to the neighboring nodes at a priority queue. The priority queue propagates messages in an intelligently asynchronous manner, and the priority queue propagates the messages to the neighboring nodes, the status of the node is set to inactive. The first value may be a cluster identifier or a shortest path identifier.

First claim

Opening claim text (preview).

What is claimed is: 1. A system comprising: distributed computing devices represented by leaf servers; and memory storing a graph of nodes and edges, the graph being distributed across the leaf servers, wherein a leaf server includes: a priority queue engine that propagates messages between neighboring nodes in an intelligent manner that includes bundling together messages directed to nodes on another leaf server before the propagation of the messages and skipping redundant messages rather than propagating the redundant messages, memory storing a cluster identifier for each node assigned to the leaf server, at least one processor, and memory storing instructions that, when executed by the at least one processor, cause the leaf server to send asynchronous messages between neighboring nodes via the priority queue engine, the messages including a cluster identifier for a first node, wherein sending the asynchronous message is triggered when the first node updates its cluster identifier after receiving the cluster identifier from another node. 2. The system of claim 1 wherein the memory further stores instructions that, when executed by the at least one processor, cause the leaf server to: propagate the message to a second node; compare the cluster identifier from the message with the cluster identifier for the second node to determine whether to update the cluster identifier for the second node; and when it is determined that the cluster identifier of the second node is to be updated: update the cluster identifier for the second node with the cluster identifier from the message, and generate messages to neighboring nodes of the second node, the messages including the updated cluster identifier. 3. The system of claim 2 wherein memory further stores instructions that, when executed by the at least one processor, cause the leaf server to: set a status of the second node to active as part of the updating. 4. The system of claim 3 wherein the memory further stores instructions that, when executed by the at least one processor, cause the leaf server to: set the status of the second node to inactive in response to propagation of the messages to neighboring nodes. 5. The system of claim 2 wherein the memory further stores instructions that, when executed by the at least one processor, cause the leaf server to: store the updated cluster identifier in persistent memory. 6. The system of claim 1 , wherein an initial value for the cluster identifier of a node is the node identifier. 7. The system of claim 1 , wherein redundant messages include messages having a cluster identifier that would be updated by a cluster identifier in another message. 8. The system of claim 2 wherein the cluster identifier represents the smallest identifier seen by the first node and the cluster identifier for the second node is to be updated when the cluster identifier from the message is smaller than the cluster identifier for the second node. 9. The system of claim 1 wherein the graph includes more than one billion nodes. 10. The system of claim 1 , wherein at least one leaf server includes a plurality of processors and the at least one leaf server uses the plurality of processors to concurrently send multiple messages departing from nodes and to receive multiple messages arriving at nodes. 11. A computer-implemented method comprising: propagating, using at least one processor, messages sent between nodes in a distributed graph in an intelligently asynchronous manner that includes bundling together messages directed to nodes on another leaf server before the propagation of the messages and skipping redundant messages rather than propagating the redundant messages, the messages including respective cluster identifiers, wherein a priority queue engine controls the propagating; and in response to a first node of the distributed graph receiving one of the messages: comparing, using the at least one processor, a cluster identifier from the received message with a cluster identifier for the first node to determine whether to update the cluster identifier for the first node, and when it is determined that the cluster identifier of the first node is to be updated: updating the cluster identifier for the first node with the cluster identifier from the message, and sending messages to neighboring nodes of the first node via the priority queue, the messages including the updated cluster identifier. 12. The method of claim 11 wherein redundant messages include messages having a cluster identifier that would be updated by a cluster identifier in another message. 13. The method of claim 11 wherein the priority queue engine propagates the messages in an arbitrary manner rather than a first-in-first-out or last-in-last-out manner. 14. The method of claim 11 wherein nodes in the distributed graph are assigned to one of a plurality of leaf servers with each leaf server having a respective priority queue engine, the priority queue engine being one of the respective priority queue engines, and wherein the priority queue engine bundles messages directed to nodes assigned to a remote leaf server of the plurality of leaf servers prior to propagating the messages. 15. The method of claim 14 wherein propagating the messages at a particular leaf server continues despite a failure of another leaf server. 16. The method of claim 11 further comprising storing the updated cluster identifier in persistent memory. 17. The method of claim 16 , further comprising, in response to the first node determining that it experienced a failure: obtaining the cluster identifier from the persistent memory; and generating messages to the neighboring nodes of the first node, the messages requesting a cluster identifier from the respective neighboring nodes. 18. The method of claim 17 wherein the cluster identifier for the first node represents the smallest identifier seen by the first node and the cluster identifier for the first node is to be updated when the cluster identifier from the message is smaller than the cluster identifier for the first node. 19. The method of claim 11 further comprising: setting a status of the first node to active as part of the updating. 20. The method of claim 19 further comprising: setting the status of the first node to inactive in response to propagation of the messages to the neighboring nodes. 21. The method of claim 11 , wherein the propagating is performed without regard to respective states of the nodes in the distributed graph. 22. A computer-implemented method comprising: receiving, using at least one processor, at a node in a distributed graph, a message with a first value; determining, at the node, that the first value replaces a current value for the node; responsive to the determining, setting a status of the node to active and sending messages that include the first value to neighboring nodes; and receiving, using the at least one processor, the messages to the neighboring nodes at a priority queue, wherein the priority queue propagates messages in an intelligently asynchronous manner that includes bundling together messages directed to nodes on another leaf server before the propagation of the messages and skipping redundant messages rather than propagating the redundant messages, and wherein responsive to the message being propagated to the neighboring nodes, the status of the node is set to inactive. 23. The method of claim 22 , w

Assignees

Inventors

Classifications

  • Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title

  • Message passing systems or structures, e.g. queues · CPC title

  • Queue · CPC title

  • Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9852230B2 cover?
Systems and methods for sending asynchronous messages include receiving, using at least one processor, at a node in a distributed graph, a message with a first value and determining, at the node, that the first value replaces a current value for the node. In response to determining that the first value replaces the current value, the method also includes setting a status of the node to active a…
Who is the assignee on this patent?
Google Inc, Google Llc
What technology area does this patent fall under?
Primary CPC classification G06F16/9024. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Dec 26 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).