Performance-adaptive sampling strategy towards fast and accurate graph neural networks

US2023049817A1 · US · A1

Patent metadata
FieldValue
Publication numberUS-2023049817-A1
Application numberUS-202117399797-A
CountryUS
Kind codeA1
Filing dateAug 11, 2021
Priority dateAug 11, 2021
Publication dateFeb 16, 2023
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for implementing a performance-adaptive sampling strategy towards fast and accurate graph neural networks are provided. In one technique, a graph that comprises multiple nodes and edges connecting the nodes is stored. An embedding for each node is initialized, as well as a sampling policy for sampling neighbors of nodes. One or more machine learning techniques are used to train a graph neural network and learn embeddings for the nodes. Using the one or more machine learning techniques comprises, for each node: (1) selecting, based on the sampling policy, a set of neighbors of the node; (2) based on the graph neural network and embeddings for the node and the set of neighbors, computing a performance loss; and (3) based on a gradient of the performance loss, modifying the sampling policy.

First claim

Opening claim text (preview).

What is claimed is: 1 . A method comprising: storing a graph that comprises a plurality of nodes and a plurality of edges connecting the plurality of nodes; initializing an embedding for each node in the plurality of nodes; initializing a sampling policy for sampling neighbors of nodes; using one or more machine learning techniques to train a graph neural network and learn a plurality of embeddings for the plurality of nodes, wherein using the one or more machine-learning techniques comprises: for each node: selecting, based on the sampling policy, a set of neighbors of said each node; based on the graph neural network and embeddings for said each node and the set of neighbors, computing a performance loss; based on a gradient of the performance loss, modifying the sampling policy; wherein the method is performed by one or more computing devices. 2 . The method of claim 1 , wherein the sampling policy comprises a sampling operation that includes an importance sampling component that computes, for each neighbor in the set of neighbor of said each node, a product that is based on a transformation matrix, an embedding of said each node, and an embedding of said each neighbor. 3 . The method of claim 2 , wherein the product is of (1) a first product of the transformation matrix and the embedding of said each node and (2) a second product of the transformation matrix and the embedding of said each neighbor. 4 . The method of claim 2 , wherein the sampling operation includes: a random sampling component that computes a probability of selecting each neighbor in the set of neighbors; a learnable attention vector that regulates a tradeoff between the importance sampling component and the random sampling component 5 . The method of claim 1 , further comprising: for each neighbor of a plurality of neighbors of said each node, computing, based on the sampling policy, a selection probability that indicates a probability of selecting said each neighbor, wherein the selection probability is based on the embedding of said each node and the embedding of said each neighbor; wherein selecting the set of neighbors comprises selecting the set of neighbors from among the plurality of neighbors based on the selection probability of said each neighbor in the plurality of neighbors. 6 . The method of claim 1 , wherein: modifying the sampling policy comprises performing a logarithm operation to update a transformation matrix of the sampling policy using the gradient of the performance loss, wherein the sampling policy comprises a non-differentiable function. 7 . The method of claim 1 , wherein: the graph neural network is a first graph neural network that is trained for a first task; training the first graph neural network results in a first plurality of embeddings for the plurality of nodes; the method further comprising training a second graph neural network for a second task that is different than the first task; wherein training the second graph neural network results in a second plurality of embeddings for the plurality of nodes. 8 . The method of claim 7 , further comprising: retrieving, by a first embedding consumer, from the first plurality of embeddings, a first embedding that corresponds to a first node in the plurality of nodes; determining, by the first embedding consumer, based on the first embedding, whether to perform a first operation with respect to an entity that corresponds to the first node; retrieving, by a second embedding consumer that is different than the first embedding consumer, from the second plurality of embeddings, a second embedding that corresponds to the first node in the plurality of nodes; determining, by the second embedding consumer, based on the second embedding, whether to perform a second operation with respect to the entity that corresponds to the first node. 9 . The method of claim 1 , wherein the graph neural network is a graph convolutional network. 10 . The method of claim 1 , further comprising: determining, for a particular user, an embedding that is based on one or more embeddings of the plurality of embeddings; generating an output based on inputting the embedding into the graph neural network; based on the output: associating a class, from among a plurality of classes, with the particular user, identifying data about a plurality of other users to present to the particular user, or identifying an entity to present to the particular user as a recommendation. 11 . One or more storage media storing instructions which, when executed by one or more processors, cause: storing a graph that comprises a plurality of nodes and a plurality of edges connecting the plurality of nodes; initializing an embedding for each node in the plurality of nodes; initializing a sampling policy for sampling neighbors of nodes; using one or more machine learning techniques to train a graph neural network and learn a plurality of embeddings for the plurality of nodes, wherein using the one or more machine-learning techniques comprises: for each node: selecting, based on the sampling policy, a set of neighbors of said each node; based on the graph neural network and embeddings for said each node and the set of neighbors, computing a performance loss; based on a gradient of the performance loss, modifying the sampling policy. 12 . The one or more storage media of claim 11 , wherein the sampling policy comprises a sampling operation that includes an importance sampling component that computes, for each neighbor in the set of neighbor of said each node, a product that is based on a transformation matrix, an embedding of said each node, and an embedding of said each neighbor. 13 . The one or more storage media of claim 12 , wherein the product is of (1) a first product of the transformation matrix and the embedding of said each node and (2) a second product of the transformation matrix and the embedding of said each neighbor. 14 . The one or more storage media of claim 12 , wherein the sampling operation includes: a random sampling component that computes a probability of selecting each neighbor in the set of neighbors; a learnable attention vector that regulates a tradeoff between the importance sampling component and the random sampling component 15 . The one or more storage media of claim 11 , wherein the instructions, when executed by the one or more processors, further cause: for each neighbor of a plurality of neighbors of said each node, computing, based on the sampling policy, a selection probability that indicates a probability of selecting said each neighbor, wherein the selection probability is based on the embedding of said each node and the embedding of said each neighbor; wherein selecting the set of neighbors comprises selecting the set of neighbors from among the plurality of neighbors based on the selection probability of said each neighbor in the plurality of neighbors. 16 . The one or more storage media of claim 11 , wherein: modifying the sampling policy comprises performing a logarithm operation to update a transformation matrix of the sampling policy using the gradient of the performance loss, wherein the sampling policy comprises a non-differentiable function. 17 . The one or more storage media of claim 11 , wherein: the graph neural network is a first graph neural network that is trained for a first task; training the first graph neural network results in a first plurality of embeddings for the plurality of nodes; wherein the instructions, when executed by t

Assignees

Inventors

Classifications

  • G06F18/29Primary

    Graphical models, e.g. Bayesian networks · CPC title

  • Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • Multiple classes · CPC title

  • Selection of the most significant subset of features · CPC title

  • Learning methods · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023049817A1 cover?
Techniques for implementing a performance-adaptive sampling strategy towards fast and accurate graph neural networks are provided. In one technique, a graph that comprises multiple nodes and edges connecting the nodes is stored. An embedding for each node is initialized, as well as a sampling policy for sampling neighbors of nodes. One or more machine learning techniques are used to train a gra…
Who is the assignee on this patent?
Microsoft Technology Licensing Llc
What technology area does this patent fall under?
Primary CPC classification G06F18/29. Mapped technology areas include Physics.
When was this patent published?
Publication date Thu Feb 16 2023 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).