Generating neighborhood convolutions within a large network

US11922308B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11922308-B2
Application numberUS-202217577187-A
CountryUS
Kind codeB2
Filing dateJan 17, 2022
Priority dateMar 13, 2018
Publication dateMar 5, 2024
Grant dateMar 5, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for generating embeddings for nodes of a corpus graph are presented. More particularly, operations for generation of an aggregated embedding vector for a target node is efficiently divided among operations on a central processing unit and operations on a graphic processing unit. With regard to a target node within a corpus graph, processing by one or more central processing units (CPUs) is conducted to identify the target node's relevant neighborhood (of nodes) within the corpus graph. This information is prepared and passed to one or more graphic processing units (GPUs) that determines the aggregated embedding vector for the target node according to data of the relevant neighborhood of the target node.

First claim

Opening claim text (preview).

What is claimed is: 1. A computing system, comprising: one or more processors; and a memory storing program instructions that, when executed by the one or more processors, cause the one or more processors to at least: determine a target node from a plurality of nodes associated with a corpus graph, wherein each node of the plurality of nodes is associated with a respective content item and a respective embedding vector that is representative of the respective content item and is generated by a deep neural network; iteratively perform for a predetermined number of iterations, starting with the target node as a current node: randomly select a neighbor node from the plurality of nodes to which the current node is connected in the corpus graph; increment a count associated with the neighbor node, the count representing a random traversal to the neighbor node; and determine that the current node includes the neighbor node or the target node; determine a plurality of relevant neighbor nodes from the neighbor nodes based on the neighbor nodes having highest counts; and in response to a determination that a query corresponds to the target node, provide at least one item of content corresponding to at least one node of the corpus graph as responsive to the query, the at least one node determined based at least in part on information associated with the plurality of relevant neighbor nodes. 2. The computing system of claim 1 , wherein the determination that the current node includes the neighbor node or the target node is randomly determined. 3. The computing system of claim 1 , wherein the program instructions that, when executed by the one or more processors, further cause the one or more processors to at least: generate an aggregated embedding vector for each of the plurality of relevant neighbor nodes based at least in part on the respective embedding vectors associated with each of the plurality of relevant neighbor nodes; aggregate the aggregated embedding vectors for each of the plurality of relevant neighbor nodes to generate aggregated neighborhood embedding information associated with the plurality of relevant neighbor nodes; and associate the aggregated neighborhood embedding information with the target node. 4. The computing system of claim 3 , wherein the program instructions that, when executed by the one or more processors, further cause the one or more processors to at least: access the respective embedding vector associated with the target node; concatenate the respective embedding vector associated with the target node with the aggregated neighborhood embedding information to generate a target node aggregated embedding vector for the target node; and associate the target node aggregated embedding vector with the target node. 5. The computing system of claim 3 , wherein: aggregating the aggregated embedding vectors for each of the plurality of relevant neighbor nodes to generate the aggregated neighborhood embedding information includes determining an importance weighting for each of the plurality of relevant neighbor nodes; and aggregating the aggregated embedding vectors for each of the plurality of relevant neighbor nodes to generate the aggregated neighborhood embedding information is based at least in part on the importance weightings for each of the plurality of relevant neighbor nodes. 6. The computing system of claim 1 , wherein the plurality of relevant neighbor nodes includes at least one of: a predetermined number of nodes; a predetermined percentage of nodes of the neighbor nodes; a first plurality of nodes of the neighbor nodes that exceed a predetermined visit threshold; or a second plurality of nodes of the neighbor nodes that collectively exceed a predetermined cumulative visit threshold. 7. A computer-implemented method, comprising: determining a plurality of target nodes from a plurality of nodes associated with a corpus graph, wherein each node of the plurality of nodes is associated with a respective content item and a respective embedding vector that is representative of the respective content item and is generated by a deep neural network; for each of the plurality of target nodes: performing a random walk to determine a plurality of relevant neighbor nodes associated with the target node, wherein the random walk includes performing a plurality of random traversals to a plurality of neighbor nodes and incrementing a count associated with each of the plurality of neighbor nodes to which the plurality of random traversals is performed, the count representing a number of random traversals to each corresponding neighbor node of the plurality of neighbor nodes; and determining the plurality of relevant neighbor nodes from the plurality of neighbor nodes based on the neighbor nodes having a highest count; and in response to determining that a query corresponds to a first target node of the plurality of target nodes, providing at least one item of content corresponding to at least one node of the corpus graph as responsive to the query, the at least one node determined based at least in part on information associated with the plurality of relevant neighbor nodes of the first target node. 8. The computer-implemented method of claim 7 , wherein the plurality of random traversals is performed from a current node that includes one of the target node or a first neighbor node of the plurality of neighbor nodes to a second neighbor node of the plurality of neighbor nodes. 9. The computer-implemented method of claim 8 , further comprising: randomly determining that the current node includes one of the target node or the first neighbor node of the plurality of neighbor nodes. 10. The computer-implemented method of claim 8 , wherein the random walk further includes: prior to incrementing the count, determining that a first random traversal of the plurality of the random traversals to the second neighbor node is a first visit to the second neighbor node; and in response to the determination that the first random traversal to the second neighbor node is the first visit to the second neighbor node, adding the second neighbor node to a visited list and initializing the count associated with the second neighbor node. 11. The computer-implemented method of claim 8 , wherein the plurality of random traversals is determined based at least in part on a frequency of connections between the current node and the second neighbor node. 12. The computer-implemented method of claim 7 , wherein the plurality of relevant neighbor nodes includes at least one of: a predetermined number of nodes of the plurality of neighbor nodes; a predetermined percentage of nodes of the plurality of neighbor nodes; a first plurality of nodes of the plurality of neighbor nodes that exceed a predetermined visit threshold; or a second plurality of nodes of the plurality of neighbor nodes that collectively exceed a predetermined cumulative visit threshold. 13. The computer-implemented method of claim 7 , further comprising: for each of the plurality of target nodes: generating an aggregated embedding vector for each of the plurality of relevant neighbor nodes based at least in part on the respective embedding vectors associated with each of the plurality of relevant neighbor nodes; aggregating the aggregated embedding vectors for each of the plurality of relevant neighbor nodes to generate aggregated neighborhood embedding information associated with the plurality of relevant neighbor nodes; and associating the aggregated neighborhood embedding information with the target node. 14. The computer-implemented me

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11922308B2 cover?
Systems and methods for generating embeddings for nodes of a corpus graph are presented. More particularly, operations for generation of an aggregated embedding vector for a target node is efficiently divided among operations on a central processing unit and operations on a graphic processing unit. With regard to a target node within a corpus graph, processing by one or more central processing …
Who is the assignee on this patent?
Pinterest Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Mar 05 2024 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).