Generating neighborhood convolutions within a large network

US11227013B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11227013-B2
Application numberUS-201916273939-A
CountryUS
Kind codeB2
Filing dateFeb 12, 2019
Priority dateMar 13, 2018
Publication dateJan 18, 2022
Grant dateJan 18, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for generating embeddings for nodes of a corpus graph are presented. More particularly, operations for generation of an aggregated embedding vector for a target node is efficiently divided among operations on a central processing unit and operations on a graphic processing unit. With regard to a target node within a corpus graph, processing by one or more central processing units (CPUs) is conducted to identify the target node's relevant neighborhood (of nodes) within the corpus graph. This information is prepared and passed to one or more graphic processing units (GPUs) that determines the aggregated embedding vector for the target node according to data of the relevant neighborhood of the target node.

First claim

Opening claim text (preview).

What is claimed: 1. A computer-implemented method, comprising: under a control of one or more processors executing on a computer system: maintaining, in a data store of the computer system, a corpus graph comprising a plurality of nodes; accessing a target node of the plurality of nodes; for a predetermined number of iterations, and starting with the target node as a current node: randomly selecting, as a neighbor node for the current node, a node of the plurality of nodes to which the current node is connected in the corpus graph; incrementing a count corresponding to the neighbor node in a visit list maintained on the computer system for the target node, where the incremented count indicates a traversal count to the neighbor node from another node in the corpus graph during the predetermined number of iterations; and establishing the neighbor node as the current node; identifying a subset of nodes of the neighbor nodes referenced by the visit list having highest counts; storing, in the data store in association with the target node, information identifying the subset of nodes as a relevant neighborhood of the target node; and in response to determining that a query corresponds to the target node, providing at least one item of content corresponding to at least one node as responsive to the query, the at least one node determined based at least in part on the information identifying the subset of nodes. 2. The computer-implemented method of claim 1 , further comprising: prior to randomly selecting the neighbor node for the current node: randomly determining whether to reset the current node to the target node; and upon randomly determining to reset the current node to the target node, resetting the current node as the target node. 3. The computer-implemented method of claim 1 , wherein the subset of nodes comprises a predetermined number of nodes referenced by the visit list having the highest visit counts. 4. The computer-implemented method of claim 1 , wherein the subset of nodes comprises a predetermined percentage of nodes referenced by the visit list having the highest visit counts. 5. The computer-implemented method of claim 1 , wherein the subset of nodes comprises nodes referenced by the visit list whose visit counts meet or exceed a predetermined threshold. 6. The computer-implemented method of claim 1 , wherein the subset of nodes comprises nodes referenced by the visit list that have the highest visit counts and collectively represent a predetermined threshold percentage of all visits during the predetermined number of iterations. 7. The computer-implemented method of claim 1 , further comprising: aggregating neighborhood embedding information according to information corresponding to the nodes of the subset of nodes; and aggregating neighborhood embedding information in the data store in association with the target node. 8. A non-transitory computer-readable medium bearing computer executable instructions which, when executed on a computing system comprising at least a processor, carry out a method, comprising: maintaining, in a data store of the computing system, a corpus graph comprising a plurality of nodes; accessing a target node of the plurality of nodes; for a predetermined number of iterations, starting with the target node as a current node: randomly selecting, as a neighbor node for the current node, a node of the plurality of nodes to which the current node is connected in the corpus graph; incrementing a count corresponding to the neighbor node in a visit list maintained on the computing system for the target node, where the incremented count indicates a traversal count to the neighbor node from another node in the corpus graph during the predetermined number of iterations; and establishing the neighbor node as the current node; identifying a subset of nodes of the neighbor nodes referenced by the visit list having highest visit counts; aggregating neighborhood embedding information according to information of the subset of nodes; storing the subset of nodes as the neighbor nodes of a relevant neighborhood of the target node and the aggregated neighborhood embedding information in the data store in association with the target node; and in response to determining that a query corresponds to the target node, providing at least one item of content corresponding to at least one node as responsive to the query, the at least one node determined based at least in part on the aggregated neighborhood embedding information. 9. The non-transitory computer-readable medium of claim 8 , wherein carrying out the method further comprises: prior to randomly selecting a neighbor node for the current node: randomly determining to reset the current node to the target node; and upon randomly determining to reset the current node to the target node, resetting the current node as the target node. 10. The non-transitory computer-readable medium of claim 8 , wherein the subset of nodes comprises a predetermined number of nodes referenced by the visit list having the highest visit counts. 11. The non-transitory computer-readable medium of claim 8 , wherein the subset of nodes comprises a predetermined percentage of nodes referenced by the visit list having the highest visit counts. 12. The non-transitory computer-readable medium of claim 8 , wherein the subset of nodes comprises nodes referenced by the visit list whose visit counts meet or exceed a predetermined threshold. 13. The non-transitory computer-readable medium of claim 8 , wherein the subset of nodes comprises nodes referenced by the visit list that have the highest visit counts and collectively represent a predetermined threshold percentage of all visits during the predetermined number of iterations. 14. A computer system comprising a processor and a memory, wherein the processor, in executing instructions stored in the memory, causes the computer system to, at least: maintain a data store storing a corpus graph comprising a plurality of nodes; access a target node of the plurality of nodes; for a predetermined number of iterations, and starting with the target node as a current node: randomly select, as a neighbor node for the current node, a node of the plurality of nodes to which the current node is connected in the corpus graph; increment a count corresponding to the neighbor node in a visit list maintained on the computer system for the target node, where the incremented count indicates a traversal count to the neighbor node from another node in the corpus graph during the predetermined number of iterations; and establish the neighbor node as the current node; identify a subset of nodes of the neighbor nodes referenced by the visit list having highest visit counts; store, in the data store in association with the target node, information identifying the subset of nodes as the relevant neighborhood of the target node; and in response to a determination that a query corresponds to the target node, provide at least one item of content corresponding to at least one node as responsive to the query, the at least one node determined based at least in part on the information identifying the subset of nodes. 15. The computer system of claim 14 wherein, prior to the random selection of a neighbor node for the current node, the computer system is further configured to, at least: randomly determine to reset the current node to the target node; and upon a random determination to reset the current node to the target node, reset the current node as the target node. 16. The computer system

Assignees

Inventors

Classifications

  • Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title

  • Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • Graphical models, e.g. Bayesian networks · CPC title

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Selection of the most significant subset of features · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11227013B2 cover?
Systems and methods for generating embeddings for nodes of a corpus graph are presented. More particularly, operations for generation of an aggregated embedding vector for a target node is efficiently divided among operations on a central processing unit and operations on a graphic processing unit. With regard to a target node within a corpus graph, processing by one or more central processing …
Who is the assignee on this patent?
Pinterest Inc, Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/9024. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jan 18 2022 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).