Systems and methods providing evolutionary generation of embeddings for predicting links in knowledge graphs

US2021103826A1 · US · A1

Patent metadata
FieldValue
Publication numberUS-2021103826-A1
Application numberUS-201916590738-A
CountryUS
Kind codeA1
Filing dateOct 2, 2019
Priority dateOct 2, 2019
Publication dateApr 8, 2021
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Complex computer system architectures are described for analyzing data elements of a knowledge graph, and predicting new facts from relational learning applied to the knowledge graph. This discovery process includes converting the knowledge graph into a set of candidate embeddings spaces to apply further analysis to rank the set of candidate embeddings spaces, where the top ranked candidate embeddings spaces are further processed to identify the new facts.

First claim

Opening claim text (preview).

What is claimed is: 1 . A computing device comprising: a data reception circuitry configured to receive a knowledge graph including a set of structured data; a knowledge graph embedding circuitry configured to: select a sampled set of vector triples from the knowledge graph; convert the knowledge graph to a set of candidate embeddings spaces, wherein each of the candidate embeddings spaces includes a different set of vectors representing the sampled set of vector triples from the knowledge graph; and a processing circuitry configured to: calculate a weighted mean reciprocal rank of the sampled set of vector triples for each of the candidate embeddings spaces; determine a respective fact score for each of the candidate embeddings spaces based on the calculated weighted mean reciprocal rank; rank the candidate embeddings spaces according to their respective fact scores; and select a predetermined number of the highest ranked candidate embeddings spaces when a stopping condition is met. 2 . The computing device of claim 1 , wherein the processing circuitry is further configured to: obtain a set of candidate vector triples, wherein each vector triple represents a different candidate fact; determine prediction scores for the set of candidate vector triples in view of the selected highest ranked candidate embeddings spaces; and select a vector triple from the set of candidate vector triples as a new fact based on the prediction scores. 3 . The computing device of claim 1 , wherein the stopping condition includes at least one of when a fact score of at least one of the candidate embeddings spaces if greater than a predetermined selection threshold, when a predetermined period has passed, or when an improvement in the fact scores for plurality of candidate embeddings spaces is less than a predetermined amount compared to a previous iteration. 4 . The computing device of claim 1 , wherein the processing circuitry is further configured to: generate a new set of candidate embeddings spaces according to a search strategy when a stopping condition is not met, wherein the search strategy comprises both an exploration search strategy and an exploitation search strategy; calculate a weighted mean reciprocal rank of the sampled set of vector triples for each of the new set of candidate embeddings spaces; determine a respective fact score for each of the new set of candidate embeddings spaces based on the weighted mean reciprocal rank calculation; rank the new set of candidate embeddings spaces according to their respective fact scores; and select the predetermined number of the highest ranked new candidate embeddings spaces when the stopping condition is met. 5 . The computing device of claim 4 , wherein the processing circuitry is configured to generate a new candidate embeddings space according to the exploration search strategy by combining at least two candidate embeddings spaces from the selected highest ranked candidate embeddings spaces as a weighted average of the at least two candidate embeddings spaces from the selected highest ranked candidate embeddings spaces. 6 . The computing device of claim 5 , wherein the processing circuitry is configured to determine a weighted average of nodes in each of the at least two candidate embeddings spaces. 7 . The computing device of claim 4 , wherein the processing circuitry is configured to generate a new candidate embeddings space according to the exploitation search strategy by modifying a position of a vector included in a highest ranked candidate embeddings space. 8 . The computing device of claim 7 , wherein the processing circuitry is configured to determine the modified position of the vector based on a position of other connected nodes. 9 . The computing device of claim 4 , wherein the processing circuitry is configured to generate the new set of candidate embeddings spaces according to a predetermined ratio that includes a first predetermined number of new candidate embeddings spaces using the exploration search strategy and a second predetermined number of new candidate embeddings spaces using the exploitation search strategy. 10 . The computing device of claim 1 , wherein the processing circuitry is further configured to: determine an initial ranking score for each combination of the sampled set of vector triples for each of the candidate embeddings spaces; determine a mean of the initial ranking scores for each of the candidate embeddings spaces; determine a corner case weighting for each combination of the sampled set of vector triples; and wherein the weighted mean reciprocal rank is calculated based on the mean of the initial ranking scores and the corner case weighting. 11 . A method comprising: receiving, by a reception circuitry, a knowledge graph including a set of structured data; selecting, by a knowledge graph embedding circuitry, a sampled set of vector triples from the knowledge graph; converting, by the knowledge graph embedding circuitry, the knowledge graph to a set of candidate embeddings spaces, wherein each of the candidate embeddings spaces includes a different set of vectors representing the sampled set of vector triples from the knowledge graph; calculating, by a processing circuitry, a weighted mean reciprocal rank of the sampled set of vector triples for each of the candidate embeddings spaces; determining, by the processing circuitry, a respective fact score for each of the candidate embeddings spaces based on the calculated weighted mean reciprocal rank; ranking, by the processing circuitry, the candidate embeddings spaces according to their respective fact scores; and selecting, by the processing circuitry, a predetermined number of the highest ranked candidate embeddings spaces when a stopping condition is met. 12 . The method of claim 11 , further comprising: obtaining, by the processing circuitry, a set of candidate vector triples, wherein each vector triple represents a different candidate fact; determining, by the processing circuitry, prediction scores for the set of candidate vector triples in view of the selected highest ranked candidate embeddings spaces; and selecting, by the processing circuitry, a vector triple from the set of candidate vector triples as a new fact based on the prediction scores. 13 . The method of claim 11 , wherein the stopping condition includes at least one of when a fact score of at least one of the candidate embeddings spaces if greater than a predetermined selection threshold, when a predetermined period has passed, or when an improvement in the fact scores for plurality of candidate embeddings spaces is less than a predetermined amount compared to a previous iteration. 14 . The method of claim 11 , further comprising: generating, by the processing circuitry, a new set of candidate embeddings spaces according to a search strategy when a stopping condition is not met, wherein the search strategy comprises both an exploration search strategy and an exploitation search strategy; calculating, by the processing circuitry, a weighted mean reciprocal rank of the sampled set of vector triples for each of the new set of candidate embeddings spaces; determining, by the processing circuitry, a respective fact score for each of the new set of candidate embeddings spaces based on the weighted mean reciprocal rank calculation; ranking, by the processing circuitry, the new set of candidate embeddings spaces according to their respective fact scores; and selecting, by the processing circuitry, the predetermined number of the highest ranked new candidate embeddings spaces whe

Assignees

Inventors

Classifications

  • Machine learning · CPC title

  • Knowledge engineering; Knowledge acquisition · CPC title

  • Complex mathematical operations {(function generation by table look-up G06F1/03; evaluation of elementary functions by calculation G06F7/544)} · CPC title

  • Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title

  • G06N5/02Primary

    Knowledge representation; Symbolic representation · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021103826A1 cover?
Complex computer system architectures are described for analyzing data elements of a knowledge graph, and predicting new facts from relational learning applied to the knowledge graph. This discovery process includes converting the knowledge graph into a set of candidate embeddings spaces to apply further analysis to rank the set of candidate embeddings spaces, where the top ranked candidate emb…
Who is the assignee on this patent?
Accenture Global Solutions Ltd
What technology area does this patent fall under?
Primary CPC classification G06F16/9024. Mapped technology areas include Physics.
When was this patent published?
Publication date Thu Apr 08 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).