Relation graph optimization using inconsistent cycle detection

US10885452B1 · US · B1

Patent metadata
FieldValue
Publication numberUS-10885452-B1
Application numberUS-201615193883-A
CountryUS
Kind codeB1
Filing dateJun 27, 2016
Priority dateJun 27, 2016
Publication dateJan 5, 2021
Grant dateJan 5, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A first graph is generated from a text data set, with graph nodes representing named entities in the data set and edges representing relationships between the named entities, and with edge weights indicating confidence levels. At least one cycle of the graph may be designated as inconsistent using a rule set. An edge may be selected for deletion from the first graph based on its presence in an inconsistent cycle, the cycle's weight, and/or on the edge weight. A representation of relationships indicated in the modified graph is provided programmatically.

First claim

Opening claim text (preview).

What is claimed is: 1. A system, comprising: one or more computing devices of a network-accessible text analysis service; wherein the one or more computing devices are configured to: determine that relationship analysis is to performed on a data set comprising a plurality of text collections, wherein a particular text collection of the plurality of text collections includes (a) one or more tokens indicating a respective named entity and (b) one or more indicators of a respective relationship between one named entity and another named entity; generate, using the data set, a first graph comprising a plurality of nodes and a plurality of edges, wherein an individual node corresponds to a named entity identified from the plurality of text collections, wherein an individual edge indicates a relationship between a pair of named entities corresponding to a pair of nodes of the first graph linked by the edge, and wherein individual edges of the plurality of edges of the first graph, including a first edge, have a respective weight indicative of a confidence level associated with the relationship between the respective pair of named entities corresponding to the respective pair of nodes of the first graph linked by the respective edge; designate, based at least in part on a consistency rule set, at least a first cycle of the first graph as an inconsistent cycle, wherein the first cycle includes the first edge and a second edge; prune one or more edges, including the first edge, from the first graph to obtain a second graph, wherein the first edge is selected for pruning from the first graph based at least in part on inclusion of the first edge in the first cycle that was designated as the inconsistent cycle, and on the weight of the first edge indicative of the confidence level associated with the relationship between the pair of named entities corresponding to the pair of nodes of the first graph linked by the first edge; and provide, via a programmatic interface, a representation of one or more relationships indicated in the second graph, including a particular relationship corresponding to the second edge. 2. The system as recited in claim 1 , wherein individual ones of the text collections comprise respective sentences in an e-book, wherein the one or more computing devices are configured to: receive an indication via another programmatic interface that a reader of the e-book has (a) read a particular section of the e-book and (b) has requested supplementary information about the e-book; and provide the representation of one or more relationships as part of the supplementary information. 3. The system as recited in claim 1 , wherein the one or more computing devices are configured to: select the first edge for pruning using one or more of: (a) a belief propagation algorithm (b) a greedy edge retention algorithm. 4. The system as recited in claim 1 , wherein the one or more computing devices are configured to: receive, via the programmatic interface, a request from a client to analyze the data set, wherein the request indicates a source of the data set. 5. The system as recited in claim 1 , wherein the one or more computing devices are configured to: determine, using one or more of: (a) a multi-class logistic regression model, (b) a binary logic regression model, or (c) a neural network model, that a first text collection of the plurality of text collections indicates a first relationship with a first probability; and compute the weight of the first edge based at least in part on the first probability. 6. A method, comprising: performing, by one or more computing devices: generating, using a data set comprising a plurality of text collections, a first graph comprising a plurality of nodes and a plurality of edges, wherein an individual node corresponds to a named entity identified from the data set, wherein an individual edge indicates a relationship between a pair of named entities corresponding to a pair of nodes of the first graph linked by the edge, and wherein individual edges of the plurality of edges of the first graph, including a first edge, have a respective weight indicative of a confidence level associated with the relationship between the respective pair of named entities corresponding to the respective pair of nodes of the first graph linked by the respective edge; designating, based at least in part on a consistency rule set, at least a first cycle of the first graph as an inconsistent cycle, wherein the first cycle includes the first edge; pruning one or more edges, including the first edge, from the first graph to obtain a second graph, wherein the first edge is selected for pruning from the first graph based at least in part on inclusion of the first edge in the first cycle that was designated as the inconsistent cycle, and on the weight of the first edge indicative of the confidence level associated with the relationship between the pair of named entities corresponding to the pair of nodes of the first graph linked by the first edge; and providing, via a programmatic interface, a representation of one or more relationships indicated in the second graph. 7. The method as recited in claim 6 , wherein individual ones of the text collections comprise respective sentences in an e-book, further comprising: receiving, at the one or more computing devices, an indication via a programmatic interface that a reader of the e-book has (a) read a particular section of the e-book and (b) has requested supplementary information about the e-book; and providing, by the one or more computing devices, the representation of one or more relationships as part of the supplementary information. 8. The method as recited in claim 7 , further comprising: determining, at the one or more computing devices, that the reader of the e-book has read an additional section of the e-book; and providing, by the one or more computing devices, a second representation of relationships indicated in the e-book, wherein the second representation indicates a particular relationship determined from an analysis of the additional section of the e-book, and wherein the particular relationship is not included in the supplementary information provided to the reader before the second section was read by the reader. 9. The method as recited in claim 6 , wherein selecting the first edge for pruning comprises executing one or more of: (a) a belief propagation algorithm (b) a greedy edge retention algorithm. 10. The method as recited in claim 6 , further comprising performing, by the one or more computing devices: receiving, via a programmatic interface, a request from a client to analyze the data set, wherein the request indicates a source of the data set. 11. The method as recited in claim 6 , further comprising performing, by the one or more computing devices: receiving, via a programmatic interface, respective indicators to be used to detect individual ones of a plurality of relationship categories in the data set. 12. The method as recited in claim 6 , further comprising performing, by the one or more computing devices: receiving, via a programmatic interface, the consistency rule set. 13. The method as recited in claim 6 , wherein the first edge is selected for pruning based at least in part on an edge retention tradeoff parameter indicative of a relative priority assigned to edge weights with respect to inconsistency. 14. The method as recited in claim 6 , wherein said generating the first graph comprises: identifying a set of relationship categories to be used to analyze the data set; determining that a first text collection of the data s

Assignees

Inventors

Classifications

  • G06N7/01Primary

    Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • Knowledge engineering; Knowledge acquisition · CPC title

  • Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title

  • Document management systems · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10885452B1 cover?
A first graph is generated from a text data set, with graph nodes representing named entities in the data set and edges representing relationships between the named entities, and with edge weights indicating confidence levels. At least one cycle of the graph may be designated as inconsistent using a rule set. An edge may be selected for deletion from the first graph based on its presence in an …
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06N7/01. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jan 05 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).