Machine learning-based relationship association and related discovery and

US11222052B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11222052-B2
Application number  US-201916422674-A
Country             US
Kind code           B2
Filing date         May 24, 2019
Priority date       Feb 22, 2011
Publication date    Jan 11, 2022
Grant date          Jan 11, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and techniques for determining relationships and association significance between entities are disclosed. The systems and techniques automatically identify supply chain relationships between companies based on unstructured text corpora. The system combines Machine Learning models to identify sentences mentioning supply chain between two companies (evidence), and an aggregation layer to take into account the evidence found and assign a confidence score to the relationship between companies.
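The abstract does not say how the aggregation layer combines evidence into a confidence score. As a minimal sketch, assume each evidence sentence carries a classifier probability and that the layer applies a noisy-OR combination; both are illustrative assumptions rather than the patent's stated method. The function name `aggregate_confidence` and the company names are hypothetical.

```python
from collections import defaultdict
from math import prod


def aggregate_confidence(evidence):
    """Combine per-sentence probabilities into one score per company pair.

    evidence: iterable of (supplier, customer, probability) tuples, where
    probability is an assumed classifier output for one evidence sentence.
    """
    by_pair = defaultdict(list)
    for supplier, customer, probability in evidence:
        by_pair[(supplier, customer)].append(probability)

    # Noisy-OR (assumed): the relationship is rejected only if every
    # evidence sentence is a false positive.
    return {
        pair: 1.0 - prod(1.0 - p for p in probabilities)
        for pair, probabilities in by_pair.items()
    }


# Hypothetical evidence: two sentences support the first pair, one the second.
evidence = [
    ("Acme Corp", "Globex Inc", 0.5),
    ("Acme Corp", "Globex Inc", 0.5),
    ("Initech", "Globex Inc", 0.25),
]
scores = aggregate_confidence(evidence)
```

Under this assumption, more independent evidence sentences raise the score for a pair, and a single weak sentence leaves the score at its own probability.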

First claim

Opening claim text (preview).

What is claimed is:

1. A system including a central server connected to a plurality of remote devices over a communications network, the system comprising: a directed graph data store comprising a plurality of directed graphs, each directed graph respectively related to an entity associated with a primary identifier, and including a first directed graph related to a first entity associated with a first primary identifier and comprising a set of relationship data and a first entity node representing the first entity; an input connected to the communications network to receive a plurality of electronic documents comprising unstructured text; a machine-learning classifier applying a machine learning-based algorithm to: extract, from the unstructured text of the electronic documents, a subject entity, a predicate relationship, and an object; and generate a triple comprising the subject entity, the predicate relationship, and the object; a graph-based data model when executed by the central server configured to generate a second directed graph based in part on the triple and associate the first primary identifier with the subject entity, the second directed graph comprising a subject entity node representing the subject entity, a vertex representing the predicate relationship, and an object node representing the object; and a semantic web toolkit comprising and applying one or more of a uniform resource identifier (“URI”) system and language, a resource description framework (“RDF”) system, an open world assumption language (“OWL”), a shapes constraint language (“SHACL”), a resource description framework schema (“RDFS”), and a SPARQL protocol and RDF query language (“SPARQL”) to compare the triple to the set of relationship data in the first directed graph and combine the subject entity node with the first entity node based on the first primary identifier and merge the first directed graph with the second directed graph to generate a third directed graph representing a data shape, and store the third directed graph in the directed graph data store.

2. The system of claim 1, wherein the graph-based data model is optimized based on one of content type, metadata information, or use case.

3. The system of claim 1, wherein each directed graph in the plurality of directed graphs is associated with a content set, the content set comprising a set of example data fields and a set of example relationships.

4. The system of claim 1, wherein the plurality of directed graphs comprises Knowledge Graphs.

5. The system of claim 1, further comprising a display module adapted to provide a user interface comprising the first directed graph, the second directed graph, and the third directed graph.

6. The system of claim 1, wherein the first primary identifier is one of a Uniform Resource Identifier or a PermID.

7. The system of claim 1, wherein the semantic web toolkit is a semantic web toolkit comprising a uniform resource identifier (“URI”) system and language, a resource description framework (“RDF”) system, an open world assumption language (“OWL”), and a shapes constraint language (“SHACL”).

8. The system of claim 7, wherein the OWL is adapted to identify an additional relationship between one or more of the subject entity, the predicate relationship, the object, the first entity, and the set of relationship data based on one or both of a relationship restriction and a relationship inverse.

9. The system of claim 8, wherein the SHACL is adapted to constrain the merging of the first directed graph and the second directed graph based on a set of defined constraints.

10. The system of claim 1, wherein the semantic web toolkit is a semantic web toolkit comprising a resource description framework schema (“RDFS”) and a SPARQL protocol and RDF query language (“SPARQL”).

11. A method of providing remote users connected to a central server over a communications network semantically-defined relationship information for a set of entities, the method comprising: storing a plurality of directed graphs in a directed graph data store, each directed graph related to an entity identified by a primary identifier, and including a first directed graph related to a first entity associated with a first primary identifier and comprising a set of relationship data and a first entity node representing the first entity; receiving via the communications network electronic documents comprising unstructured text; extracting, by a machine-learning classifier applying a machine learning-based algorithm, a subject entity, a predicate relationship, and an object from the unstructured text of the received electronic documents; generating, by the machine-learning classifier applying a machine learning-based algorithm, a triple comprising the subject entity, the predicate relationship, and the object; generating, by a graph-based data model executed by the central server, a second directed graph based in part on the triple and comprising a subject entity node representing the subject entity, a vertex representing the predicate relationship, and an object node representing the object; associating, by the graph-based data model executed by the central server, the first primary identifier with the subject entity; merging, by a semantic web toolkit, the first directed graph with the second directed graph to generate a third directed graph, the merging based on comparing the triple to the set of relationship data in the first directed graph and combining the subject entity node with the first entity node based on the first primary identifier, wherein the third directed graph generated by the merging of the first directed graph and the second directed graph represents a data shape; and storing the third directed graph in the directed graph data store.

12. The method of claim 11, wherein the graph-based data model is optimized based on one of content type, metadata information, or use case.

13. The method of claim 11, wherein each directed graph in the plurality of directed graphs is associated with a content set, the content set comprising a set of example data fields and a set of example relationships.

14. The method of claim 11, wherein the plurality of directed graphs comprises Knowledge Graphs.

15. The method of claim 11, further comprising providing, by a display module, a user interface comprising the first directed graph, the second directed graph, and the third directed graph.

16. The method of claim 11, wherein the first primary identifier is one of a Uniform Resource Identifier or a PermID.

17. The method of claim 16, wherein the semantic web toolkit is a semantic web toolkit comprising a uniform resource identifier (“URI”) system and language, a resource description framework (“RDF”) system, an open world assumption language (“OWL”), and a shapes constraint language (“SHACL”).

18. The method of claim 17, further comprising identifying, by the OWL, an additional relationship between one or more of the subject entity, the predicate relationship, the object, the first entity, and the set of relationship data based on one or both of a relationship restriction and a relationship inverse.

19. The method of claim 18, further comprising constraining, by the SHACL, the merging of the first directed graph and the second directed graph based on a set of defined constraints.

20. The method of claim 11, wherein the semantic web toolkit is a semantic web toolkit comprising a resource description framework schema (“RDFS”) and a SPARQL protocol and RDF query l
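The merge recited in claims 1 and 11 can be illustrated with a small sketch. It assumes plain Python containers instead of an RDF store, and it models the predicate as an edge label rather than a separate vertex, which is a simplification of the claim language. The class `DirectedGraph`, the helper names, and the identifiers such as `permid:1` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class DirectedGraph:
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: set = field(default_factory=set)    # (subject id, predicate, object id)


def graph_from_triple(triple, primary_id):
    """Build the 'second directed graph' from one extracted triple.

    The subject node is keyed by the primary identifier it was associated
    with, so that it can later be combined with an existing entity node.
    """
    subject, predicate, obj = triple
    graph = DirectedGraph()
    graph.nodes[primary_id] = {"label": subject}
    graph.nodes[obj] = {"label": obj}  # assumption: object keyed by its text
    graph.edges.add((primary_id, predicate, obj))
    return graph


def merge_graphs(first, second):
    """Merge two graphs, combining nodes that share an identifier.

    Returns the merged ('third') graph and the edges that were not already
    in the first graph's relationship data.
    """
    merged = DirectedGraph(
        nodes={node_id: dict(attrs) for node_id, attrs in first.nodes.items()},
        edges=set(first.edges),
    )
    for node_id, attrs in second.nodes.items():
        target = merged.nodes.setdefault(node_id, {})
        for key, value in attrs.items():
            target.setdefault(key, value)  # existing attributes take priority
    new_edges = second.edges - first.edges  # compare against known relationships
    merged.edges |= new_edges
    return merged, new_edges


# Hypothetical first graph for an entity identified as "permid:1".
first = DirectedGraph(
    nodes={"permid:1": {"label": "Acme Corp"}, "permid:2": {"label": "Globex Inc"}},
    edges={("permid:1", "supplies", "permid:2")},
)
second = graph_from_triple(("Acme", "supplies", "Initech"), "permid:1")
third, new_edges = merge_graphs(first, second)
```

Keying the subject node by the primary identifier is what lets the two graphs join on the same entity even when the extracted surface name ("Acme") differs from the stored label ("Acme Corp").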

Assignees

Inventors

Classifications

  • G06F16/35 · Primary

    Clustering; Classification · CPC title

  • G06F16/334 · Primary

    Query execution (filtering based on additional data G06F16/335) · CPC title

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling · CPC title

  • Computing arrangements using knowledge-based models · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11222052B2 cover?
Systems and techniques for determining relationships and association significance between entities are disclosed. The systems and techniques automatically identify supply chain relationships between companies based on unstructured text corpora. The system combines Machine Learning models to identify sentences mentioning supply chain between two companies (evidence), and an aggregation layer to …
Who is the assignee on this patent?
Refinitiv US Organization LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/35. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 11, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).