Sorting documents according to comprehensibility scores determined for the documents
US-2024119078-A1 · Apr 11, 2024 · US
US11222052B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11222052-B2 |
| Application number | US-201916422674-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 24, 2019 |
| Priority date | Feb 22, 2011 |
| Publication date | Jan 11, 2022 |
| Grant date | Jan 11, 2022 |
Official abstract text for this publication.
Systems and techniques for determining relationships and association significance between entities are disclosed. The systems and techniques automatically identify supply chain relationships between companies based on unstructured text corpora. The system combines Machine Learning models to identify sentences mentioning supply chain between two companies (evidence), and an aggregation layer to take into account the evidence found and assign a confidence score to the relationship between companies.
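The abstract does not say how the aggregation layer turns sentence-level evidence into a relationship-level confidence score. As a minimal sketch, assuming each evidence sentence carries an independent classifier probability, a noisy-OR combination is one common choice. The function name `aggregate_confidence` and the example scores are hypothetical and do not come from the patent.

```python
from math import prod
from typing import Iterable


def aggregate_confidence(evidence_scores: Iterable[float]) -> float:
    """Combine per-sentence evidence probabilities with a noisy-OR.

    Assumption (not from the patent): each score is the classifier's
    probability that one sentence really expresses a supply chain
    relationship, and sentences are treated as independent.
    """
    scores = list(evidence_scores)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"evidence score out of range: {s}")
    # Probability that at least one piece of evidence is correct.
    return 1.0 - prod(1.0 - s for s in scores)


# Two moderately confident sentences reinforce each other.
confidence = aggregate_confidence([0.5, 0.5])
```

With a noisy-OR, every additional evidence sentence can only raise the score, and a pair with no evidence gets a confidence of 0.0. A real aggregation layer might instead weight sources or learn the combination.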
Opening claim text (preview).
What is claimed is:

1. A system including a central server connected to a plurality of remote devices over a communications network, the system comprising:
a directed graph data store comprising a plurality of directed graphs, each directed graph respectively related to an entity associated with a primary identifier, and including a first directed graph related to a first entity associated with a first primary identifier and comprising a set of relationship data and a first entity node representing the first entity;
an input connected to the communications network to receive a plurality of electronic documents comprising unstructured text;
a machine-learning classifier applying a machine learning-based algorithm to: extract, from the unstructured text of the electronic documents, a subject entity, a predicate relationship, and an object; and generate a triple comprising the subject entity, the predicate relationship, and the object;
a graph-based data model when executed by the central server configured to generate a second directed graph based in part on the triple and associate the first primary identifier with the subject entity, the second directed graph comprising a subject entity node representing the subject entity, a vertex representing the predicate relationship, and an object node representing the object; and
a semantic web toolkit comprising and applying one or more of a uniform resource identifier (“URI”) system and language, a resource description framework (“RDF”) system, an open world assumption language (“OWL”), a shapes constraint language (“SHACL”), a resource description framework schema (“RDFS”), and a SPARQL protocol and RDF query language (“SPARQL”) to compare the triple to the set of relationship data in the first directed graph and combine the subject entity node with the first entity node based on the first primary identifier and merge the first directed graph with the second directed graph to generate a third directed graph representing a data shape, and store the third directed graph in the directed graph data store.

2. The system of claim 1, wherein the graph-based data model is optimized based on one of content type, metadata information, or use case.

3. The system of claim 1, wherein each directed graph in the plurality of directed graphs is associated with a content set, the content set comprising a set of example data fields and a set of example relationships.

4. The system of claim 1, wherein the plurality of directed graphs comprises Knowledge Graphs.

5. The system of claim 1, further comprising a display module adapted to provide a user interface comprising the first directed graph, the second directed graph, and the third directed graph.

6. The system of claim 1, wherein the first primary identifier is one of a Uniform Resource Identifier or a PermID.

7. The system of claim 1, wherein the semantic web toolkit is a semantic web toolkit comprising a uniform resource identifier (“URI”) system and language, a resource description framework (“RDF”) system, an open world assumption language (“OWL”), and a shapes constraint language (“SHACL”).

8. The system of claim 7, wherein the OWL is adapted to identify an additional relationship between one or more of the subject entity, the predicate relationship, the object, the first entity, and the set of relationship data based on one or both of a relationship restriction and a relationship inverse.

9. The system of claim 8, wherein the SHACL is adapted to constrain the merging of the first directed graph and the second directed graph based on a set of defined constraints.

10. The system of claim 1, wherein the semantic web toolkit is a semantic web toolkit comprising a resource description framework schema (“RDFS”) and a SPARQL protocol and RDF query language (“SPARQL”).

11. A method of providing remote users connected to a central server over a communications network semantically-defined relationship information for a set of entities, the method comprising:
storing a plurality of directed graphs in a directed graph data store, each directed graph related to an entity identified by a primary identifier, and including a first directed graph related to a first entity associated with a first primary identifier and comprising a set of relationship data and a first entity node representing the first entity;
receiving via the communications network electronic documents comprising unstructured text;
extracting, by a machine-learning classifier applying a machine learning-based algorithm, a subject entity, a predicate relationship, and an object from the unstructured text of the received electronic documents;
generating, by the machine-learning classifier applying a machine learning-based algorithm, a triple comprising the subject entity, the predicate relationship, and the object;
generating, by a graph-based data model executed by the central server, a second directed graph based in part on the triple and comprising a subject entity node representing the subject entity, a vertex representing the predicate relationship, and an object node representing the object;
associating, by the graph-based data model executed by the central server, the first primary identifier with the subject entity;
merging, by a semantic web toolkit, the first directed graph with the second directed graph to generate a third directed graph, the merging based on comparing the triple to the set of relationship data in the first directed graph and combining the subject entity node with the first entity node based on the first primary identifier, wherein the third directed graph generated by the merging of the first directed graph and the second directed graph represents a data shape; and
storing the third directed graph in the directed graph data store.

12. The method of claim 11, wherein the graph-based data model is optimized based on one of content type, metadata information, or use case.

13. The method of claim 11, wherein each directed graph in the plurality of directed graphs is associated with a content set, the content set comprising a set of example data fields and a set of example relationships.

14. The method of claim 11, wherein the plurality of directed graphs comprises Knowledge Graphs.

15. The method of claim 11, further comprising providing, by a display module, a user interface comprising the first directed graph, the second directed graph, and the third directed graph.

16. The method of claim 11, wherein the first primary identifier is one of a Uniform Resource Identifier or a PermID.

17. The method of claim 16, wherein the semantic web toolkit is a semantic web toolkit comprising a uniform resource identifier (“URI”) system and language, a resource description framework (“RDF”) system, an open world assumption language (“OWL”), and a shapes constraint language (“SHACL”).

18. The method of claim 17, further comprising identifying, by the OWL, an additional relationship between one or more of the subject entity, the predicate relationship, the object, the first entity, and the set of relationship data based on one or both of a relationship restriction and a relationship inverse.

19. The method of claim 18, further comprising constraining, by the SHACL, the merging of the first directed graph and the second directed graph based on a set of defined constraints.

20. The method of claim 11, wherein the semantic web toolkit is a semantic web toolkit comprising a resource description framework schema (“RDFS”) and a SPARQL protocol and RDF query l
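The merge step recited in claims 1 and 11 (combining the subject entity node with the first entity node based on a shared primary identifier, then merging the two directed graphs) can be illustrated with a small sketch. This is a plain-Python approximation under stated assumptions, not the claimed semantic web toolkit: the `DirectedGraph` class, the `merge_graphs` function, the company names, and the identifier `permid:1` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class DirectedGraph:
    # node label -> primary identifier (hypothetical stand-in for a URI or PermID)
    ids: Dict[str, str] = field(default_factory=dict)
    # relationship data stored as subject-predicate-object triples
    triples: Set[Triple] = field(default_factory=set)


def merge_graphs(first: DirectedGraph, second: DirectedGraph) -> DirectedGraph:
    """Return a third graph in which nodes sharing a primary identifier are combined."""
    # Index the first graph's nodes by their primary identifier.
    node_by_id = {pid: node for node, pid in first.ids.items()}

    def canonical(node: str) -> str:
        # Map a node of the second graph onto the matching node of the first, if any.
        pid = second.ids.get(node)
        return node_by_id.get(pid, node) if pid is not None else node

    merged = DirectedGraph(dict(first.ids), set(first.triples))
    for node, pid in second.ids.items():
        merged.ids.setdefault(canonical(node), pid)
    for s, p, o in second.triples:
        # Set semantics drop triples already present in the first graph.
        merged.triples.add((canonical(s), p, canonical(o)))
    return merged


first = DirectedGraph({"Acme Corp": "permid:1"}, {("Acme Corp", "supplies", "Globex")})
second = DirectedGraph({"ACME Corporation": "permid:1"},
                       {("ACME Corporation", "supplies", "Initech")})
third = merge_graphs(first, second)
```

Here the two spellings of the same company collapse into one node because they share an identifier, so `third` holds both supply relationships under `Acme Corp`. The sketch does not model OWL inference or SHACL constraints, which the dependent claims assign to the toolkit.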
Clustering; Classification · CPC title
Query execution (filtering based on additional data G06F16/335) · CPC title
Probabilistic graphical models, e.g. probabilistic networks · CPC title
characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling · CPC title
Computing arrangements using knowledge-based models · CPC title