Ontology-based data storage for distributed knowledge bases

US12387112B2 · US · B2

Patent metadata
Publication number: US-12387112-B2
Application number: US-201916594391-A
Country: US
Kind code: B2
Filing date: Oct 7, 2019
Priority date: Oct 7, 2019
Publication date: Aug 12, 2025
Grant date: Aug 12, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for distributed data placement are provided. Query workload information corresponding to a domain is determined by a data orchestrator, and the query workload information is modeled as a hypergraph, where the hypergraph includes a set of vertices and a set of hyperedges, where each vertex in the set of vertices corresponds to a concept in an ontology associated with the domain. Mappings are generated between concepts and a plurality of data nodes based on the hypergraph and based further on predefined capability of each of the plurality of data nodes. A distributed knowledge base is established based on the generated mappings.
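The hypergraph modeling described in the abstract can be illustrated with a minimal sketch. It assumes each logged query is reduced to a pair (concepts accessed, operations performed); the function names, the concept names (Drug, Indication, Risk) and the operation labels are hypothetical and do not come from the patent text.

```python
from collections import defaultdict

def summarize_workload(queries):
    """Group queries that access the same set of concepts and
    union their operations into one aggregate set per group."""
    summarized = defaultdict(set)
    for concepts, operations in queries:
        summarized[frozenset(concepts)] |= set(operations)
    return dict(summarized)

def build_hypergraph(ontology_concepts, queries):
    """Return (vertices, hyperedges).

    vertices: one vertex per concept in the ontology.
    hyperedges: maps a frozenset of concepts (the vertices the
    hyperedge connects) to its label, the aggregate operation set.
    """
    vertices = set(ontology_concepts)
    hyperedges = summarize_workload(queries)
    return vertices, hyperedges

# Hypothetical workload: two queries touch the same concepts,
# so they collapse into a single labelled hyperedge.
queries = [
    ({"Drug", "Indication"}, {"join"}),
    ({"Indication", "Drug"}, {"aggregate"}),
    ({"Drug"}, {"fuzzy_match"}),
]
vertices, hyperedges = build_hypergraph({"Drug", "Indication", "Risk"}, queries)
```

A concept that no query touches (Risk here) still gets a vertex but belongs to no hyperedge, so this sketch alone says nothing about where it should be placed.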

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: determining, by a data orchestrator, query workload information corresponding to a domain, the query workload information comprising ontological queries specifying respective concepts in an ontology associated with the domain and respective operations performed by the ontological queries, the ontological queries comprising a first group with corresponding matching concepts that are accessed, the first group performing an aggregate set of operations; modeling the query workload information as a hypergraph, wherein the hypergraph includes a plurality of vertices and a plurality of hyperedges, wherein each respective vertex in the plurality of vertices corresponds to a respective concept in the ontology and each respective hyperedge in the plurality of hyperedges indicates a respective set of operations applied to concepts associated with the respective hyperedge, wherein a first hyperedge of the plurality of hyperedges is labelled with the aggregate set of operations and connects first vertices representing the matching concepts; generating mappings between concepts in the ontology and a plurality of data stores based on the hypergraph, wherein: the plurality of data stores are distributed over multiple data sites and communicate with each other via data transmissions sent over a network, and the mappings indicate, for each respective concept in the ontology, a respective subset of the plurality of data stores on which the respective concept is to be stored; and establishing a distributed knowledge base based on the generated mappings, comprising: storing data associated with concepts of the ontology among the plurality of data stores based on the generated mappings; storing data associated with a first concept on a first data store of the plurality of data stores; and storing data associated with a second concept in the ontology on a second data store of the plurality of data stores, wherein the first data store does not store the data associated with the second concept and the second data store does not store the data associated with the first concept.

2. The method of claim 1, wherein determining the query workload information comprises: receiving a set of prior ontological queries; generating a first set of concepts accessed by a first query in the set of prior ontological queries; generating a first set of operations performed by the first query; and generating a first summarized query by: identifying the first group of queries, from the set of prior ontological queries, with corresponding matching first sets; determining, based on corresponding sets of operations for each query in the first group of queries, the aggregate set of operations; and associating the first summarized query with the aggregate set of operations and concepts reflected in the corresponding matching first sets.

3. The method of claim 2, wherein modeling the query workload information as a hypergraph comprises: creating a vertex for each concept in the ontology; creating the first hyperedge for the first summarized query, wherein the first hyperedge connects a first set of vertices in the hypergraph, wherein the first set of vertices corresponds to the concepts reflected in the matching first sets; and labeling the first hyperedge with the aggregate set of operations.

4. The method of claim 1, wherein generating the mappings comprises: creating a first cluster for a first operation included in the hypergraph; identifying a first set of concepts connected by a first hyperedge in the hypergraph; identifying a first set of operations indicated by the first hyperedge; and upon determining that the first set of operations includes the first operation, assigning the first set of concepts to the first cluster.

5. The method of claim 4, wherein generating the mappings further comprises mapping the first set of concepts to one or more data stores by: identifying a set of data stores capable of performing the first operation; and mapping each concept in the first set of concepts to each data store in the identified set of data stores.

6. The method of claim 1, wherein generating the mappings comprises: identifying a first set of concepts connected by a first hyperedge in the hypergraph; identifying a first set of operations indicated by the first hyperedge; determining a minimum set of data stores capable of collectively performing the first set of operations; generating a cluster including the first set of concepts; and labeling the cluster with the minimum set of data stores.

7. The method of claim 6, wherein generating the mappings further comprises mapping each concept in the first set of concepts to each data store in the minimum set of data stores.

8. The method of claim 1, wherein establishing the distributed knowledge base comprises, for each respective concept in the ontology: identifying a respective data store indicated by the mappings; identifying data corresponding to the respective concept; and facilitating storage of the identified data in the respective data store.

9. A computer-readable storage medium containing computer program code that, when executed by operation of one or more computer processors, performs an operation comprising: determining, by a data orchestrator, query workload information corresponding to a domain, the query workload information comprising at least a first query specifying a first concept in an ontology associated with the domain and a first operation performed on the first concept based on the first query; modeling the query workload information as a hypergraph, wherein the hypergraph includes a plurality of vertices and a plurality of hyperedges, wherein each respective vertex in the plurality of vertices corresponds to a respective concept in the ontology and each respective hyperedge in the plurality of hyperedges indicates a respective set of operations applied to concepts associated with the respective hyperedge; generating mappings between concepts in the ontology and a plurality of data stores based on the hypergraph, wherein: the plurality of data stores are distributed over multiple data sites and communicate with each other via data transmissions sent over a network, and the mappings indicate, for each respective concept in the ontology, a respective subset of the plurality of data stores on which the respective concept is to be stored; and establishing a distributed knowledge base based on the generated mappings, comprising: storing data associated with concepts of the ontology among the plurality of data stores based on the generated mappings; storing data associated with the first concept on a first data store of the plurality of data stores; storing data associated with a second concept in the ontology on a second data store of the plurality of data stores, wherein the first data store does not store the data associated with the second concept and the second data store does not store the data associated with the first concept; identifying a first set of concepts connected by a first hyperedge in the hypergraph; identifying a first set of operations indicated by the first hyperedge, the first set of operations comprising a first operation; identifying a set of the data stores capable of performing the first operation; and mapping each concept in the first set of concepts to data stores in the identified set of data stores.

10. The computer-readable storage medium of claim 9, wherein determining the query workload information comprises: receiving a set of prior ontological queries; generating a first set of concepts accessed by a first query in the set of prior ontological quer
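The two mapping strategies recited in claims 4-5 (one cluster per operation) and claims 6-7 (a minimum set of data stores per hyperedge) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the hypergraph is assumed to be a dict from a frozenset of concepts to a set of operations, store capabilities are assumed to be a dict from store name to supported operations, and the minimum set is found by brute-force search. All function names, store names and operation labels are hypothetical.

```python
from itertools import combinations

def operation_based_mapping(hyperedges, store_capabilities):
    """Claims 4-5 style: build one cluster per operation, then map each
    concept in a cluster to every store that can perform that operation."""
    clusters = {}
    for concepts, operations in hyperedges.items():
        for op in operations:
            clusters.setdefault(op, set()).update(concepts)
    mapping = {}
    for op, concepts in clusters.items():
        capable = {s for s, caps in store_capabilities.items() if op in caps}
        for concept in concepts:
            mapping.setdefault(concept, set()).update(capable)
    return mapping

def minimum_store_set(operations, store_capabilities):
    """Smallest set of stores that collectively supports all operations.
    Brute force over subsets; returns None if no cover exists."""
    stores = sorted(store_capabilities)
    for size in range(1, len(stores) + 1):
        for combo in combinations(stores, size):
            covered = set().union(*(store_capabilities[s] for s in combo))
            if operations <= covered:
                return set(combo)
    return None

def minimal_cover_mapping(hyperedges, store_capabilities):
    """Claims 6-7 style: treat the concepts of each hyperedge as a cluster,
    label it with a minimum store set, and map each concept to those stores."""
    mapping = {}
    for concepts, operations in hyperedges.items():
        stores = minimum_store_set(operations, store_capabilities)
        if stores is None:
            continue  # no combination of stores supports this hyperedge
        for concept in concepts:
            mapping.setdefault(concept, set()).update(stores)
    return mapping

# Hypothetical hypergraph and store capabilities.
hyperedges = {
    frozenset({"Drug", "Indication"}): {"join", "aggregate"},
    frozenset({"Drug"}): {"fuzzy_match"},
}
store_capabilities = {
    "relational": {"join", "aggregate"},
    "document": {"fuzzy_match"},
    "graph": {"join"},
}
by_operation = operation_based_mapping(hyperedges, store_capabilities)
by_min_cover = minimal_cover_mapping(hyperedges, store_capabilities)
```

In this example the operation-based strategy replicates Drug and Indication onto the graph store as well, because it can perform joins, while the minimum-cover strategy places them only on the relational store. The brute-force search is exponential in the number of stores and is used here only to keep the sketch short.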

Assignees

Inventors

Classifications

  • based on graph theory, e.g. minimum spanning trees [MST] or graph cuts

  • Filtering based on additional data, e.g. user or group profiles

  • Presentation of query results

  • organised in groups of units sharing resources, e.g. clusters

  • Graphs; Linked lists (G06F16/9027 takes precedence)

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12387112B2 cover?
Techniques for distributed data placement are provided. Query workload information corresponding to a domain is determined by a data orchestrator, and the query workload information is modeled as a hypergraph, where the hypergraph includes a set of vertices and a set of hyperedges, where each vertex in the set of vertices corresponds to a concept in an ontology associated with the domain. Mappi…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N5/022. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 12, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).