Hierarchical ontology matching with self-supervision

US12242803B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12242803-B2
Application number  US-202217853310-A
Country             US
Kind code           B2
Filing date         Jun 29, 2022
Priority date       Jun 29, 2022
Publication date    Mar 4, 2025
Grant date          Mar 4, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An ontology matching system performs operations to refine a natural language processing (NLP) model that encodes terms of a first hierarchical ontology and of a second hierarchical ontology as embeddings in a latent space. The operations include performing at least a first round of triplet loss training to decrease separation between select pairs of the embeddings sampled from the different ontologies that satisfy a first hierarchical relation while increasing separation between other pairs of the embeddings that do not satisfy the first hierarchical relation. The system then determines, from the refined NLP model, a stable matching scheme that matches each term in the first hierarchical ontology with a corresponding term of the second hierarchical ontology. Responsive to receiving terms of the first hierarchical ontology from an application, the system uses the stable matching scheme to map each of the terms to corresponding terms of the second hierarchical ontology.
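The triplet loss objective named in the abstract can be illustrated with a minimal sketch. The abstract does not state a distance metric or a margin, so Euclidean distance and a margin of 1.0 are assumptions here, and the function names `euclidean` and `triplet_loss` are hypothetical. The loss is zero once the anchor is closer to the positive than to the negative by at least the margin.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    Minimizing it pulls the anchor toward the positive embedding and
    pushes it away from the negative embedding. The margin value is an
    assumption, not taken from the patent.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Positive already much closer than negative: no penalty.
loss_satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# Positive farther than negative: penalty grows with the violation.
loss_violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

In a real training loop the embeddings would come from the NLP model and the loss would be backpropagated through it; this sketch only shows the quantity being minimized.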

First claim

Opening claim text (preview).

What is claimed is:

1. A processor-implemented method for mapping terms across different ontologies, the method comprising: refining a natural language processing (NLP) model that encodes terms of a first hierarchical ontology and of a second hierarchical ontology as embeddings in a vector space in which spatial proximity between the embeddings is correlated with similarity between the associated terms, the refining including a first round of triplet loss training effective to decrease a separation between select pairs of the embeddings sampled from different ontologies that satisfy a first hierarchical relation while increasing separation between other pairs of the embeddings that do not satisfy the first hierarchical relation; determining, from the NLP model, a stable matching scheme that matches each term in the first hierarchical ontology with a corresponding term of the second hierarchical ontology; receiving a group of terms of the first hierarchical ontology; and mapping, based on the stable matching scheme, each term in the group of terms of the first hierarchical ontology to its associated corresponding term of the second hierarchical ontology; and returning the corresponding terms from the second hierarchical ontology.

2. The processor-implemented method of claim 1, wherein determining the stable matching scheme further comprises: constructing a bipartite graph with nodes and edges connecting respective pairs of the nodes, the nodes being the embeddings of the first hierarchical ontology and the second hierarchical ontology and each edge of the edges having an edge weight representing a similarity between respective nodes forming endpoints of the edge; determining the stable matching scheme for the nodes in the bipartite graph.

3. The method of claim 2, wherein the bipartite graph includes edges connecting each node of the first hierarchical ontology to one or more nodes of the second hierarchical ontology, and wherein an edge weight of each edge in the bipartite graph is determined by computing a similarity metric with respect to embeddings corresponding to nodes coupled to the edge.

4. The processor-implemented method of claim 1, wherein the group of terms of the first hierarchical ontology are associated with metadata of a digital content file and wherein the corresponding terms from the second hierarchical ontology are added to the metadata of the digital content file.

5. The method of claim 1, wherein performing the first round of triplet loss training decreases separation between select pairs of the embeddings sampled from different ontologies that share a like-named child node while increasing separation between other pairs of the embeddings that do not share a like-named child node.

6. The method of claim 1, wherein performing the first round of triplet loss training includes selecting triplets from the embeddings of the NLP model that each identify an anchor node and a positive node selected from different hierarchical ontologies, the anchor node and the positive node sharing a like-named child node that is not shared with a negative node of the triplet.

7. The method of claim 1, further comprising: performing a second round of triplet loss training to further refine the NLP model, wherein triplets selected for the second round of triplet loss training each include an anchor node and a positive node that share a parent node within a same hierarchical ontology and further include a negative node that is selected from the same hierarchical ontology, wherein the negative node does not share a parent node with either the anchor node or the positive node.

8. The method of claim 1, wherein determining the stable matching scheme further comprises executing a stable marriage algorithm.

9. A system comprising: a processing system; memory; a natural language processing (NLP) model refinement tool stored in the memory and executable by the processing system to refine an NLP model that encodes terms of a first hierarchical ontology and of a second hierarchical ontology as embeddings in a vector space in which distances between the embeddings correlate with similarity between the associated terms, the refining of the NLP model including at least: performing a first round of triplet loss training effective to decrease separation between select pairs of the embeddings sampled from different ontologies that satisfy a first hierarchical relation while increasing separation between other pairs of the embeddings that do not satisfy the first hierarchical relation; a stable match identifier stored in the memory and executable by the processing system that determines a stable matching scheme for the embeddings of the refined NLP model, the stable matching scheme being a scheme that matches each term in the first hierarchical ontology with a corresponding term of the second hierarchical ontology; and an ontology translation engine stored in the memory and executable by the processing system to: receive terms of the first hierarchical ontology from an application; use the stable matching scheme to map each of the terms of the first hierarchical ontology to its associated corresponding term of the second hierarchical ontology; and returning the corresponding terms from the second hierarchical ontology to the application.

10. The system of claim 9, wherein the stable match identifier is further executable to: construct a bipartite graph with nodes and edges connecting respective pairs of the nodes, the nodes being the embeddings of the first hierarchical ontology and the second hierarchical ontology and each edge of the edges having an edge weight representing a similarity between respective nodes forming endpoints of the edge; and determine the stable matching scheme for the nodes in the bipartite graph.

11. The system of claim 10, wherein the stable match identifier executes a stable marriage algorithm to identify the stable matching scheme.

12. The system of claim 10, wherein the bipartite graph includes edges connecting each of the nodes of the first hierarchical ontology to one or more nodes of the second hierarchical ontology, and wherein an edge weight of each edge in the bipartite graph is determined by computing a similarity metric with respect to embeddings corresponding to nodes coupled to the edge.

13. The system of claim 9, wherein the terms of the first hierarchical ontology are associated with metadata of a digital content file and wherein the corresponding terms from the second hierarchical ontology are added to the metadata of the digital content file by the application.

14. The system of claim 9, wherein performing the first round of triplet loss training includes selecting triplets from the embeddings of the NLP model that each identify an anchor node and a positive node selected from different hierarchical ontologies, the anchor node and the positive node sharing a like-named child node that is not shared with a negative node of the triplet.

15. The system of claim 9, wherein the NLP model refinement tool is further executable to: performing a second round of triplet loss training to further refine the NLP model, wherein triplets selected for the second round of triplet loss training each include an anchor node and a positive node that share a parent node within a same hierarchical ontology and further include a negative node that is selected from the same hierarchical ontology, wherein the negative node that does not share a parent node with either the anchor node or the positive node.

16. A tangible computer-readable storage media encoding computer exec
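The bipartite-graph and stable-marriage steps recited in claims 2, 8, 10 and 11 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: cosine similarity is assumed as the edge weight (the claims only say "a similarity metric"), the Gale-Shapley proposal procedure is used as one well-known stable marriage algorithm, and the function names and toy embeddings are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity, used here as the assumed edge weight."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def stable_match(emb_a, emb_b):
    """Match terms of ontology A to terms of ontology B (Gale-Shapley).

    emb_a, emb_b: dict mapping term -> embedding vector.
    Returns a dict mapping each matched A term to its B term.
    """
    # Complete bipartite graph: one weighted edge per (A term, B term) pair.
    weight = {(a, b): cosine(va, vb)
              for a, va in emb_a.items() for b, vb in emb_b.items()}
    # Each A term ranks B terms from most to least similar.
    prefs = {a: sorted(emb_b, key=lambda b: -weight[(a, b)]) for a in emb_a}
    next_choice = {a: 0 for a in emb_a}   # index of the next B term to propose to
    engaged = {}                          # B term -> currently accepted A term
    free = list(emb_a)
    while free:
        a = free.pop()
        if next_choice[a] >= len(prefs[a]):
            continue                      # a has no B terms left to propose to
        b = prefs[a][next_choice[a]]
        next_choice[a] += 1
        current = engaged.get(b)
        if current is None:
            engaged[b] = a                # b was unmatched: accept
        elif weight[(a, b)] > weight[(current, b)]:
            engaged[b] = a                # b prefers the new proposer
            free.append(current)
        else:
            free.append(a)                # b keeps its current partner
    return {a: b for b, a in engaged.items()}

# Hypothetical two-dimensional embeddings for two tiny ontologies.
emb_a = {"Automobile": [1.0, 0.1], "Fruit": [0.1, 1.0]}
emb_b = {"Car": [0.9, 0.2], "Produce": [0.2, 0.9]}
matching = stable_match(emb_a, emb_b)
```

With these toy vectors each term's most similar counterpart is mutual, so the matching pairs "Automobile" with "Car" and "Fruit" with "Produce". Once computed, the resulting dictionary can serve as a lookup table for translating incoming terms, which mirrors the mapping step of claim 1.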

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

  • Creation of semantic tools, e.g. ontology or thesauri · CPC title

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • using natural language analysis · CPC title

  • Selection or weighting of terms for indexing · CPC title

  • G06F16/367 · Primary

    Ontology · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12242803B2 cover?
An ontology matching system performs operations to refine a natural language processing (NLP) model that encodes terms of a first hierarchical ontology and of a second hierarchical ontology as embeddings in a latent space. The operations include performing at least a first round of triplet loss training to decrease separation between select pairs of the embeddings sampled from the different ont…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/367. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 4, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).