Cross-domain structural mapping in machine learning processing

US12579111B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12579111-B2
Application number  US-202017139190-A
Country             US
Kind code           B2
Filing date         Dec 31, 2020
Priority date       Dec 31, 2020
Publication date    Mar 17, 2026
Grant date          Mar 17, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of using a computing device executing to interrelate two or more corpuses of dissimilar data that includes receiving input data from each of two or more corpuses of dissimilar data. The computing device computes a pass for each of the input data into two or more encoder-decoder models. The computing device further obtains a prediction of an identity mapping for each of different domains of knowledge from each of the two or more encoder-decoder models. The computing device additionally computes a distribution distance metric as an output from each of a low-dimensional embedding vector representation from each of the two or more encoder-decoder models. The computing device still further computes a function based on each of the predictions from each of the two or more encoder-decoder models and the distribution distance metrics. The computing device additionally updates the two or more encoder-decoder models.
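
The function the abstract describes, which combines each model's prediction with a distribution distance metric, can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes mean squared error as the reconstruction term, takes the distance as an already-computed number, and introduces a hypothetical `weight` coefficient that does not appear in the source.

```python
from math import fsum


def reconstruction_loss(x, x_hat):
    """Mean squared error between an input vector and its reconstruction.

    Assumption: the abstract's "identity mapping" is read here as an
    autoencoder-style target, where the prediction should equal the input.
    """
    return fsum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def joint_loss(inputs, reconstructions, distance, weight=1.0):
    """Sum of per-domain reconstruction losses plus a weighted distance term.

    inputs, reconstructions: one vector per domain model.
    distance: a precomputed distribution distance between the embeddings.
    weight: hypothetical trade-off coefficient (not taken from the patent).
    """
    recon = fsum(
        reconstruction_loss(x, x_hat)
        for x, x_hat in zip(inputs, reconstructions)
    )
    return recon + weight * distance
```

With two domains, where the first is reconstructed perfectly and the second has a per-element squared error of 1, `joint_loss([[1.0, 2.0], [0.0, 0.0]], [[1.0, 2.0], [1.0, 1.0]], 0.5, weight=2.0)` adds a reconstruction total of 1.0 to a weighted distance of 1.0. In a real training loop the gradient of this value with respect to each model's parameters would drive the update step.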

First claim

Opening claim text (preview).

What is claimed is:

1. A method of using a computing device executing to interrelate two or more domain corpuses of dissimilar data, the method comprising: receiving input data from each of two or more domain corpuses of dissimilar data; computing, by the computing device, a pass for each of the input data; training, based on the pass for each of the input data, two or more encoder-decoder models selected from convolutional neural networks, recurrent neural networks, long short-term memory networks, self-attention networks; obtaining, by the computing device, a prediction of an identity mapping for each of different domains of knowledge from each of the two or more encoder-decoder models; computing, but the computing device, a pairwise mean relative living time (MRLT) distribution distance metric based on persistence intervals of simplicial complexes derived from low-dimensional embedding vector representations from each of the two or more encoder models; computing, by the computing device, a joint loss function based on each of the predictions from each of the two or more encoder-decoder models and the distribution distance metrics; updating, by the computing device, based on results of the joint loss function, the training of the two or more encoder-decoder models, wherein the training comprises using backpropagation to minimize a reconstruction loss; determining, using an epsilon parameter, a ball radius to determine a representation of the simplicial complexes of an intra-domain relationship to perform mappings across inter-domain relationships; learning, by the two or more encoder-decoder models, how to reconstruct the data from encoded representations based on a number of holes and connected components relative to the ball radius; and determining, using the updated two or more encoder-decoder models and the ball radius, a cross-domain structural mapping interrelating dissimilar data of the two or more encoder-decoder models, wherein the cross-domain structural mapping improves computational efficiency in reconciling heterogeneous corpuses by reducing reconstruction error and preserving topological representations across domains.

2. The method of claim 1, further comprising: computing, by the computing device, a corresponding reconstruction loss for each of the two or more encoder-decoder models using the respective prediction and the input data from each of the two or more domain corpuses of dissimilar data; and extracting, by the computing device, a low-dimensional embedding vector of input data representations from each of the two or more encoder-decoder model.

3. The method of claim 2, wherein the distribution distance metric is a pairwise mean relative living times (MRLT) distribution distance metric, and the function is a joint loss function.

4. The method of claim 3, further comprising: computing, by the computing device, a gradient of a loss from the joint loss function with respect to model parameters for each of the two or more encoder-decoder models.

5. The method in claim 4, further comprising: initializing, by the computing device, weights for each of the two or more encoder-decoder models; performing, by the computing device, preprocessing, transforming and extracting of the input data into a fixed-dimension feature vector; performing, by the computing device, feed forward processing for a feedforward pass for each intra-domain sample of the input data into each respective one of the two or more encoder-decoder models; generating, by the computing device, corresponding output predictions for each of the intra-domain samples of the input data using each respective one of the two or more encoder-decoder models; and computing, by the computing device, a corresponding loss value with respect to the joint loss function for each of the two or more encoder-decoder models given the intra-domain samples of the input data and the corresponding output predictions.

6. The method in claim 4, further comprising: computing, by the computing device, the pairwise MRLT distribution distance metrics based on a first relative living times (RLT) matrix and a second RLT matrix between each intra-domain samples of the input data and based on using a distance of a distribution the two relative living time metrics defined between the first RLT matrix and the second RLT matrix.

7. The method in claim 4, further comprising: computing, by the computing device, the pairwise MRLT distribution distance metrics based on a first relative living times (RLT) matrix and a second RLT matrix between each intra-domain samples of the input data and based on using a squared loss function between output of the first RLT matrix and the second RLT matrix.

8. The method of claim 4, further comprising: computing, by the computing device, the pairwise MRLT distribution distance metrics based on a first relative living times (RLT) matrix and a second RLT matrix between each intra-domain samples of the input data and based on using a Wasserstein distance determination of a distribution the first RLT matrix and the second RLT matrix.

9. The method of claim 1, wherein the two or more domain corpuses of dissimilar data comprise text, images, audio, and other data sources in different domains of knowledge.

10. A computer program product for interrelating two or more domain corpuses of dissimilar data, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: Receive, by the processor, input data from each of two or more domain corpuses of dissimilar data; compute, by the processor, a pass for each of the input data; train, based on the pass for each of the input data, two or more encoder-decoder models selected from convolutional neural networks, recurrent neural networks, long short-term memory networks, self-attention networks; obtain, by the processor, a prediction of an identity mapping for each of different domains of knowledge from each of the two or more encoder-decoder models; compute, by the processor, a pairwise mean relative living time (MRLT) distribution distance metric based on persistence intervals of simplicial complexes derived from low-dimensional embedding vector representations from each of the two or more encoder models; compute, by the processor, a joint loss function based on each of the predictions from each of the two or more encoder-decoder models and the distribution distance metrics; update, by the processor based on results of the joint loss function, the training of the two or more encoder-decoder models, wherein the training comprises using backpropagation to minimize a reconstruction loss; determine, using an epsilon parameter, a ball radius to determine a representation of the simplicial complexes of an intra-domain relationship to perform mappings across inter-domain relationships; learn, by the two or more encoder-decoder models, how to reconstruct the data from encoded representations based on a number of holes and connected components relative to the ball radius; and determine, using the updated two or more encoder-decoder models and the ball radius, a cross-domain structural mapping interrelating dissimilar data of the two or more encoder-decoder models, wherein the cross-domain structural mapping improves computational efficiency in reconciling heterogeneous corpuses by reducing reconstruction error and preserving topological representations across domains.

11. The computer program product of claim 10, wherein: the program instructions executable by the processor further ca
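
The claims refer to connected components relative to a ball radius, relative living times (RLT), and squared-loss or Wasserstein comparisons between RLT outputs. The sketch below illustrates those ideas in a deliberately simplified form and should not be read as the claimed MRLT metric: it tracks only connected components (not holes), treats epsilon as a pairwise distance threshold (a Vietoris-Rips style convention, assumed here), skips the averaging over samples that "mean" implies, and uses hypothetical function names throughout.

```python
from itertools import combinations
from math import dist


def component_death_radii(points):
    """Thresholds at which connected components merge as epsilon grows.

    Assumption: two points are linked once their distance is <= epsilon.
    Sorting edges by length and merging with union-find (Kruskal's method)
    gives each merge threshold in ascending order.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths  # len(points) - 1 entries


def count_components(points, epsilon):
    """Number of connected components at a given threshold epsilon."""
    merged = sum(1 for d in component_death_radii(points) if d <= epsilon)
    return len(points) - merged


def relative_living_times(points, eps_max):
    """Fraction of [0, eps_max] during which exactly k components exist.

    Returns a list whose entry k - 1 is the fraction for k components.
    """
    n = len(points)
    deaths = [min(d, eps_max) for d in component_death_radii(points)]
    bounds = [0.0] + deaths + [eps_max]
    rlt = [0.0] * (n + 1)  # index = component count; index 0 stays unused
    for m in range(n):
        # After m merges, n - m components are alive on this interval.
        rlt[n - m] += (bounds[m + 1] - bounds[m]) / eps_max
    return rlt[1:]


def squared_distance(rlt_a, rlt_b):
    """Sum of squared differences between two equal-length RLT vectors."""
    return sum((a - b) ** 2 for a, b in zip(rlt_a, rlt_b))


def wasserstein_1d(rlt_a, rlt_b):
    """1-Wasserstein distance between two distributions on the same
    unit-spaced support, computed as the sum of absolute CDF differences."""
    total, cdf_gap = 0.0, 0.0
    for a, b in zip(rlt_a, rlt_b):
        cdf_gap += a - b
        total += abs(cdf_gap)
    return total
```

As a usage example, take two small embedding clouds, `a = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]` and `b = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]`. Calling `relative_living_times(a, 8.0)` and `relative_living_times(b, 8.0)` gives one distribution per cloud, and passing both to `squared_distance` or `wasserstein_1d` yields a single number that could serve as the distance term in a joint loss. Both comparison functions assume the two clouds have the same number of points.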

Assignees

Inventors

Classifications

  • Combinations of networks · CPC title

  • Knowledge-based neural networks; Logical representations of neural networks · CPC title

  • Learning methods · CPC title

  • Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12579111B2 cover?
A method of using a computing device executing to interrelate two or more corpuses of dissimilar data that includes receiving input data from each of two or more corpuses of dissimilar data. The computing device computes a pass for each of the input data into two or more encoder-decoder models. The computing device further obtains a prediction of an identity mapping for each of different domain…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 17, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).