Generating cross-domain data using variational mapping between embedding spaces
US-2019318040-A1 · Oct 17, 2019 · US
US12579111B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12579111-B2 |
| Application number | US-202017139190-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 31, 2020 |
| Priority date | Dec 31, 2020 |
| Publication date | Mar 17, 2026 |
| Grant date | Mar 17, 2026 |
Official abstract text for this publication.
A method of using a computing device executing to interrelate two or more corpuses of dissimilar data that includes receiving input data from each of two or more corpuses of dissimilar data. The computing device computes a pass for each of the input data into two or more encoder-decoder models. The computing device further obtains a prediction of an identity mapping for each of different domains of knowledge from each of the two or more encoder-decoder models. The computing device additionally computes a distribution distance metric as an output from each of a low-dimensional embedding vector representation from each of the two or more encoder-decoder models. The computing device still further computes a function based on each of the predictions from each of the two or more encoder-decoder models and the distribution distance metrics. The computing device additionally updates the two or more encoder-decoder models.
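The loop described in the abstract (encode each domain, predict an identity mapping, measure a distance between the embeddings, combine everything into one function) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the linear tied-weight `encode`/`decode` pair, the mean-embedding distance, and the weighting parameter `lam` are all hypothetical stand-ins, and the parameter update step is omitted.

```python
# Illustrative sketch only. All function names, the linear tied-weight models,
# and the placeholder distance are assumptions, not taken from the patent.

def encode(weights, x):
    """Project input x to a low-dimensional embedding (matrix-vector product)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def decode(weights, z):
    """Map embedding z back to input space using the transposed encoder weights."""
    n_inputs = len(weights[0])
    return [sum(weights[k][j] * z[k] for k in range(len(z))) for j in range(n_inputs)]

def reconstruction_loss(x, x_hat):
    """Mean squared error between an input and its identity-mapping prediction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def joint_loss(batch_a, w_a, batch_b, w_b, lam=1.0):
    """Sum of per-domain reconstruction losses plus a weighted embedding distance."""
    z_a = [encode(w_a, x) for x in batch_a]
    z_b = [encode(w_b, x) for x in batch_b]
    rec_a = sum(reconstruction_loss(x, decode(w_a, z))
                for x, z in zip(batch_a, z_a)) / len(batch_a)
    rec_b = sum(reconstruction_loss(x, decode(w_b, z))
                for x, z in zip(batch_b, z_b)) / len(batch_b)
    # Placeholder distance: squared gap between the mean embeddings of each domain.
    # A topological metric such as the claimed MRLT distance would go here instead.
    m_a, m_b = mean_vector(z_a), mean_vector(z_b)
    dist = sum((p - q) ** 2 for p, q in zip(m_a, m_b))
    return rec_a + rec_b + lam * dist

# Two domains with different input sizes (3 and 4) share a 2-dimensional embedding.
w_a = [[1, 0, 0], [0, 1, 0]]
w_b = [[1, 0, 0, 0], [0, 1, 0, 0]]
loss_aligned = joint_loss([[1, 2, 0]], w_a, [[1, 2, 0, 0]], w_b)
loss_shifted = joint_loss([[1, 2, 0]], w_a, [[3, 2, 0, 0]], w_b, lam=0.5)
```

In this toy setup both domains reconstruct perfectly, so the joint loss is driven only by how far apart the two embedding batches sit. A training loop would take the gradient of this value with respect to each model's weights and update them.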
Opening claim text (preview).
What is claimed is:
1. A method of using a computing device executing to interrelate two or more domain corpuses of dissimilar data, the method comprising: receiving input data from each of two or more domain corpuses of dissimilar data; computing, by the computing device, a pass for each of the input data; training, based on the pass for each of the input data, two or more encoder-decoder models selected from convolutional neural networks, recurrent neural networks, long short-term memory networks, self-attention networks; obtaining, by the computing device, a prediction of an identity mapping for each of different domains of knowledge from each of the two or more encoder-decoder models; computing, by the computing device, a pairwise mean relative living time (MRLT) distribution distance metric based on persistence intervals of simplicial complexes derived from low-dimensional embedding vector representations from each of the two or more encoder models; computing, by the computing device, a joint loss function based on each of the predictions from each of the two or more encoder-decoder models and the distribution distance metrics; updating, by the computing device, based on results of the joint loss function, the training of the two or more encoder-decoder models, wherein the training comprises using backpropagation to minimize a reconstruction loss; determining, using an epsilon parameter, a ball radius to determine a representation of the simplicial complexes of an intra-domain relationship to perform mappings across inter-domain relationships; learning, by the two or more encoder-decoder models, how to reconstruct the data from encoded representations based on a number of holes and connected components relative to the ball radius; and determining, using the updated two or more encoder-decoder models and the ball radius, a cross-domain structural mapping interrelating dissimilar data of the two or more encoder-decoder models, wherein the cross-domain structural mapping improves computational efficiency in reconciling heterogeneous corpuses by reducing reconstruction error and preserving topological representations across domains.
2. The method of claim 1, further comprising: computing, by the computing device, a corresponding reconstruction loss for each of the two or more encoder-decoder models using the respective prediction and the input data from each of the two or more domain corpuses of dissimilar data; and extracting, by the computing device, a low-dimensional embedding vector of input data representations from each of the two or more encoder-decoder models.
3. The method of claim 2, wherein the distribution distance metric is a pairwise mean relative living times (MRLT) distribution distance metric, and the function is a joint loss function.
4. The method of claim 3, further comprising: computing, by the computing device, a gradient of a loss from the joint loss function with respect to model parameters for each of the two or more encoder-decoder models.
5. The method in claim 4, further comprising: initializing, by the computing device, weights for each of the two or more encoder-decoder models; performing, by the computing device, preprocessing, transforming and extracting of the input data into a fixed-dimension feature vector; performing, by the computing device, feed forward processing for a feedforward pass for each intra-domain sample of the input data into each respective one of the two or more encoder-decoder models; generating, by the computing device, corresponding output predictions for each of the intra-domain samples of the input data using each respective one of the two or more encoder-decoder models; and computing, by the computing device, a corresponding loss value with respect to the joint loss function for each of the two or more encoder-decoder models given the intra-domain samples of the input data and the corresponding output predictions.
6. The method in claim 4, further comprising: computing, by the computing device, the pairwise MRLT distribution distance metrics based on a first relative living times (RLT) matrix and a second RLT matrix between each intra-domain samples of the input data and based on using a distance of a distribution the two relative living time metrics defined between the first RLT matrix and the second RLT matrix.
7. The method in claim 4, further comprising: computing, by the computing device, the pairwise MRLT distribution distance metrics based on a first relative living times (RLT) matrix and a second RLT matrix between each intra-domain samples of the input data and based on using a squared loss function between output of the first RLT matrix and the second RLT matrix.
8. The method of claim 4, further comprising: computing, by the computing device, the pairwise MRLT distribution distance metrics based on a first relative living times (RLT) matrix and a second RLT matrix between each intra-domain samples of the input data and based on using a Wasserstein distance determination of a distribution the first RLT matrix and the second RLT matrix.
9. The method of claim 1, wherein the two or more domain corpuses of dissimilar data comprise text, images, audio, and other data sources in different domains of knowledge.
10. A computer program product for interrelating two or more domain corpuses of dissimilar data, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: receive, by the processor, input data from each of two or more domain corpuses of dissimilar data; compute, by the processor, a pass for each of the input data; train, based on the pass for each of the input data, two or more encoder-decoder models selected from convolutional neural networks, recurrent neural networks, long short-term memory networks, self-attention networks; obtain, by the processor, a prediction of an identity mapping for each of different domains of knowledge from each of the two or more encoder-decoder models; compute, by the processor, a pairwise mean relative living time (MRLT) distribution distance metric based on persistence intervals of simplicial complexes derived from low-dimensional embedding vector representations from each of the two or more encoder models; compute, by the processor, a joint loss function based on each of the predictions from each of the two or more encoder-decoder models and the distribution distance metrics; update, by the processor based on results of the joint loss function, the training of the two or more encoder-decoder models, wherein the training comprises using backpropagation to minimize a reconstruction loss; determine, using an epsilon parameter, a ball radius to determine a representation of the simplicial complexes of an intra-domain relationship to perform mappings across inter-domain relationships; learn, by the two or more encoder-decoder models, how to reconstruct the data from encoded representations based on a number of holes and connected components relative to the ball radius; and determine, using the updated two or more encoder-decoder models and the ball radius, a cross-domain structural mapping interrelating dissimilar data of the two or more encoder-decoder models, wherein the cross-domain structural mapping improves computational efficiency in reconciling heterogeneous corpuses by reducing reconstruction error and preserving topological representations across domains.
11. The computer program product of claim 10, wherein: the program instructions executable by the processor further ca
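The claims tie the distance metric to an epsilon ball radius, to connected components, and to squared-loss or Wasserstein comparisons of relative living times (RLT). A rough sketch of that idea follows, with heavy simplifications that are assumptions of this example rather than anything stated in the claims: it tracks only connected components (holes would need a persistent-homology computation that is not attempted here), it links two points when their distance is at most `eps`, it evaluates a fixed grid of radii instead of true persistence intervals, and it does not average over subsamples to form a mean RLT. All function names are hypothetical.

```python
import math

def count_components(points, eps):
    """Connected components when points at distance <= eps are linked (assumed convention)."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})

def relative_living_times(points, eps_grid, max_count):
    """Fraction of the eps grid on which the component count equals k, for k = 0..max_count."""
    rlt = [0.0] * (max_count + 1)
    for eps in eps_grid:
        k = min(count_components(points, eps), max_count)
        rlt[k] += 1.0 / len(eps_grid)
    return rlt

def squared_distance(p, q):
    """Squared loss between two RLT vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def wasserstein_1d(p, q):
    """1-Wasserstein distance between histograms on bins 0..n-1 (sum of |CDF gaps|)."""
    total, cdf_gap = 0.0, 0.0
    for a, b in zip(p, q):
        cdf_gap += a - b
        total += abs(cdf_gap)
    return total

# Toy 2-D "embeddings" for two domains and a small grid of ball radii.
embeddings_a = [(0, 0), (1, 0), (5, 0)]
embeddings_b = [(0, 0), (1, 0), (2, 0)]
eps_grid = [0.5, 1.5, 4.5, 5.5]

rlt_a = relative_living_times(embeddings_a, eps_grid, max_count=3)
rlt_b = relative_living_times(embeddings_b, eps_grid, max_count=3)
sq = squared_distance(rlt_a, rlt_b)
w1 = wasserstein_1d(rlt_a, rlt_b)
```

Here domain A keeps two separate clusters over part of the radius range while domain B merges into one cluster early, so the two RLT vectors differ and both distances are positive. Either value could serve as the distance term inside a joint loss.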
Combinations of networks · CPC title
Knowledge-based neural networks; Logical representations of neural networks · CPC title
Learning methods · CPC title
Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation · CPC title
Machine learning · CPC title