Automatic recognition of entities related to cloud incidents
US-2022012633-A1 · Jan 13, 2022 · US
US11748567B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11748567-B2 |
| Application number | US-202016926525-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 10, 2020 |
| Priority date | Jul 10, 2020 |
| Publication date | Sep 5, 2023 |
| Grant date | Sep 5, 2023 |
Official abstract text for this publication.
Described herein are embodiments of a framework named total correlation variational autoencoder (TC_VAE) to disentangle syntax and semantics by making use of total correlation penalties of KL divergences. One or more Kullback-Leibler (KL) divergence terms in a loss for a variational autoencoder are decomposed so that generated hidden variables may be separated. Embodiments of the TC_VAE framework were examined on semantic similarity tasks and syntactic similarity tasks. Experimental results show that better disentanglement between syntactic and semantic representations has been achieved compared with state-of-the-art (SOTA) results on the same data sets in similar settings.
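The abstract does not spell out the loss, so the following is only a minimal sketch of what a total correlation term measures in general: the KL divergence between a joint distribution and the product of its marginals. The function name `total_correlation` and the two toy distributions are assumptions made for illustration and are not taken from the patent.

```python
import math

def total_correlation(joint):
    """KL(q(z1, z2) || q(z1) * q(z2)) for a discrete joint given as nested lists.

    Illustrative only: the patent's loss operates on learned latent
    distributions, not on a hand-written probability table.
    """
    row_marginal = [sum(row) for row in joint]
    col_marginal = [sum(col) for col in zip(*joint)]
    tc = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # terms with zero probability contribute nothing
                tc += p * math.log(p / (row_marginal[i] * col_marginal[j]))
    return tc

# Two binary variables that are independent: the joint factorizes.
independent = [[0.25, 0.25], [0.25, 0.25]]
# Two binary variables that always agree: fully entangled.
entangled = [[0.5, 0.0], [0.0, 0.5]]

tc_independent = total_correlation(independent)
tc_entangled = total_correlation(entangled)
print(tc_independent, tc_entangled)
```

A value of zero means the variables are statistically independent, which is the property a total correlation penalty pushes latent variables toward. With only two variables, as here, the quantity coincides with their mutual information.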
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for segmenting latent representations comprising: given a variational autoencoder (VAE) model comprising an attention layer, a semantic encoder, a syntax encoder, and a decoder: generating, using an embedding layer, a sequence of embeddings for a sequence of tokens, in which the tokens are words or representations of words; generating, using the attention layer, a sequence of attention masks based on the sequence of embeddings; generating a sequence of hidden variables based on the sequence of embeddings and the sequence of attention masks; generating, using the semantic encoder, a first sequence of latent variables based on the sequence of hidden variables; generating, using the syntax encoder, a second sequence of latent variables based on the sequence of hidden variables; inferring, using the decoder, a sequence of reconstructed tokens and a sequence of reconstructed attention masks using at least information of the first sequence of latent variables from the semantic encoder and the second sequence of latent variables from the syntax encoder; and using the sequence of reconstructed tokens and the sequence of reconstructed attention masks to train at least the attention layer, the semantic encoder, the syntax encoder, and the decoder of the VAE model.
2. The computer-implemented method of claim 1 wherein each hidden variable in the sequence of hidden variables is generated by an element-wise multiplication between an embedding of the sequence of embeddings and a corresponding attention mask or masks of the sequence of attention masks.
3. The computer-implemented method of claim 1 wherein inferring a sequence of reconstructed tokens and a sequence of reconstructed attention masks based on at least information of at least the first and second sequences of latent variables comprising: combining a sequence of global latent variables with the first sequence of latent variables and the second sequence of latent variables to generate a first sequence of combined latent variables and a second sequence of combined latent variables respectively; receiving, at the decoder, the first sequence of combined latent variables and the second sequence of combined latent variables; and inferring the sequence of reconstructed tokens and the sequence of reconstructed attention masks.
4. The computer-implemented method of claim 1 further comprising: using the first sequence of latent variables, the second sequence of latent variables, or both to gauge semantics similarity of input words or representations of words, to gauge syntactic similarity of input words or representations of words, or for a natural language processing method.
5. The computer-implemented method of claim 4 wherein the step of using the sequence of reconstructed tokens and the sequence of reconstructed attention masks to train at least the attention layer, the semantic encoder, the syntax encoder, and the decoder comprises determining a loss comprises one or more total correlation (TC) terms to enforce disentanglement of latent variables.
6. The computer-implemented method of claim 5 wherein the one or more TC terms comprise a first Kullback-Leibler (KL) divergence for the semantic encoder and a second KL divergence for the syntax encoder.
7. The computer-implemented method of claim 6 wherein the first KL divergence is a KL divergence between a distribution of a first sequence of combined latent variables and a product of a factorial distribution for each latent variable in the first sequence of latent variables and a factorial distribution for each global latent variable in the first sequence of combined latent variables, and the second KL divergence is a KL divergence between a distribution of a second sequence of combined latent variables and a product of a factorial distribution for each latent variable in the second sequence of latent variables and a factorial distribution for each global latent variable in the second sequence of combined latent variables.
8. The computer-implemented method of claim 1 further comprising: for an input word or an input representation of a word that was input into a system comprising at least a trained attention layer, a trained semantic encoder, and a trained syntax encoder, using a combination of its corresponding first latent variable from the trained semantic encoder and its corresponding second latent variable from the trained syntax encoder.
9. A system for segmenting latent representations comprising: one or more processors; and a non-transitory computer-readable medium or media comprising one or more sets of instructions which, when executed by at least one of the one or more processors, causes steps to be performed comprising: given a variational autoencoder (VAE) model comprising a semantic encoder, a syntax encoder, and a decoder: generating a sequence of embeddings for a sequence of tokens, in which a token is a word or a representation of a word; generating a sequence of attention masks based on the sequence of embeddings; generating a sequence of hidden variables based on the sequence of embeddings with the sequence of attention masks; generating, using the semantic encoder, a first sequence of latent variables based on the sequence of hidden variables; generating, using the syntax encoder, a second sequence of latent variables based on the sequence of hidden variables; inferring, using the decoder, a sequence of reconstructed tokens and a sequence of reconstructed attention masks using at least information of the first sequence of latent variables from the semantic encoder and the second sequence of latent variables from the syntax encoder; and using the sequence of reconstructed tokens and the sequence of reconstructed attention masks to train at least the semantic encoder, the syntax encoder, and the decoder of the VAE model.
10. The system of claim 9 wherein each hidden variable in the sequence of hidden variables is generated by an element-wise multiplication between an embedding of the sequence of embeddings and a corresponding attention mask or masks of the sequence of attention masks.
11. The system of claim 9 wherein inferring a sequence of reconstructed tokens and a sequence of reconstructed attention masks based on at least information of at least the first and second sequences of latent variables comprises steps of: combining a sequence of global latent variables with the first sequence of latent variables and the second sequence of latent variables to generate a first sequence of combined latent variables and a second sequence of combined latent variables respectively; receiving, at the decoder, the first sequence of combined latent variables and the second sequence of combined latent variables; and inferring the sequence of reconstructed tokens and the sequence of reconstructed attention masks.
12. The system of claim 9 wherein the non-transitory computer-readable medium or media further comprises one or more sets of instructions which, when executed by at least one of the one or more processors, causes steps to be performed comprising: using the first sequence of latent variables, the second sequence of latent variables, or both to gauge semantics similarity of input words or representations of words, to gauge syntactic similarity of input words or representations of words, or for a natural language processing method.
13. The system of claim 9 wherein the step of using the sequence of reconstructed tokens and the sequence of reconstructed attention masks to train at least an attention layer, the semantic encoder, the syntax encoder, and the decoder
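The encoding steps recited in claims 1 and 2 can be sketched as a toy data flow: embed each token, derive an attention mask from the embedding, multiply the two element-wise to get a hidden variable, then feed that hidden variable to two separate encoders. Everything concrete here is an assumption for illustration: the vocabulary, the dimensions, the random weights, the sigmoid mask, and the use of plain linear maps as encoders. A real VAE encoder would output distribution parameters and sample from them, and the decoder and training step are omitted.

```python
import math
import random

random.seed(0)
DIM = 4         # embedding size (hypothetical)
LATENT_DIM = 2  # latent size per encoder (hypothetical)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def linear(weights, vec):
    """Matrix-vector product with weights given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

# Hypothetical toy parameters standing in for trained layers.
vocab = {"the": 0, "cat": 1, "sleeps": 2}
embedding_table = rand_matrix(len(vocab), DIM)
attention_weights = rand_matrix(DIM, DIM)
semantic_weights = rand_matrix(LATENT_DIM, DIM)
syntax_weights = rand_matrix(LATENT_DIM, DIM)

def encode(tokens):
    # Sequence of embeddings for the sequence of tokens.
    embeddings = [embedding_table[vocab[t]] for t in tokens]
    # One attention mask per embedding, each entry squashed into (0, 1).
    masks = [[sigmoid(s) for s in linear(attention_weights, e)] for e in embeddings]
    # Hidden variable = element-wise product of embedding and mask (claim 2).
    hidden = [[e_i * m_i for e_i, m_i in zip(e, m)] for e, m in zip(embeddings, masks)]
    # Two encoders read the same hidden variables.
    z_semantic = [linear(semantic_weights, h) for h in hidden]
    z_syntax = [linear(syntax_weights, h) for h in hidden]
    return masks, hidden, z_semantic, z_syntax

masks, hidden, z_semantic, z_syntax = encode(["the", "cat", "sleeps"])
print(z_semantic[1], z_syntax[1])
```

The point of the sketch is the shape of the pipeline: both latent sequences are computed from the same masked hidden variables, so any separation between semantic and syntactic information has to come from the training loss rather than from the inputs.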
Auto-encoder networks; Encoder-decoder networks · CPC title
Generative networks · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
Lexical analysis, e.g. tokenisation or collocates · CPC title
Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars · CPC title