What technology area does this patent fall under?

Primary CPC classification: G06F40/284 (Lexical analysis, e.g. tokenisation or collocates). The mapped technology area is Physics (CPC section G).

When was this patent published?

Publication date: Feb 6, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Transformer-based encoding incorporating metadata

US11893346B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11893346-B2
Application number	US-202117308575-A
Country	US
Kind code	B2
Filing date	May 5, 2021
Priority date	May 5, 2021
Publication date	Feb 6, 2024
Grant date	Feb 6, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

From metadata of a corpus of natural language text documents, a relativity matrix is constructed, a row-column intersection in the relativity matrix corresponding to a relationship between two instances of a type of metadata. An encoder model is trained, generating a trained encoder model, to compute an embedding corresponding to a token of a natural language text document within the corpus and the relativity matrix, the encoder model comprising a first encoder layer, the first encoder layer comprising a token embedding portion, a relativity embedding portion, a token self-attention portion, a metadata self-attention portion, and a fusion portion, the training comprising adjusting a set of parameters of the encoder model.
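To make the relativity matrix concrete, the following is a minimal Python sketch, not the patented implementation. It assumes each token is tagged with the index of the conversation turn it belongs to, and that the relationship stored at a row-column intersection is the signed turn distance between two tokens. The abstract does not say how the relationship is computed, so that choice, the function name `build_relativity_matrix`, and the sample conversation are all illustrative assumptions.

```python
from typing import List


def build_relativity_matrix(turn_ids: List[int]) -> List[List[int]]:
    """Return a square matrix whose entry [i][j] is the signed turn
    distance from token i to token j (an assumed relationship)."""
    return [[tj - ti for tj in turn_ids] for ti in turn_ids]


# Hypothetical conversation: every token carries the turn it was spoken in.
tokens = ["hi", "there", "hello", "how", "are", "you"]
turn_ids = [0, 0, 1, 2, 2, 2]

matrix = build_relativity_matrix(turn_ids)

# Print one row per token so the row-column intersections are visible.
for token, row in zip(tokens, matrix):
    print(f"{token:>6} {row}")
```

Tokens in the same turn get 0 at their intersection, and the sign shows whether the column token comes from a later or an earlier turn than the row token.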

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: constructing, from metadata of a corpus of natural language text documents, a relativity matrix, a row-column intersection in the relativity matrix corresponding to a relationship between two instances of turn-based metadata of a conversation; and training an encoder model to compute an embedding corresponding to a token of a natural language text document within the corpus and the relativity matrix, the encoder model comprising a first encoder layer, the first encoder layer comprising a token embedding portion, a relativity embedding portion, a token self-attention portion, a metadata self-attention portion, and a fusion portion, the relativity embedding portion generating an input relativity embedding, the input relativity embedding encoding an entry in the relativity matrix, the metadata self-attention portion adjusting the input relativity embedding according to a set of metadata attention weights, the fusion portion combining an output of the token self-attention portion and an output of the metadata self-attention portion, the training comprising adjusting a set of parameters of the encoder model, the training generating a trained encoder model, wherein a parameter in the set of parameters of the encoder model is distinct from a layer in the encoder model, wherein the training comprises a training stage in which (i) a parameter of the token embedding portion and (ii) a parameter of the token self-attention portion are each held constant, and in which the training stage further changes (i) a parameter of the relativity embedding portion, and (ii) at least one parameter selected from a set of parameters comprising: the metadata self-attention portion, another attention portion, and the fusion portion.

2. The computer-implemented method of claim 1, wherein the token embedding portion computes a set of token embeddings, a token embedding in the set of token embeddings corresponding to a token of a natural language text document within the corpus.

3. The computer-implemented method of claim 2, wherein the token comprises a portion of a word of the natural language text document.

4. The computer-implemented method of claim 2, wherein the token embedding comprises a multidimensional numerical representation of the token.

5. The computer-implemented method of claim 2, wherein the token embedding comprises a combination of a multidimensional numerical representation of the token, a multidimensional numerical representation of a position of the token within the natural language text document, and a multidimensional numerical representation of a segment of the natural language text document in which the token is located.

6. The computer-implemented method of claim 1, wherein the token self-attention portion adjusts an input token embedding according to a set of token attention weights, a token attention weight in the set of token attention weights corresponding to a relationship within the natural language text document between two tokens, the set of token attention weights computed during the training.

7. The computer-implemented method of claim 1, wherein the set of metadata attention weights is computed during the training.

8. The computer-implemented method of claim 1, wherein the training comprises: initializing a set of parameters of the token embedding portion to a base set of token embedding parameters; initializing a set of parameters of the token self-attention portion to a base set of token self-attention parameters; first training the encoder model, the first training comprising adjusting a set of parameters of the relativity embedding portion and a set of parameters of the metadata self-attention portion while the set of parameters of the token embedding portion is set to the base set of token embedding parameters and the set of parameters of the token self-attention portion is set to the base set of token self-attention parameters, the first training generating a partially trained encoder model; and second training the partially trained encoder model, the second training comprising adjusting the set of parameters of the partially trained encoder model, the second training generating the trained encoder model.

9. The computer-implemented method of claim 1, wherein the encoder model further comprises a first decoder layer, the first decoder layer comprising a decoder token self-attention portion, a decoder metadata self-attention portion, a decoder fusion portion, and a decoder attention portion, the training comprising adjusting a set of parameters of the first decoder layer.

10. The computer-implemented method of claim 9, wherein the decoder attention portion adjusts an output of an encoder layer according to a set of attention weights, the set of attention weights computed during the training.

11. The computer-implemented method of claim 1, further comprising a second metadata self-attention portion adjusting a second input relativity embedding according to a second set of metadata attention weights, the second input relativity embedding comprising a multidimensional numerical representation of a row-column intersection in a second relativity matrix, the row-column intersection in the second relativity matrix corresponding to a relationship between two instances of a second type of metadata.

12. A computer program product for transformer-based natural language text autoencoding, the computer program product comprising: one or more computer readable storage media, and program instructions collectively stored on the one or more computer readable storage media, the program instructions comprising: program instructions to construct, from metadata of a corpus of natural language text documents, a relativity matrix, a row-column intersection in the relativity matrix corresponding to a relationship between two instances of turn-based metadata of a conversation; and program instructions to train an encoder model to compute an embedding corresponding to a token of a natural language text document within the corpus and the relativity matrix, the encoder model comprising a first encoder layer, the first encoder layer comprising a token embedding portion, a relativity embedding portion, a token self-attention portion, a metadata self-attention portion, and a fusion portion, the relativity embedding portion generating an input relativity embedding, the input relativity embedding encoding an entry in the relativity matrix, the metadata self-attention portion adjusting the input relativity embedding according to a set of metadata attention weights, the fusion portion combining an output of the token self-attention portion and an output of the metadata self-attention portion, the training comprising adjusting a set of parameters of the encoder model, the training generating a trained encoder model, wherein a parameter in the set of parameters of the encoder model is distinct from a layer in the encoder model, wherein the program instructions to train comprise program instructions to perform a training stage in which (i) a parameter of the token embedding portion and (ii) a parameter of the token self-attention portion are each held constant, and in which the training stage further changes (i) a parameter of the relativity embedding portion, and (ii) at least one parameter selected from a set of parameters comprising: the metadata self-attention portion, another attention portion, and the fusion portion.

13. The computer program product of claim 12, wherein the token embedding portion computes a set of token embeddings, a token embedding in the set of token embeddings corresponding to a toke…
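The staged training recited in claims 1 and 8 can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the patent's training procedure: each portion of the encoder layer is reduced to a placeholder list of floats, the gradients are made up, and plain gradient descent stands in for whatever optimizer is actually used. The names `PORTIONS`, `trainable_portions`, and `sgd_step` are hypothetical. Stage 1 here also updates the fusion portion, which claim 1 permits but does not require.

```python
from typing import Dict, List, Set

# Parameter groups named after the portions recited in claim 1.
PORTIONS = [
    "token_embedding",
    "token_self_attention",
    "relativity_embedding",
    "metadata_self_attention",
    "fusion",
]

# Held constant during the first training stage (claims 1 and 8).
FROZEN_IN_STAGE_1: Set[str] = {"token_embedding", "token_self_attention"}


def trainable_portions(stage: int) -> Set[str]:
    """Return the portions whose parameters may change in a given stage."""
    if stage == 1:
        return set(PORTIONS) - FROZEN_IN_STAGE_1
    if stage == 2:
        return set(PORTIONS)  # second training adjusts every portion
    raise ValueError(f"unknown training stage: {stage}")


def sgd_step(
    params: Dict[str, List[float]],
    grads: Dict[str, List[float]],
    stage: int,
    lr: float = 0.1,
) -> Dict[str, List[float]]:
    """Apply one gradient-descent step, skipping frozen portions."""
    trainable = trainable_portions(stage)
    updated = {}
    for name, weights in params.items():
        if name in trainable:
            updated[name] = [w - lr * g for w, g in zip(weights, grads[name])]
        else:
            updated[name] = list(weights)  # copied unchanged
    return updated


# Placeholder parameters and gradients (illustrative values only).
params = {name: [1.0, 1.0] for name in PORTIONS}
grads = {name: [0.5, -0.5] for name in PORTIONS}

after_stage_1 = sgd_step(params, grads, stage=1)
after_stage_2 = sgd_step(after_stage_1, grads, stage=2)

# Show, per portion, whether each stage changed its parameters.
for name in PORTIONS:
    changed_1 = after_stage_1[name] != params[name]
    changed_2 = after_stage_2[name] != after_stage_1[name]
    print(f"{name:<24} stage1_changed={changed_1} stage2_changed={changed_2}")
```

In this sketch the token embedding and token self-attention parameters stay at their base values through stage 1 and only start moving in stage 2, mirroring the "first training" and "second training" split in claim 8.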

Assignees

Inventors

Classifications

G06F40/284 · Primary
Lexical analysis, e.g. tokenisation or collocates · CPC title
G06F40/205
Parsing · CPC title
G06F40/237
Lexical tools · CPC title
G06F40/30
Semantic analysis · CPC title
G06F40/42
Data-driven translation · CPC title

Patent family

Related publications grouped by family.

View patent family 83855514

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11893346B2 cover?: From metadata of a corpus of natural language text documents, a relativity matrix is constructed, a row-column intersection in the relativity matrix corresponding to a relationship between two instances of a type of metadata. An encoder model is trained, generating a trained encoder model, to compute an embedding corresponding to a token of a natural language text document within the corpus and…
Who is the assignee on this patent?: IBM