What technology area does this patent fall under?

Primary CPC classification G06F40/30. Mapped technology areas include Physics.

When was this patent published?

Publication date Feb 4, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Providing a semantic encoding and language neural network

US12217007B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12217007-B2
Application number	US-202217811763-A
Country	US
Kind code	B2
Filing date	Jul 11, 2022
Priority date	Jul 11, 2022
Publication date	Feb 4, 2025
Grant date	Feb 4, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments are provided for unsupervised learning of domain specific knowledge graph from textual data and language generation from knowledge graph via reinforcement learning in a computing system by a processor. Unstructured data is automatically parsed into one or more knowledge graphs based on the unstructured data and a list of candidate relations using a first machine learning model. Text data is generated from the one or more knowledge graphs using a second machine learning model.
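As a rough illustration of the parsing step described in the abstract, the sketch below enumerates candidate triples from a set of entities and a list of candidate relations, then groups them into a simple graph. It is a minimal sketch under stated assumptions, not the patented method: the `Triple` class, the `candidate_triples` and `to_graph` helpers, and the sample entities and relations are all hypothetical, and the entities are supplied by hand here, whereas the abstract has a machine learning model do the parsing.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Triple:
    """Hypothetical container for one (subject, predicate, object) edge."""
    subject: str
    predicate: str
    obj: str


def candidate_triples(entities, relations):
    """Pair every ordered pair of distinct entities with every candidate relation."""
    return [
        Triple(s, r, o)
        for s, o in permutations(entities, 2)
        for r in relations
    ]


def to_graph(triples):
    """Build an adjacency map: subject -> list of (predicate, object)."""
    graph = {}
    for t in triples:
        graph.setdefault(t.subject, []).append((t.predicate, t.obj))
    return graph


# Illustrative inputs only; none of these values come from the patent.
entities = ["Alice", "Acme"]
relations = ["works_at", "located_in"]

candidates = candidate_triples(entities, relations)
graph = to_graph(candidates)
```

With two entities and two candidate relations this yields four candidate triples. A trained model would presumably score each candidate rather than keep them all.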

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for providing semantic encoding and language generation in a computing system by a processor, comprising: automatically parsing unstructured data into one or more knowledge graphs based on the unstructured data and a list of candidate relations using a first machine learning model; encoding, using the first machine learning model, the unstructured data into a distribution of a plurality of triples based on the one or more knowledge graphs, wherein the encoding further comprises predicted probabilities of relations between entities in the unstructured data; sampling, using a second machine learning model, a set of the plurality of triples from the unstructured data of the one or more knowledge graphs; generating text data from the set of the plurality of triples using the second machine learning model; computing a penalty score for the set of the plurality of triples based on a degree of difference between the unstructured data and the generated text data; and adjusting at least one predicted probability from the first machine learning model based on the determined penalty score.

2. The method of claim 1, further including training the first machine learning model and the second machine learning model using the unstructured data and the list of candidate relations via unsupervised machine learning, wherein the first machine learning model is a semantic encoder and the second machine learning model is a semantic decoder.

3. The method of claim 1, further including using the first machine learning model to: identify the entities in the unstructured data.

4. The method of claim 1, further including using the second machine learning model to: decode the set of the plurality of triples into the text data, wherein a triple includes a subject, object, and predicate in the unstructured data, wherein the subject and object are an entity and a predicate is a relation.

5. The method of claim 1, further including sampling the set of the plurality of triples from the unstructured data of the one or more knowledge graphs for training a plurality of machine learning models via unsupervised machine learning.

6. The method of claim 1, further including: identifying one or more candidate entities in the unstructured data; and using the one or more candidate entities as nodes in the one or more knowledge graphs.

7. A system for providing semantic encoding and language generation in a computing environment, comprising: one or more computers with executable instructions that when executed cause the system to: automatically parse unstructured data into one or more knowledge graphs based on the unstructured data and a list of candidate relations using a first machine learning model; encode, using the first machine learning model, the unstructured data into a distribution of a plurality of triples based on the one or more knowledge graphs, wherein the encoding further comprises predicted probabilities of relations between entities in the unstructured data; sample, using a second machine learning model, a set of the plurality of triples from the unstructured data of the one or more knowledge graphs; generate text data from the set of the plurality of triples using the second machine learning model; compute a penalty score for the set of the plurality of triples based on a degree of difference between the unstructured data and the generated text data; and adjust at least one predicted probability from the first machine learning model based on the determined penalty score.

8. The system of claim 7, wherein the executable instructions when executed cause the system to train the first machine learning model and the second machine learning model using the unstructured data and the list of candidate relations via unsupervised machine learning, wherein the first machine learning model is a semantic encoder and the second machine learning model is a semantic decoder.

9. The system of claim 7, wherein the executable instructions when executed cause the system to use the first machine learning model to: identify the entities in the unstructured data.

10. The system of claim 7, wherein the executable instructions when executed cause the system to use the second machine learning model to: decode the set of the plurality of triples into the text data, wherein a triple includes a subject, object, and predicate in the unstructured data, wherein the subject and object are an entity and a predicate is a relation.

11. The system of claim 7, wherein the executable instructions when executed cause the system to sample the set of the plurality of triples from the unstructured data of the one or more knowledge graphs for training a plurality of machine learning models via unsupervised machine learning.

12. The system of claim 7, wherein the executable instructions when executed cause the system to: identify one or more candidate entities in the unstructured data; and use the one or more candidate entities as nodes in the one or more knowledge graphs.

13. A computer program product for providing semantic encoding and language generation in a computing environment, the computer program product comprising: one or more tangible computer readable storage media, and program instructions collectively stored on the one or more tangible computer readable storage media, the program instruction comprising: automatically parse unstructured data into one or more knowledge graphs based on the unstructured data and a list of candidate relations using a first machine learning model; encode, using the first machine learning model, the unstructured data into a distribution of a plurality of triples based on the one or more knowledge graphs, wherein the encoding further comprises predicted probabilities of relations between entities in the unstructured data; sample, using a second machine learning model, a set of the plurality of triples from the unstructured data of the one or more knowledge graphs; generate text data from the set of the plurality of triples using the second machine learning model; compute a penalty score for the set of the plurality of triples based on a degree of difference between the unstructured data and the generated text data; and adjust at least one predicted probability from the first machine learning model based on the determined penalty score.

14. The computer program product of claim 13, further including program instructions to train the first machine learning model and the second machine learning model using the unstructured data and the list of candidate relations via unsupervised machine learning, wherein the first machine learning model is a semantic encoder and the second machine learning model is a semantic decoder.

15. The computer program product of claim 13, further including program instructions to use the first machine learning model to: identify the entities in the unstructured data.

16. The computer program product of claim 13, further including program instructions to use the second machine learning model to: decode the set of the plurality of triples into the text data, wherein a triple includes a subject, object, and predicate in the unstructured data, wherein the subject and object are an entity and a predicate is a relation.

17. The computer program product of claim 13, further including program instructions to: identify one or more candidate entities in the unstructured data; and use the one or more candidate entities as nodes in the one or more knowledge graphs.
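The steps recited in claim 1 (sample triples, generate text, compute a penalty, adjust probabilities) can be pictured with a toy example. The sketch below is an assumption-laden illustration and not the claimed implementation: the claims do not specify a sampling scheme, a difference measure, or an update rule, so the independent per-triple sampling, the `difflib`-based penalty, the fixed `baseline`, and the additive update are all stand-ins, and every function name is hypothetical.

```python
import random
from difflib import SequenceMatcher


def sample_triples(probs, rng):
    """Keep each triple independently with its predicted probability."""
    return [t for t, p in probs.items() if rng.random() < p]


def generate_text(triples):
    """Toy decoder: write each triple out as 'subject predicate object'."""
    return " ".join(" ".join(t) for t in triples)


def penalty(source, generated):
    """Degree of difference in [0, 1] between token sequences (0 = identical)."""
    a, b = source.lower().split(), generated.lower().split()
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def adjust(probs, sampled, score, baseline=0.5, lr=0.1):
    """Raise sampled triples' probabilities when the penalty is below the
    baseline and lower them when it is above (assumed update rule)."""
    step = lr * (baseline - score)
    chosen = set(sampled)
    return {
        t: min(0.99, max(0.01, p + step)) if t in chosen else p
        for t, p in probs.items()
    }


# Illustrative data only; none of these values come from the patent.
source = "alice works_at acme"
good = ("alice", "works_at", "acme")
bad = ("acme", "located_in", "alice")
probs = {good: 0.5, bad: 0.5}

# One random draw, to show the sampling step.
drawn = sample_triples(probs, random.Random(0))

# Two hand-picked samples, to make the effect of the penalty visible.
probs = adjust(probs, [good], penalty(source, generate_text([good])))
probs = adjust(probs, [bad], penalty(source, generate_text([bad])))
```

In this toy run the triple that reproduces the source text gains probability, while the one that generates mismatched text loses some. A real system would presumably replace each helper with a trained encoder or decoder model.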

Assignees

IBM

Classifications

Code	CPC title
G06N3/045	Combinations of networks
G06F40/205	Parsing
G06F40/126	Character encoding
G06F40/279	Recognition of textual entities
G06F40/40	Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30)

Patent family

Related publications grouped by family.

Patent family 89431593

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12217007B2 cover?

Embodiments are provided for unsupervised learning of domain specific knowledge graph from textual data and language generation from knowledge graph via reinforcement learning in a computing system by a processor. Unstructured data is automatically parsed into one or more knowledge graphs based on the unstructured data and a list of candidate relations using a first machine learning model. Text…

Who is the assignee on this patent?

IBM

Related patents

Supervised and unsupervised machine learning techniques for communication summarization

Automatic detection and association of new attributes with entities in knowledge bases

Method and apparatus for textual semantic encoding
