US11514247B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11514247-B2 |
| Application number | US-201916713062-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 13, 2019 |
| Priority date | May 31, 2019 |
| Publication date | Nov 29, 2022 |
| Grant date | Nov 29, 2022 |
Official abstract text for this publication.
A method, an apparatus, a computer device and a readable medium for knowledge hierarchical extraction of a text are disclosed. The method comprises: performing word segmentation on a designated text to obtain a word list, the word list including at least one word arranged in a sequence in the designated text; analyzing the part-of-speech of each word in the word list in the designated text, to obtain a part-of-speech list corresponding to the word list; and predicting a SPO triple included in the designated text according to the word list, the part-of-speech list and a pre-trained knowledge hierarchical extraction model. With these technical solutions, the SPO triple included in any designated text, however loosely organized and structured, may be accurately extracted based on the pre-trained knowledge hierarchical extraction model. Compared to the prior art, the efficiency and accuracy of knowledge hierarchical extraction may be effectively improved.
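The inputs and output named in the abstract can be pictured as simple aligned structures. The sketch below is illustrative only: the class and function names, the tag set, and the example sentence are assumptions that do not appear in the patent, and the segmentation and tags are written by hand rather than produced by any model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SPOTriple:
    """A subject-predicate-object triple; field names are illustrative."""
    subject: str
    predicate: str
    obj: str


def pair_words_with_pos(words: List[str], pos_tags: List[str]) -> List[Tuple[str, str]]:
    """Align the word list with its part-of-speech list, one tag per word."""
    if len(words) != len(pos_tags):
        raise ValueError("word list and part-of-speech list must have equal length")
    return list(zip(words, pos_tags))


# Hand-written example: the segmentation and tags are assumed, not model output.
words = ["Marie", "Curie", "was", "born", "in", "Warsaw"]
pos_tags = ["PROPN", "PROPN", "AUX", "VERB", "ADP", "PROPN"]
model_input = pair_words_with_pos(words, pos_tags)

# The kind of triple an extraction model would be expected to return for this text.
expected = SPOTriple(subject="Marie Curie", predicate="birthplace", obj="Warsaw")
```

The length check reflects the abstract's statement that the part-of-speech list corresponds to the word list; everything downstream of these inputs is the pre-trained model and is not sketched here.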
Opening claim text (preview).
What is claimed is:
1. A method for knowledge hierarchical extraction of a text, comprising: performing word segmentation on a designated text to obtain a word list, the word list including at least one word arranged in a sequence in the designated text; analyzing part-of-speech of each word in the word list in the designated text, to obtain a part-of-speech list corresponding to the word list; predicting a SPO triple included in the designated text according to the word list, the part-of-speech list and a pre-trained knowledge hierarchical extraction model, comprising: inputting the word list and the part-of-speech list into the knowledge hierarchical extraction model; obtaining, by an embedded layer, a word embedding expression based on the word list and a pre-trained word vector list; obtaining a part-of-speech embedding expression based on the part-of-speech list and a pre-trained part-of-speech vector list; obtaining, by a pre-trained Stacked Recurrent Neural Network layer, a bottom layer embedding expression which is of the designated text and carries context information, based on the word embedding expression and the part-of-speech embedding expression; and through two pre-trained fully-connected layers in turn, predicting a prediction relationship which is included in the designated text and whose prediction probability is greater than a preset probability threshold; further inputting the bottom layer embedding expression, the prediction probability of the prediction relationship and a feature expression corresponding to the prediction relationship into a pre-trained conditional random field network layer for sequence marking, so as to obtain an subject and an object corresponding to the prediction relationship; and outputting the SPO triple consisting of the subject, the object and the prediction relationship.
2. The method according to claim 1, further comprising: judging, according to a preset parameter set, whether the SPO triple predicted complies with a SPO triple structure preset in the parameter set, the parameter set comprising at least one preset SPO triple structure, each preset SPO triple structure comprising content of a relationship, and types of a subject and an object; if the SPO triple predicted complies with a SPO triple structure preset in the parameter set, determining that the SPO triple predicted is a target SPO triple of the designated text; otherwise, if the SPO triple predicted does not comply with a SPO triple structure preset in the parameter set, deleting the SPO triple.
3. The method according to claim 1, further comprising: before predicting the SPO triple included in the designated text according to the word list, the part-of-speech list and the pre-trained knowledge hierarchical extraction model, collecting a plurality of training texts and a known SPO triple included in each training text; training the knowledge hierarchical extraction model with the plurality of training texts and the known SPO triple included in each training text.
4. The method according to claim 3, wherein the training the knowledge hierarchical extraction model with the plurality of training texts and the known SPO triple included in each training texts comprises: performing word segmentation for each training text to obtain a training word list; the training word list including at least one training word arranged in a sequence in the training text; analyzing a part-of-speech of each training word in the training word list in the training text to obtain a training part-of-speech list corresponding to the training word list; training the knowledge hierarchical extraction model according to the training word list and the training part-of-speech list of each training text and known SPO triple in each training text.
5. The method according to claim 4, wherein the training the knowledge hierarchical extraction model according to the training word list and the training part-of-speech list of each training text and the known SPO triple in each training text comprises: initializing the word vector list, the part-of-speech vector list, parameters of the Stacked Recurrent Neural Network layer, parameters of the fully-connected layers and parameters of the conditional random field network layer in the knowledge hierarchical extraction model; inputting the training word list, the training part-of-speech list and the known SPO triple of each training text into the knowledge hierarchical extraction model, to obtain a predicted SPO triple output by the knowledge hierarchical extraction model; calculating a value of a loss function according to the known SPO triple and the predicted SPO triple; judging whether the value of the loss function is greater than or equal to a preset threshold; if the value of the loss function is greater than or equal to a preset threshold, adjusting the word vector list, the part-of-speech vector list, the parameters of the Stacked Recurrent Neural Network layer, the parameters of the fully-connected layers and the parameters of the conditional random field network layer in the knowledge hierarchical extraction model to make the value of the loss function smaller than the preset threshold; repeating the above steps, and constantly training the knowledge hierarchical extraction model with the training word list, the training part-of-speech list and the known SPO triple of each of the plurality of training texts in the above manner; if training times reach a preset training time threshold, or the value of the loss function is always smaller than a preset threshold within a range of consecutive preset times, determining the word vector list, the part-of-speech vector list, the parameters of the Stacked Recurrent Neural Network layer, the parameters of the fully-connected layers and the parameters of the conditional random field network layer in the knowledge hierarchical extraction model, and thereby determining the knowledge hierarchical extraction model.
6. The method according to claim 1, wherein the pre-trained Stacked Recurrent Neural Network layer includes a plurality of LSTM units, allowing each layer of LSTM units to learn an output sequence of a previous layer in an alternate forward and backward sequence respectively.
7. A computer device, comprising: one or more processors, a memory for storing one or more programs, the one or more programs, when executed by said one or more processors, enable said one or more processors to implement a method for knowledge hierarchical extraction of a text, which comprises: performing word segmentation on a designated text to obtain a word list, the word list including at least one word arranged in a sequence in the designated text; analyzing part-of-speech of each word in the word list in the designated text, to obtain a part-of-speech list corresponding to the word list; predicting a SPO triple included in the designated text according to the word list, the part-of-speech list and a pre-trained knowledge hierarchical extraction model, comprising: inputting the word list and the part-of-speech list into the knowledge hierarchical extraction model; obtaining, by an embedded layer, a word embedding expression based on the word list and a pre-trained word vector list; obtaining a part-of-speech embedding expression based on the part-of-speech list and a pre-trained part-of-speech vector list; obtaining, by a pre-trained Stacked Recurrent Neural Network layer, a bottom layer embedding expression which is of the designated text and carries context information, based on the word embedding expression and the part-of-speech embedding expression; and through two pre-trained fully-connected layers in turn, predicting a prediction relationship which is included in the designated text and whose predict
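The compliance check recited in claim 2 (keep a predicted triple only if its relationship and its subject and object types match a preset structure, otherwise delete it) can be sketched as a set lookup. This is a minimal illustration under stated assumptions, not the patented implementation: `TypedTriple`, `filter_by_schema`, the type labels, and the example schema are hypothetical, and the sketch assumes the subject and object types of each predicted triple are already known.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class TypedTriple:
    """A predicted triple with entity types attached (types assumed available)."""
    subject: str
    subject_type: str
    relation: str
    obj: str
    object_type: str


# Preset parameter set: each entry is (relation content, subject type, object type).
Schema = FrozenSet[Tuple[str, str, str]]


def filter_by_schema(predicted: List[TypedTriple], schema: Schema) -> List[TypedTriple]:
    """Keep triples whose (relation, subject type, object type) is preset; drop the rest."""
    return [
        t for t in predicted
        if (t.relation, t.subject_type, t.object_type) in schema
    ]


# Hypothetical schema and predictions, for illustration only.
schema: Schema = frozenset({("birthplace", "Person", "Place")})
predicted = [
    TypedTriple("Marie Curie", "Person", "birthplace", "Warsaw", "Place"),
    TypedTriple("Warsaw", "Place", "birthplace", "Marie Curie", "Person"),
]
kept = filter_by_schema(predicted, schema)
```

Here the second prediction has the right relationship but reversed subject and object types, so it does not comply with any preset structure and is dropped; only the first triple remains in `kept`.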
Parsing · CPC title
Semantic analysis · CPC title
based on specific statistical tests · CPC title
characterised by the process organisation or structure, e.g. boosting cascade · CPC title
Probabilistic graphical models, e.g. probabilistic networks · CPC title