Method and apparatus for training text classification model
US-2023016365-A1 · Jan 19, 2023 · US
US12079580B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12079580-B2 |
| Application number | US-202117348306-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 15, 2021 |
| Priority date | Nov 30, 2020 |
| Publication date | Sep 3, 2024 |
| Grant date | Sep 3, 2024 |
Official abstract text for this publication.
An information extraction method, an extraction model training method, an apparatus and an electronic device all relate to knowledge graphs. A specific implementation includes acquiring an input text and determining a semantic vector of the input text according to the input text. Such implementation also includes inputting the semantic vector of the input text to a pre-acquired extraction model to obtain a first enhanced text of the input text. The first enhanced text is a text with a text score greater than a preset threshold output by the extraction model. The extraction model performs text extraction based on the semantic vector of the input text. Since the semantic vector has rich context semantics, the enhanced text extracted by the extraction model can be more in line with the context of the input text.
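The selection rule in the abstract (keep only text whose score exceeds a preset threshold) can be illustrated with a minimal sketch. The function name `select_enhanced_text`, the candidate list, and the scores below are hypothetical and used for illustration only. The publication does not specify how the extraction model produces its scores, so they are supplied here as plain input data.

```python
from typing import List, Tuple


def select_enhanced_text(
    candidates: List[Tuple[str, float]], threshold: float
) -> List[str]:
    """Return candidate texts whose score is strictly greater than the threshold.

    `candidates` stands in for the scored output of an extraction model;
    the model itself is outside the scope of this sketch.
    """
    return [text for text, score in candidates if score > threshold]


# Hypothetical scored candidates for one input text.
candidates = [("knowledge graph", 0.91), ("graph", 0.40), ("input text", 0.75)]
enhanced = select_enhanced_text(candidates, threshold=0.5)
print(enhanced)
```

The comparison is strict (`>`), matching the abstract's wording "greater than a preset threshold", so a candidate scoring exactly at the threshold is dropped.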
Opening claim text (preview).
What is claimed is:
1. An information extraction method, comprising: acquiring an input text; determining a semantic vector of the input text according to the input text; and inputting the semantic vector of the input text to a pre-acquired extraction model to obtain a first enhanced text of the input text; wherein the method further comprises: after inputting the semantic vector of the input text to the pre-acquired extraction model to obtain the first enhanced text of the input text; performing boundary correction on the first enhanced text according to the input text to obtain a target enhanced text; wherein determining the semantic vector of the input text according to the input text comprises: performing identification conversion on each word in the input text to obtain an identification sequence of identifications which correspond to words in a one-to-one manner; and inputting the identification sequence to a bidirectional encoder representation model from transformers to obtain the semantic vector of the input text.
2. The method according to claim 1, wherein performing boundary correction on the first enhanced text according to the input text to obtain the target enhanced text comprises: performing word segmentation on the input text to obtain a word segmentation result; and performing boundary correction on first and last positions of the first enhanced text according to the word segmentation result to obtain the target enhanced text.
3. The method according to claim 2, wherein performing boundary correction on the first and last positions of the first enhanced text according to the word segmentation result to obtain the target enhanced text comprises: in a case that the first or last position of the first enhanced text does not match the word segmentation result, supplementing the first or last position of the first enhanced text according to the word segmentation result to obtain the target enhanced text.
4. An electronic device, comprising at least one processor; and a memory communicatively connected with the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to implement: acquiring an input text; determining a semantic vector of the input text according to the input text; inputting the semantic vector of the input text to a pre-acquired extraction model to obtain a first enhanced text of the input text; wherein the instructions are executed by the at least one processor to cause the at least one processor to implement: performing boundary correction on the first enhanced text according to the input text to obtain a target enhanced text; wherein the instructions are executed by the at least one processor to cause the at least one processor to implement: performing identification conversion on each word in the input text to obtain an identification sequence of identifications which correspond to words in a one-to-one manner; and inputting the identification sequence to a bidirectional encoder representation model from transformers to obtain the semantic vector of the input text.
5. The electronic device according to claim 4, wherein the instructions are executed by the at least one processor to cause the at least one processor to implement: performing word segmentation on the input text to obtain a word segmentation result; and performing boundary correction on first and last positions of the first enhanced text according to the word segmentation result to obtain the target enhanced text.
6. The electronic device according to claim 5, wherein, the instructions are executed by the at least one processor to cause the at least one processor to implement: in a case that the first or last position of the first enhanced text does not match the word segmentation result, supplementing the first or last position of the first enhanced text according to the word segmentation result to obtain the target enhanced text.
7. A non-transitory computer readable storage medium storing computer instructions, wherein the instructions are configured to cause a computer to implement the method according to claim 1.
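Two steps recited in the claims lend themselves to a short illustration: the identification conversion of claim 1 (each word maps to exactly one identifier) and the boundary correction of claims 2 and 3 (a span whose first or last position falls inside a segmented word is supplemented to the word boundary). The sketch below is one possible reading, not the patented implementation. The names `to_id_sequence` and `correct_boundaries`, the toy vocabulary, the unknown-word identifier, and the hand-written segmentation are all assumptions, and the encoder model that consumes the identification sequence is omitted.

```python
from typing import Dict, List, Tuple


def to_id_sequence(
    words: List[str], vocab: Dict[str, int], unk_id: int = 0
) -> List[int]:
    """Map each word to exactly one identifier (one-to-one by position).

    Words missing from the vocabulary fall back to `unk_id`; that fallback
    is an assumption of this sketch, not something the claims specify.
    """
    return [vocab.get(word, unk_id) for word in words]


def correct_boundaries(
    span: Tuple[int, int], segments: List[str]
) -> Tuple[int, int]:
    """Expand a [start, end) character span to word-segmentation boundaries.

    `segments` must concatenate to the original input text. If the start or
    end of the span falls strictly inside a segment, it is moved outward to
    that segment's edge; positions already on a boundary are left unchanged.
    """
    start, end = span
    new_start, new_end = start, end
    pos = 0
    for seg in segments:
        seg_start, seg_end = pos, pos + len(seg)
        if seg_start < start < seg_end:
            new_start = seg_start  # supplement the first position
        if seg_start < end < seg_end:
            new_end = seg_end  # supplement the last position
        pos = seg_end
    return new_start, new_end


# Identification conversion with a toy, hypothetical vocabulary.
vocab = {"[UNK]": 0, "knowledge": 1, "graphs": 2}
ids = to_id_sequence(["knowledge", "graphs", "rock"], vocab)

# Boundary correction: the raw span cuts into two words.
text = "Baidu builds knowledge graphs"
segments = ["Baidu", " ", "builds", " ", "knowledge", " ", "graphs"]
raw = "nowledge graph"
raw_start = text.find(raw)
fixed_start, fixed_end = correct_boundaries(
    (raw_start, raw_start + len(raw)), segments
)
print(ids, repr(text[fixed_start:fixed_end]))
```

The correction only ever widens the span, which matches the claim language of "supplementing" the first or last position. A span that already lines up with the segmentation result passes through unchanged.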
Supervised learning · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Partitioning the feature space · CPC title
Matching criteria, e.g. proximity measures · CPC title
Learning methods · CPC title