Systems and methods for unsupervised named entity recognition
US-2024242032-A1 · Jul 18, 2024 · US
US12450437B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12450437-B2 |
| Application number | US-202318326292-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 31, 2023 |
| Priority date | Jun 2, 2022 |
| Publication date | Oct 21, 2025 |
| Grant date | Oct 21, 2025 |
Official abstract text for this publication.
A method and an apparatus for named entity recognition, and a non-transitory computer-readable recording medium are provided. In the method, text elements are traversed according to a text span to obtain candidate entity words. Then, a class to which the candidate entity word belongs is recognized. The recognizing of the class includes generating a prompt template corresponding to the candidate entity word, and concatenating the text to be recognized and the prompt template to obtain a concatenated text; generating vector representations of the text elements in the concatenated text; generating the vector representation of the candidate entity word according to the vector representations of the text elements of each candidate entity word in the concatenated text, and the vector representation of the text element of the mask word; and classifying the vector representation of the candidate entity word to obtain the class of the candidate entity word.
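The first two steps in the abstract (traversing the text by span to collect candidate entity words, then concatenating the text with a per-candidate prompt template) can be illustrated with a minimal sketch. The template wording `<entity> is a [MASK] entity .`, the `[CLS]` and `[SEP]` markers, the whitespace-style tokenization, and the function names `enumerate_spans` and `build_concatenated_text` are all assumptions made for illustration; the publication excerpt does not specify them.

```python
from typing import List, Tuple

MASK = "[MASK]"   # assumed mask word standing in for the entity class
START = "[CLS]"   # assumed start identifier (cf. claim 3)
SEP = "[SEP]"     # assumed separator between the text and the prompt


def enumerate_spans(tokens: List[str], max_width: int) -> List[Tuple[int, int]]:
    """Return every (start, end) span, end exclusive, of width 1..max_width."""
    spans = []
    for start in range(len(tokens)):
        for width in range(1, max_width + 1):
            end = start + width
            if end > len(tokens):
                break
            spans.append((start, end))
    return spans


def build_concatenated_text(tokens: List[str], span: Tuple[int, int]):
    """Concatenate the text with a hypothetical prompt template for one candidate.

    Returns the concatenated tokens, the candidate's span in the original text,
    its span inside the prompt, and the index of the mask word.
    """
    start, end = span
    entity = tokens[start:end]
    # Hypothetical template: "<entity> is a [MASK] entity ."
    prompt = entity + ["is", "a", MASK, "entity", "."]
    concatenated = [START] + tokens + [SEP] + prompt

    text_span = (start + 1, end + 1)        # shifted by the start identifier
    prompt_start = len(tokens) + 2          # after START, the text, and SEP
    prompt_span = (prompt_start, prompt_start + len(entity))
    mask_index = prompt_span[1] + 2         # skips the words "is" and "a"
    return concatenated, text_span, prompt_span, mask_index


tokens = ["Alice", "visited", "New", "York"]
spans = enumerate_spans(tokens, max_width=2)
concatenated, text_span, prompt_span, mask_index = build_concatenated_text(tokens, (2, 4))
```

Tracking both the text span and the prompt span matters because the claims build one span representation from each occurrence of the candidate before combining them.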
Opening claim text (preview).
What is claimed is:

1. A method for named entity recognition, the method comprising: traversing text elements in a text to be recognized according to an original text span to obtain a plurality of candidate entity words; and for each candidate entity word, recognizing a class to which the candidate entity word belongs, wherein the recognizing of the class includes generating a prompt template corresponding to the candidate entity word, and concatenating the text to be recognized and the prompt template to obtain a concatenated text, the prompt template being used to learn the class to which the candidate entity word belongs by prompt learning, and the prompt template including the candidate entity word, and an entity class replaced with a mask word, according to a span in the prompt template; generating vector representations of the text elements in the concatenated text; generating the vector representation of the candidate entity word according to the vector representations of the text elements of each candidate entity word in the concatenated text, and the vector representation of the text element of the mask word; and classifying the vector representation of the candidate entity word to obtain the class to which the candidate entity word belongs, wherein the generating of the vector representation of the candidate entity word includes performing a first integration process on the vector representations of the text elements of the candidate entity word in the text to be recognized to obtain a first span representation of the candidate entity word; performing the first integration process on the vector representations of the text elements of the candidate entity word in the prompt template to obtain a second span representation of the candidate entity word; and generating the vector representation of the candidate entity word according to the first span representation, the second span representation, and the vector representation of the text element of the mask word, wherein the generating of the vector representation of the candidate entity word further includes performing a second integration process on the first span representation and the second span representation to obtain a third span representation; and concatenating the third span representation and the vector representation of the text element of the mask word to obtain the vector representation of the candidate entity word, and wherein the original text span and the span in the prompt template are combined in the generating of the vector representation of the candidate entity word to obtain a final representation of the candidate entity word, and the final representation of the candidate entity word is provided to a neural network model.

2. The method for named entity recognition as claimed in claim 1, wherein the generating of the vector representation of the candidate entity word includes obtaining a vector representation corresponding to a width value of a text span of the candidate entity word, and concatenating the third span representation and the vector representation corresponding to the width value of the text span of the candidate entity word to obtain a fourth span representation; and concatenating the fourth span representation and the vector representation of the text element of the mask word to obtain the vector representation of the candidate entity word.

3. The method for named entity recognition as claimed in claim 1, wherein the concatenated text includes a start identifier, and the generating of the vector representation of the candidate entity word includes generating the vector representation of the candidate entity word according to the first span representation, the second span representation, the vector representation of the start identifier, and the vector representation of the text element of the mask word.

4. The method for named entity recognition as claimed in claim 3, wherein the generating of the vector representation of the candidate entity word includes concatenating the third span representation, the vector representation of the start identifier, and the vector representation of the text element of the mask word to obtain the vector representation of the candidate entity word.

5. The method for named entity recognition as claimed in claim 3, wherein the generating of the vector representation of the candidate entity word includes obtaining a vector representation corresponding to a width value of a text span of the candidate entity word, and concatenating the third span representation and the vector representation corresponding to the width value of the text span of the candidate entity word to obtain a fourth span representation; and concatenating the fourth span representation, the vector representation of the start identifier, and the vector representation of the text element of the mask word to obtain the vector representation of the candidate entity word.

6. The method for named entity recognition as claimed in claim 1, wherein the first integration process includes any one of a max pooling process, an average pooling process, and concatenating of the vector representations of the first text element and the last text element in the candidate entity word, and the second integration process includes any one of a max pooling process and an average pooling process.

7. The method for named entity recognition as claimed in claim 1, wherein the classifying of the vector representation of the candidate entity word includes inputting the vector representation of the candidate entity word into a softmax function to obtain at least one probability that the candidate entity word is mapped to different candidate classes, which is output by the softmax function; and selecting the candidate class with the highest probability serving as the class to which the candidate entity word belongs.

8. An apparatus for named entity recognition, the apparatus comprising: a memory storing computer-executable instructions; and one or more processors configured to execute the computer-executable instructions such that the one or more processors are configured to perform traversing text elements in a text to be recognized according to an original text span to obtain a plurality of candidate entity words; and for each candidate entity word, recognizing a class to which the candidate entity word belongs, wherein the recognizing of the class includes generating a prompt template corresponding to the candidate entity word, and concatenating the text to be recognized and the prompt template to obtain a concatenated text, the prompt template being used to learn the class to which the candidate entity word belongs by prompt learning, and the prompt template including the candidate entity word, and an entity class replaced with a mask word, according to a span in the prompt template; generating vector representations of the text elements in the concatenated text; generating the vector representation of the candidate entity word according to the vector representations of the text elements of each candidate entity word in the concatenated text, and the vector representation of the text element of the mask word; and classifying the vector representation of the candidate entity word to obtain the class to which the candidate entity word belongs, wherein the generating of the vector representation of the candidate entity word includes performing a first integration process on the vector representations of the text elements of the candidate entity word in the text to be recognized to obtain a first span representation of the candidate entity word; performing the first integration process on the vector representations of the text
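
The representation-building and classification steps recited in claims 1, 2, 6, and 7 could be sketched roughly as below. This is an illustration under stated assumptions, not the patented implementation: max pooling is chosen as the first integration process and average pooling as the second (claim 6 permits other options), the encoder output `hidden` is a toy list of vectors, `width_table` is a hypothetical lookup of width embeddings, and the linear projection in `classify` is an assumption added so that the softmax has one score per candidate class.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Vector = List[float]


def max_pool(vectors: Sequence[Vector]) -> Vector:
    """Element-wise maximum over a sequence of equal-length vectors."""
    return [max(column) for column in zip(*vectors)]


def avg_pool(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean over a sequence of equal-length vectors."""
    return [sum(column) / len(column) for column in zip(*vectors)]


def candidate_representation(
    hidden: Sequence[Vector],
    text_span: Tuple[int, int],
    prompt_span: Tuple[int, int],
    mask_index: int,
    width_table: Optional[Dict[int, Vector]] = None,
) -> Vector:
    """Build the candidate vector from both spans and the mask position."""
    first = max_pool(hidden[text_span[0]:text_span[1]])       # first span representation
    second = max_pool(hidden[prompt_span[0]:prompt_span[1]])  # second span representation
    third = avg_pool([first, second])                         # second integration process
    rep = list(third)
    if width_table is not None:
        # Optional width embedding (claim 2), giving the "fourth" representation.
        rep += width_table[text_span[1] - text_span[0]]
    return rep + list(hidden[mask_index])


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(rep: Vector, weights: Sequence[Vector], labels: Sequence[str]):
    """Assumed linear layer followed by softmax; returns the most probable label."""
    logits = [sum(w * x for w, x in zip(row, rep)) for row in weights]
    probs = softmax(logits)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs


# Toy encoder output: six positions, two dimensions each (made-up values).
hidden = [[0.0, 0.0], [1.0, 2.0], [3.0, 0.0], [5.0, 1.0], [1.0, 4.0], [0.5, 0.5]]
rep = candidate_representation(hidden, (1, 3), (3, 5), mask_index=5)
rep_with_width = candidate_representation(hidden, (1, 3), (3, 5), 5, {2: [9.0]})
label, probs = classify(rep, [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]], ["LOC", "PER"])
```

Here the weight rows and the labels `LOC` and `PER` are placeholders; in practice the classifier parameters would be learned during prompt learning rather than fixed by hand.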
Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title
Named entity recognition · CPC title