What technology area does this patent fall under?

Primary CPC classification G06F40/295. Mapped technology areas include Physics.

When was this patent published?

Publication date: May 21, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Normalized processing method and apparatus of named entity, and electronic device

US11989518B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11989518-B2
Application number	US-202117506726-A
Country	US
Kind code	B2
Filing date	Oct 21, 2021
Priority date	Oct 22, 2020
Publication date	May 21, 2024
Grant date	May 21, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A normalized processing method of a named entity includes: obtaining first text data; recognizing a named entity from the first text data; determining whether a first standard named entity exists in a standard named entity database according to the named entity; determining the first standard named entity as a normalized representation of the named entity in response to determining that the first standard named entity exists in the standard named entity database; and obtaining a second standard named entity from the standard named entity database and determining an obtained second standard named entity as the normalized representation of the named entity in response to determining that the first standard named entity does not exist in the standard named entity database.
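The two-branch flow in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: `normalize_entity`, the toy `closest_by_shared_chars` fallback, and the sample database entries do not come from the patent, and the fallback is a deliberately simple stand-in for the similarity-based selection that the claims describe.

```python
from typing import Callable, Iterable, List, Optional


def normalize_entity(
    entity: str,
    standard_entities: Iterable[str],
    fallback: Callable[[str, List[str]], Optional[str]],
) -> Optional[str]:
    """Return a normalized representation of `entity`.

    Branch 1: a matching standard entity exists, so use it
    (the "first standard named entity" in the abstract).
    Branch 2: no match exists, so defer to `fallback` to obtain a
    "second standard named entity" from the same database.
    """
    standards = list(standard_entities)
    if entity in standards:
        return entity
    return fallback(entity, standards)


def closest_by_shared_chars(entity: str, standards: List[str]) -> Optional[str]:
    """Toy fallback (assumption, not the patented method): pick the
    standard entity that shares the most distinct characters."""
    if not standards:
        return None
    return max(standards, key=lambda s: len(set(s) & set(entity)))


# Hypothetical database contents, chosen only for the example.
database = ["myocardial infarction", "hypertension", "type 2 diabetes"]
exact = normalize_entity("hypertension", database, closest_by_shared_chars)
approx = normalize_entity("hypertensive", database, closest_by_shared_chars)
print(exact, "|", approx)
```

Passing the fallback as a parameter keeps the exact-match branch independent of whichever similarity method is plugged in for the second branch.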

First claim

Opening claim text (preview).

What is claimed is:

1. A normalized processing method of a named entity, comprising:
obtaining first text data;
recognizing a named entity from the first text data;
determining whether a first standard named entity exists in a standard named entity database according to the named entity, the first standard named entity being a standard named entity whose character string matches a character string of one of the named entity and an extended named entity, and the extended named entity being obtained by performing a synonym substitution on at least part of words of the named entity;
determining the first standard named entity as a normalized representation of the named entity in response to determining that the first standard named entity exists in the standard named entity database; and
obtaining a second standard named entity from the standard named entity database, and determining an obtained second standard named entity as the normalized representation of the named entity in response to determining that the first standard named entity does not exist in the standard named entity database, the second standard named entity being a standard named entity whose word vector similarity to the named entity in the standard named entity database satisfies a preset condition,
wherein obtaining the second standard named entity from the standard named entity database, includes:
determining a word vector similarity between each standard named entity in the standard named entity database and the named entity based on a word vector similarity matching algorithm; and
determining the standard named entity whose word vector similarity to the named entity in the standard named entity database satisfies the preset condition as the second standard named entity,
wherein determining the word vector similarity between each standard named entity in the standard named entity database and the named entity based on the word vector similarity matching algorithm, includes:
calculating a length of a longest common subsequence of the named entity and each standard named entity in the standard named entity database;
sequencing standard named entities in the standard named entity database to obtain a standard named entity candidate list according to lengths of the longest common subsequences; and
sequentially inputting each standard named entity in the standard named entity candidate list and the named entity into a semantic model based on a word vector, so as to obtain the word vector similarity between the named entity and the standard named entity,
wherein the semantic model based on the word vector includes a bi-directional encoder representation from transformers (BERT) model; and a fully connected layer of the BERT model is implemented by using a softmax classifier or a sigmoid classifier.

2. The normalized processing method according to claim 1, wherein recognizing the named entity from the first text data, includes: deleting a first text in the first text data to obtain second text data, the first text including at least one stop word and/or at least one designated symbol; and recognizing the named entity from the second text data.

3. The normalized processing method according to claim 2, wherein the second text data is a long text, and recognizing the named entity from the second text data, includes: using a first named entity recognition algorithm to recognize the named entity from the second text data, the first named entity recognition algorithm being a named entity recognition algorithm for the long text.

4. The normalized processing method according to claim 3, wherein before recognizing the named entity from the second text data, recognizing the named entity from the first text data, further includes: determining whether a text length of the second text data is greater than a preset text length threshold; using the second text data as the long text in response to determining that the text length of the second text data is greater than the preset text length threshold.

5. The normalized processing method according to claim 3, wherein the first named entity recognition algorithm includes a named entity recognition algorithm based on a bi-directional long-short term memory network (BiLSTM) and a conditional random field (CRF).

6. The normalized processing method according to claim 2, wherein the second text data is a short text, and recognizing the named entity from the second text data, includes: using a second named entity recognition algorithm to recognize the named entity from the second text data, the second named entity recognition algorithm being a named entity recognition algorithm for the short text.

7. The normalized processing method according to claim 6, wherein before recognizing the named entity from the second text data, recognizing the named entity from the first text data, further includes: determining whether a text length of the second text data is greater than a preset text length threshold; using the second text data as the short text in response to determining that the text length of the second text data is less than or equal to the preset text length threshold.

8. The normalized processing method according to claim 6, wherein the second named entity recognition algorithm includes a named entity recognition algorithm based on a regular expression.

9. The normalized processing method according to claim 1, wherein determining whether the first standard named entity exists in the standard named entity database according to the named entity, includes: searching for the standard named entity whose character string matches the character string of the named entity in the standard named entity database; and searching for the standard named entity whose character string matches the character string of the extended named entity in the standard named entity database in response to determining that the standard named entity whose character string matches the character string of the named entity is not found, wherein the found standard named entity whose character string matches the character string of the named entity or the extended named entity is used as the first standard named entity.

10. The normalized processing method according to claim 9, wherein the extended named entity is obtained by performing a complete synonym substitution on the named entity, and the complete synonym substitution is a synonym substitution on the named entity as a whole.

11. The normalized processing method according to claim 9, wherein the extended named entity is obtained by performing a partial synonym substitution on the named entity, and the partial synonym substitution is a synonym substitution on at least one named entity word segmentation obtained by performing a word segmentation processing on the named entity.

12. The normalized processing method according to claim 11, wherein performing the partial synonym substitution on the named entity, includes: performing the word segmentation processing on the named entity to obtain a plurality of named entity word segmentations; and traversing a partial synonym mapping table according to the plurality of named entity word segmentations, and substituting at least one traversed named entity word segmentation for a synonym to obtain the extended named entity.

13. The normalized processing method according to claim 1, wherein the preset condition is that the word vector similarity between the named entity and the standard named entity reaches a preset similarity threshold, or the preset condition is that the named entity and one standard named entity in the standard named entity database have a highest wo…
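The candidate-ranking step recited in claim 1 (compute the longest common subsequence length against each standard entity, sort by it, then score candidates in that order) can be sketched as follows. This is a rough illustration under stated assumptions: the function names, the 0.5 threshold, and the sample strings are hypothetical, and `stand_in_similarity` uses `difflib.SequenceMatcher` purely as a placeholder where the claim calls for a BERT-based semantic model.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Optional


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Standard dynamic programming, keeping only the previous row.
    """
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rank_candidates(entity: str, standards: List[str]) -> List[str]:
    """Candidate list: standard entities ordered by LCS length, longest first."""
    return sorted(standards, key=lambda s: lcs_length(entity, s), reverse=True)


def stand_in_similarity(a: str, b: str) -> float:
    """Placeholder scorer (NOT BERT): character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def second_standard_entity(
    entity: str,
    standards: List[str],
    similarity: Callable[[str, str], float] = stand_in_similarity,
    threshold: float = 0.5,  # hypothetical "preset similarity threshold"
) -> Optional[str]:
    """Score candidates in ranked order; return the first one reaching the threshold."""
    for candidate in rank_candidates(entity, standards):
        if similarity(entity, candidate) >= threshold:
            return candidate
    return None


# Hypothetical inputs for illustration only.
ranked = rank_candidates("heart attack", ["hypertension", "heart attack (disorder)", "asthma"])
chosen = second_standard_entity("heart atack", ["asthma", "heart attack"])
print(ranked[0], "|", chosen)
```

Sorting by LCS length first means the more expensive scorer sees the most string-similar candidates earliest, so a threshold-based condition can stop before the whole database is scored.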

Assignees

Boe Technology Group Co Ltd

Inventors

Classifications

G06N3/0442
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
G06N3/09
Supervised learning · CPC title
G06F40/295 · Primary
Named entity recognition · CPC title
G06F16/3344
using natural language analysis · CPC title
G06F40/247
Thesauruses; Synonyms · CPC title

Patent family

Related publications grouped by family.

View patent family 74264639

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11989518B2 cover?: A normalized processing method of a named entity includes: obtaining first text data; recognizing a named entity from the first text data; determining whether a first standard named entity exists in a standard named entity database according to the named entity; determining the first standard named entity as a normalized representation of the named entity in response to determining that the fir…
Who is the assignee on this patent?: Boe Technology Group Co Ltd

Related patents

Gazetteer integration for neural named entity recognition

Architecture for gazetteer-augmented named entity recognition

System and method for recognizing domain specific named entities using domain specific word embeddings

Named entity normalization in a spoken dialog system

Identifying entities in a digital work

Identifying entity synonyms

Electronic device and method for recognizing named entities in electronic device
