Who is the assignee on this patent?

Beijing Baidu Netcom Sci & Tech Co Ltd

What technology area does this patent fall under?

The primary CPC classification is G06F40/295 (Named entity recognition). Mapped technology areas include Physics.

When was this patent published?

Publication date: Jul 18, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method, electronic device, and storage medium for entity linking by determining a linking probability based on splicing of embedding vectors of a target and a reference text

US11704492B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11704492-B2
Application number	US-202117213927-A
Country	US
Kind code	B2
Filing date	Mar 26, 2021
Priority date	Apr 23, 2020
Publication date	Jul 18, 2023
Grant date	Jul 18, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, apparatus, device, and storage medium for entity linking is disclosed. The method includes: acquiring a target text; determining at least one entity mention included in the target text; determining a candidate entity corresponding to each of the entity mention based on a preset knowledge base; determining a reference text of each of the candidate entity and determining additional feature information of each of the candidate entity; and determining an entity linking result based on the target text, each of the reference text, and each piece of the additional feature information, wherein determining the entity linking result includes determining a probability of linking each of the candidate entity to the entity mention based on a splicing of a first embedding vector and a second embedding vector of the target text and a splicing of a first embedding vector and a second embedding vector of each respective reference text.
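The splicing step named in the abstract can be illustrated with a minimal Python sketch. It assumes that "splicing" means plain vector concatenation and follows the ordering given in claim 1 on this page: each candidate's two reference-text embeddings and its additional features form a first spliced vector, and the target text's two embeddings plus every first spliced vector form a second spliced vector. The names `splice`, `link_probabilities`, and `toy_score` are hypothetical, `toy_score` is only a stand-in for the "preset classification model", and the softmax normalization is an assumption rather than something the patent text states.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
# One candidate: (first embedding, second embedding, additional feature information)
Candidate = Tuple[Sequence[float], Sequence[float], Sequence[float]]
Scorer = Callable[[Sequence[float], Sequence[float]], float]


def splice(*parts: Sequence[float]) -> Vector:
    """Concatenate vectors end to end (the assumed meaning of "splicing")."""
    out: Vector = []
    for part in parts:
        out.extend(part)
    return out


def link_probabilities(
    target_first: Sequence[float],
    target_second: Sequence[float],
    candidates: Sequence[Candidate],
    score: Scorer,
) -> Vector:
    """Return one linking probability per candidate entity."""
    if not candidates:
        return []
    # First spliced vector: one per reference text / candidate entity.
    first_spliced = [splice(first, second, extra) for first, second, extra in candidates]
    # Second spliced vector: target embeddings followed by every first spliced vector.
    second_spliced = splice(target_first, target_second, *first_spliced)
    # Score each candidate, then normalize with a softmax (an assumption).
    logits = [score(vec, second_spliced) for vec in first_spliced]
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_score(first_spliced: Sequence[float], second_spliced: Sequence[float],
              text_dim: int = 4) -> float:
    """Hypothetical classifier stub: dot product of the leading text-embedding parts."""
    return sum(a * b for a, b in zip(first_spliced[:text_dim], second_spliced[:text_dim]))


probs = link_probabilities(
    [1.0, 0.0], [0.0, 1.0],
    [([1.0, 0.0], [0.0, 1.0], [0.5]),   # reference text resembling the target
     ([0.0, 1.0], [1.0, 0.0], [0.5])],  # reference text unlike the target
    toy_score,
)
```

With these toy inputs the first candidate receives the higher probability, because its reference embeddings match the target's. A real system would replace `toy_score` with a trained model.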

First claim

Opening claim text (preview).

What is claimed is:

1. A method for entity linking, comprising: acquiring a target text; determining at least one entity mention included in the target text; determining a candidate entity corresponding to each of the entity mention based on a preset knowledge base; determining a reference text of each of the candidate entity and determining additional feature information of each of the candidate entity; and determining an entity linking result based on the target text, each of the reference text, and each piece of the additional feature information, wherein the determining the entity linking result based on the target text, each of the reference text, and each piece of the additional feature information comprises: determining a first embedding vector of the target text, a second embedding vector of the target text, a first embedding vector of each of the reference text, and a second embedding vector of each of the reference text respectively; splicing, for each reference text, the first embedding vector of the reference text, the second embedding vector of the reference text, and additional feature information of a candidate entity corresponding to the reference text, to obtain a first spliced vector; splicing the first embedding vector of the target text, the second embedding vector of the target text, and each of the first spliced vector, to obtain a second spliced vector; and determining a probability of linking each of the candidate entity to the entity mention based on each of the first spliced vector, the second spliced vector, and a preset classification model.

2. The method according to claim 1, wherein the determining the at least one entity mention included in the target text comprises: determining a text embedding vector and a relevant eigenvector of the target text; fusing the text embedding vector and the relevant eigenvector to obtain a fused vector; and determining the at least one entity mention based on the fused vector.

3. The method according to claim 2, wherein the determining the at least one entity mention based on the fused vector comprises: performing attention enhancement on the fused vector to obtain an enhanced vector; classifying the enhanced vector twice to obtain a head position and a tail position of each of the entity mention; and determining each of the entity mention based on the obtained head position and the obtained tail position.

4. The method according to claim 1, wherein the determining the reference text of each of the candidate entity comprises: acquiring, for each candidate entity, at least one description text of the candidate entity; and splicing each of the description text to obtain the reference text of the candidate entity.

5. The method according to claim 1, wherein the additional feature information comprises an entity embedding vector; and the determining the additional feature information of each of the candidate entity comprises: acquiring, for each candidate entity, description information of the candidate entity; acquiring a triplet sequence related to the candidate entity; and determining the entity embedding vector of the candidate entity based on the candidate entity, the description information, the triplet sequence, and a pretrained vector determining model.

6. The method according to claim 1, wherein the additional feature information comprises at least one upperseat concept and a probability corresponding to each of the upperseat concept; and the determining the additional feature information of each of the candidate entity comprises: determining, for each candidate entity, at least one upperseat concept of the candidate entity and the probability corresponding to each of the upperseat concept based on the candidate entity and a preset concept predicting model, to obtain a probability sequence.

7. The method according to claim 1, wherein the determining the first embedding vector of the target text, the second embedding vector of the target text, the first embedding vector of each of the reference text, and the second embedding vector of each of the reference text comprises: determining a word embedding vector of the target text, a character embedding vector of the target text, a word embedding vector of each of the reference text, and a character embedding vector of each of the reference text respectively; determining the first embedding vector of the target text based on the word embedding vector of the target text, the character embedding vector of the target text, and a first preset vector determining model; determining the second embedding vector of the target text based on the target text and a second preset vector determining model; and determining, for each reference text, the first embedding vector of the reference text based on the word embedding vector of the reference text, the character embedding vector of the reference text, and the first preset vector determining model; and determining the second embedding vector of the reference text based on the reference text and the second preset vector determining model.

8. An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor, wherein: the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, such that the at least one processor can perform operations comprising: acquiring a target text; determining at least one entity mention included in the target text; determining a candidate entity corresponding to each of the entity mention based on a preset knowledge base; determining a reference text of each of the candidate entity and determining additional feature information of each of the candidate entity; and determining an entity linking result based on the target text, each of the reference text, and each piece of the additional feature information, wherein the determining the entity linking result based on the target text, each of the reference text, and each piece of the additional feature information comprises: determining a first embedding vector of the target text, a second embedding vector of the target text, a first embedding vector of each of the reference text, and a second embedding vector of each of the reference text respectively; splicing, for each reference text, the first embedding vector of the reference text, the second embedding vector of the reference text, and additional feature information of a candidate entity corresponding to the reference text, to obtain a first spliced vector; splicing the first embedding vector of the target text, the second embedding vector of the target text, and each of the first spliced vector, to obtain a second spliced vector; and determining a probability of linking each of the candidate entity to the entity mention based on each of the first spliced vector, the second spliced vector, and a preset classification model.

9. The electronic device according to claim 8, wherein the determining the at least one entity mention included in the target text comprises: determining a text embedding vector and a relevant eigenvector of the target text; fusing the text embedding vector and the relevant eigenvector to obtain a fused vector; and determining the at least one entity mention based on the fused vector.

10. The electronic device according to claim 9, wherein the determining the at least one entity mention based on the fused vector comprises: performing attention enhancement on the fused vector to obtain an enhanced vector; classifying the enhanced vector twice to obtain a head position and a tail position of each of the entity mention
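The head and tail classification described in claims 3 and 10 can be sketched in Python as follows. This is a rough illustration, not the patented implementation: it assumes the two classifiers have already produced a per-token head probability and tail probability, and it pairs each head with the nearest tail at or after it. The function name `decode_mentions`, the 0.5 threshold, the `max_len` cap, and the nearest-tail pairing rule are all assumptions introduced for the example.

```python
from typing import List, Sequence, Tuple

# (head index, tail index, mention text); both indices are inclusive
Mention = Tuple[int, int, str]


def decode_mentions(
    tokens: Sequence[str],
    head_probs: Sequence[float],
    tail_probs: Sequence[float],
    threshold: float = 0.5,
    max_len: int = 10,
) -> List[Mention]:
    """Turn per-token head/tail probabilities into entity mention spans."""
    if not (len(tokens) == len(head_probs) == len(tail_probs)):
        raise ValueError("tokens, head_probs and tail_probs must have equal length")
    mentions: List[Mention] = []
    for start, head_p in enumerate(head_probs):
        if head_p < threshold:
            continue  # this token is not predicted to start a mention
        # Pair the head with the nearest predicted tail within max_len tokens.
        for end in range(start, min(start + max_len, len(tokens))):
            if tail_probs[end] >= threshold:
                mentions.append((start, end, " ".join(tokens[start:end + 1])))
                break
    return mentions


tokens = ["Jordan", "played", "for", "the", "Chicago", "Bulls"]
head_probs = [0.9, 0.1, 0.0, 0.1, 0.8, 0.2]  # hypothetical classifier outputs
tail_probs = [0.8, 0.0, 0.1, 0.0, 0.1, 0.9]
mentions = decode_mentions(tokens, head_probs, tail_probs)
# mentions == [(0, 0, "Jordan"), (4, 5, "Chicago Bulls")]
```

Predicting heads and tails separately lets a single pass recover spans of different lengths. How overlapping or nested mentions should be resolved is not covered by the claim text shown here.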

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

G06N3/09
Supervised learning · CPC title
G06N3/0442
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
G06F40/295 · Primary
Named entity recognition · CPC title
G06F16/3344
using natural language analysis · CPC title
G06F40/30
Semantic analysis · CPC title

Patent family

Related publications grouped by family.

View patent family 71903467

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11704492B2 cover?: A method, apparatus, device, and storage medium for entity linking is disclosed. The method includes: acquiring a target text; determining at least one entity mention included in the target text; determining a candidate entity corresponding to each of the entity mention based on a preset knowledge base; determining a reference text of each of the candidate entity and determining additional feat…