Natural language processing method and apparatus, device, and readable storage medium

US2024338524A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2024338524-A1
Application number  US-202218696336-A
Country             US
Kind code           A1
Filing date         Jun 30, 2022
Priority date       Jan 5, 2022
Publication date    Oct 10, 2024
Grant date          (not listed)


Abstract

Official abstract text for this publication.

A natural language processing method and apparatus, a device, and a readable storage medium, where the method includes: obtaining a target sentence to be processed, and determining each entity in the target sentence (S101); for each entity in the target sentence, in response to the entity being present in a preset entity set, determining extended information for the entity, and adding the determined extended information after a location of the entity in the target sentence, to obtain an updated target sentence (S102); and inputting the updated target sentence to a bidirectional encoder representations from transformer (BERT) model, such that the BERT model performs a natural language processing task, where in a process in which the BERT model performs the natural language processing task, an attention score between extended information of any entity in the target sentence and another entity in the target sentence is adjusted to zero (S103).
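The masking step in S103 can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions: the function names, the per-token `entity_ids` / `extension_ids` labelling, and the toy sentence are hypothetical, and the abstract does not say whether the score is zeroed before or after softmax normalization. The sketch takes the common approach of setting the blocked raw scores to negative infinity so that the normalized attention weight becomes exactly zero, and it applies the block in both directions.

```python
import numpy as np

def build_block_mask(entity_ids, extension_ids):
    """Return a boolean matrix blocked[q, k] (hypothetical helper).

    entity_ids[t]    = index of the entity that token t belongs to, else -1.
    extension_ids[t] = index of the entity whose extended information
                       token t belongs to, else -1.
    A pair is blocked when one token is extended information of entity a
    and the other token is part of a different entity b.
    """
    ent = np.asarray(entity_ids)
    ext = np.asarray(extension_ids)
    ext_to_ent = (
        (ext[:, None] >= 0)
        & (ent[None, :] >= 0)
        & (ext[:, None] != ent[None, :])
    )
    # Assumption: the block is symmetric (query->key and key->query).
    return ext_to_ent | ext_to_ent.T

def masked_attention_weights(scores, blocked):
    """Softmax over the last axis with blocked pairs forced to weight 0."""
    scores = np.where(blocked, -np.inf, scores)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

# Toy updated sentence: entity 0, its three extension tokens, a plain
# token, and entity 1. The tokens are invented for illustration.
tokens        = ["Apple", "is", "a", "company", "hired", "Tim"]
entity_ids    = [0, -1, -1, -1, -1, 1]
extension_ids = [-1, 0, 0, 0, -1, -1]

blocked = build_block_mask(entity_ids, extension_ids)
raw_scores = np.zeros((len(tokens), len(tokens)))  # placeholder scores
weights = masked_attention_weights(raw_scores, blocked)
```

In this toy case the extension tokens of entity 0 still attend to entity 0 and to ordinary tokens, while every pairing between them and entity 1 receives zero weight.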

First claim

Opening claim text (preview).

1. A natural language processing method, comprising: obtaining a target sentence to be processed, and determining each entity in the target sentence; for each entity in the target sentence, in response to the entity being present in a preset entity set, determining extended information for the entity, and adding the determined extended information after a location of the entity in the target sentence, to obtain an updated target sentence; and inputting the updated target sentence to a bidirectional encoder representations from transformer (BERT) model, such that the BERT model performs a natural language processing task, wherein in a process in which the BERT model performs the natural language processing task, an attention score between extended information of any entity in the target sentence and another entity in the target sentence is tuned to zero.

2. The method according to claim 1, wherein the determining the extended information for the entity comprises: taking the entity as a target object, and determining, in the preset entity set, an entity group in relation with the target object; and selecting, from the entity group, an entity which has a relation probability value greater than a first threshold, and generating extended information of the target object based on the selected entity.

3. The method according to claim 2, wherein the determining, in the preset entity set, the entity group in relation with the target object comprises: generating an N×N×M-dimensional tensor for representing a relation and a relation probability value between entities in the preset entity set, wherein N represents a quantity of entities comprised in the preset entity set, and M represents a quantity of relations between different entities in the preset entity set; and generating a knowledge graph based on the N×N×M-dimensional tensor, and querying, in the knowledge graph, the entity group in relation with the target object.

4. The method according to claim 3, wherein the generating the N×N×M-dimensional tensor for representing the relation and the relation probability value between entities in the preset entity set comprises: generating an initial tensor that is all-0 in N×N×M dimensions; obtaining a sentence library for generating the preset entity set, traversing each sentence in the sentence library, and taking the traversed sentence as a sentence to be recognized; taking two adjacent entities in the sentence to be recognized as an entity group, to obtain a plurality of entity groups; recognizing the relation between two entities in each entity group by using a relation recognition model, to obtain a plurality of M-dimensional relation vectors; for each of the plurality of M-dimensional relation vectors, in response to a maximum value in any of the plurality of M-dimensional relation vectors being greater than a second threshold, updating an element at a location, in the initial tensor, which corresponds to the maximum value, from 0 to 1, to update the initial tensor; and traversing and updating a next sentence in the sentence library, and after each sentence in the sentence library is traversed, outputting and optimizing a currently obtained tensor to obtain the N×N×M-dimensional tensor.

5. The method according to claim 4, wherein the recognizing the relation between two entities in each entity group by using the relation recognition model, to obtain the plurality of M-dimensional relation vectors comprises: for two entities in any entity group, replacing the two entities in the sentence to be recognized with different identifiers to obtain a replaced sentence, and inputting the replaced sentence to the relation recognition model, such that the relation recognition model outputs an M-dimensional relation vector corresponding to the two entities.

6. The method according to claim 5, wherein the optimizing the currently obtained tensor to obtain the N×N×M-dimensional tensor comprises: forming an initial three-dimensional matrix with the currently obtained tensor, and decomposing the initial three-dimensional matrix into M pieces of N×N-dimensional matrices $X_i$, wherein $i = 1, 2, \ldots, M$; decomposing a d×d×M-dimensional tensor O through initialization into M pieces of d×d-dimensional matrices $O_i$, wherein d represents an adjustable hyper-parameter; obtaining an N×d-dimensional matrix A through initialization, and calculating optimal $A'$ and M pieces of optimal $O_i'$ based on $X_i = A O_i A^{T}$ and a gradient descent method; obtaining a new three-dimensional matrix based on the optimal $A'$ and the M pieces of optimal $O_i'$; and comparing the initial three-dimensional matrix with the new three-dimensional matrix bit by bit based on a max function, and reserving a maximum value at each location, to obtain the N×N×M-dimensional tensor.

7. The method according to claim 5, wherein the relation recognition model comprises a sub-model of a transformer structure and a relation classification neural network; and the inputting the replaced sentence to the relation recognition model, such that the relation recognition model outputs the M-dimensional relation vector corresponding to the two entities comprises: inputting the replaced sentence to the sub-model of the transformer structure, to obtain a feature vector with the identifiers of the two entities; and inputting the feature vector with the identifiers of the two entities to the relation classification neural network, to obtain the M-dimensional relation vector corresponding to the two entities.

8. The method according to claim 1, wherein the determining the extended information for the entity comprises: taking the entity as a target object, and determining, in the preset entity set, an object entity in maximum correlation with the target object, wherein the object entity is another entity in the preset entity set except the target object; and generating extended information of the target object based on the object entity in maximum correlation with the target object.

9. The method according to claim 8, wherein the determining, in the preset entity set, the object entity in maximum correlation with the target object comprises: determining a maximum relation probability value of the target object relative to each object entity, to obtain N−1 pieces of maximum relation probability values, wherein N−1 represents a quantity of object entities, and N represents a quantity of entities comprised in the preset entity set; determining a correlation between each object entity and the target sentence, to obtain N−1 pieces of correlations; for each object entity, calculating a product of the correlation corresponding to the object entity and the maximum relation probability value corresponding to the object entity, to obtain a correlation score corresponding to the object entity to obtain N−1 pieces of correlation scores; and taking an object entity corresponding to a maximum correlation score in the N−1 pieces of correlation scores as the object entity in maximum correlation with the target object.

10. The method according to claim 9, wherein the determining the correlation between each object entity and the target sentence comprises: for each object entity, determining a sum of a correlation degree between each entity in the target sentence and the object entity as the correlation between the object entity and the target sentence.

11. The method according to claim 10, wherein the correlation degree between the entity in the target sentence and any object entity is a maximum relation probability value of the any entity in the target sentence relative to the object entity plus a maximum relation probability value of th
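The tensor construction in claim 4 and the optimization in claim 6 can be sketched roughly as follows. This is a hedged illustration, not the applicant's implementation: the function names, the `(head, tail, relation_vector)` input format, the squared-error loss, the learning rate, the step count, and the initialization scale are all assumptions. The claims only specify an all-zero N×N×M start, a threshold test on the maximum of each M-dimensional relation vector, the factorization $X_i = A O_i A^{T}$ fitted by gradient descent, and an element-wise max between the initial and reconstructed tensors.

```python
import numpy as np

def build_relation_tensor(n_entities, n_relations, observations,
                          second_threshold=0.5):
    """Fill an all-zero N x N x M tensor from relation vectors.

    observations: iterable of (head_idx, tail_idx, relation_vector), where
    relation_vector stands in for the M-dimensional output of a relation
    recognition model (hypothetical input format).
    """
    X = np.zeros((n_entities, n_entities, n_relations))
    for head, tail, rel_vec in observations:
        rel_vec = np.asarray(rel_vec, dtype=float)
        k = int(np.argmax(rel_vec))
        if rel_vec[k] > second_threshold:
            X[head, tail, k] = 1.0  # flip the matching element from 0 to 1
    return X

def optimize_tensor(X, d=2, steps=300, lr=0.01, seed=0):
    """Fit X_i ~ A O_i A^T by plain gradient descent, then merge with max.

    Assumed loss: sum over i of the squared Frobenius norm of
    (A O_i A^T - X_i).
    """
    n, _, m = X.shape
    rng = np.random.default_rng(seed)
    A = rng.normal(scale=0.1, size=(n, d))      # N x d entity factors
    O = rng.normal(scale=0.1, size=(m, d, d))   # M slices of d x d
    for _ in range(steps):
        grad_A = np.zeros_like(A)
        grad_O = np.zeros_like(O)
        for i in range(m):
            R = A @ O[i] @ A.T - X[:, :, i]     # residual for relation i
            grad_A += 2.0 * (R @ A @ O[i].T + R.T @ A @ O[i])
            grad_O[i] = 2.0 * (A.T @ R @ A)
        A -= lr * grad_A
        O -= lr * grad_O
    X_new = np.stack([A @ O[i] @ A.T for i in range(m)], axis=2)
    # Keep the larger value at every location, as in the max-function step.
    return np.maximum(X, X_new)

# Toy input: 3 entities, 2 relation types; the vectors are made-up values.
observations = [
    (0, 1, [0.9, 0.1]),  # kept: maximum 0.9 exceeds the threshold
    (1, 2, [0.2, 0.7]),  # kept: maximum 0.7 exceeds the threshold
    (0, 2, [0.4, 0.3]),  # dropped: maximum 0.4 does not
]
X = build_relation_tensor(3, 2, observations, second_threshold=0.5)
X_opt = optimize_tensor(X, d=2)
```

Because of the final element-wise max, entries observed as 1 are never lowered; the factorization can only raise entries that were left at 0, and those raised values play the role of relation probability values for pairs not directly seen in the sentence library.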

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024338524A1 cover?
A natural language processing method and apparatus, a device, and a readable storage medium, where the method includes: obtaining a target sentence to be processed, and determining each entity in the target sentence (S101); for each entity in the target sentence, in response to the entity being present in a preset entity set, determining extended information for the entity, and adding the deter…
Who is the assignee on this patent?
Suzhou Metabrain Intelligent Technology Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F40/295. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 10, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).