Multi-feature balancing for natural language processors
US-2024338524-A1 · Oct 10, 2024 · US
US2024338524A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024338524-A1 |
| Application number | US-202218696336-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 30, 2022 |
| Priority date | Jan 5, 2022 |
| Publication date | Oct 10, 2024 |
| Grant date | — |
Official abstract text for this publication.
A natural language processing method and apparatus, a device, and a readable storage medium, where the method includes: obtaining a target sentence to be processed, and determining each entity in the target sentence (S101); for each entity in the target sentence, in response to the entity being present in a preset entity set, determining extended information for the entity, and adding the determined extended information after a location of the entity in the target sentence, to obtain an updated target sentence (S102); and inputting the updated target sentence to a bidirectional encoder representations from transformer (BERT) model, such that the BERT model performs a natural language processing task, where in a process in which the BERT model performs the natural language processing task, an attention score between extended information of any entity in the target sentence and another entity in the target sentence is adjusted to zero (S103).
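The masking step in the abstract (S103) can be illustrated with a minimal sketch. It assumes token-level spans for each entity and for the extended information appended after it, and it sets the affected scores to zero directly, as the abstract words it. The function names, the toy sentence, and the span layout are hypothetical and are not taken from the publication; a real BERT implementation might apply the adjustment differently inside its attention layers.

```python
import numpy as np

def build_blocked_pairs(seq_len, entity_spans, extension_spans):
    """Mark (query, key) token pairs linking one entity's extended
    information with a *different* entity. Spans are (start, end), end exclusive."""
    blocked = np.zeros((seq_len, seq_len), dtype=bool)
    for owner, (xs, xe) in extension_spans.items():
        for other, (es, ee) in entity_spans.items():
            if other == owner:
                continue  # an entity may still attend to its own extension
            blocked[xs:xe, es:ee] = True
            blocked[es:ee, xs:xe] = True
    return blocked

def zero_blocked_scores(scores, blocked):
    """Return a copy of the attention scores with blocked pairs set to zero."""
    return np.where(blocked, 0.0, scores)

# Hypothetical updated sentence: extended information follows each entity.
tokens = ["Apple", "is", "company", "was", "founded", "by", "Jobs", "is", "person"]
entity_spans = {"Apple": (0, 1), "Jobs": (6, 7)}
extension_spans = {"Apple": (1, 3), "Jobs": (7, 9)}

blocked = build_blocked_pairs(len(tokens), entity_spans, extension_spans)
scores = np.ones((len(tokens), len(tokens)))  # placeholder attention scores
masked = zero_blocked_scores(scores, blocked)
```

In this sketch the two entities can still attend to each other and to their own extended information; only the cross pairs (extension of one entity, other entity) are zeroed.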
Opening claim text (preview).
1. A natural language processing method, comprising: obtaining a target sentence to be processed, and determining each entity in the target sentence; for each entity in the target sentence, in response to the entity being present in a preset entity set, determining extended information for the entity, and adding the determined extended information after a location of the entity in the target sentence, to obtain an updated target sentence; and inputting the updated target sentence to a bidirectional encoder representations from transformer (BERT) model, such that the BERT model performs a natural language processing task, wherein in a process in which the BERT model performs the natural language processing task, an attention score between extended information of any entity in the target sentence and another entity in the target sentence is tuned to zero.

2. The method according to claim 1, wherein the determining the extended information for the entity comprises: taking the entity as a target object, and determining, in the preset entity set, an entity group in relation with the target object; and selecting, from the entity group, an entity which has a relation probability value greater than a first threshold, and generating extended information of the target object based on the selected entity.

3. The method according to claim 2, wherein the determining, in the preset entity set, the entity group in relation with the target object comprises: generating an N×N×M-dimensional tensor for representing a relation and a relation probability value between entities in the preset entity set, wherein N represents a quantity of entities comprised in the preset entity set, and M represents a quantity of relations between different entities in the preset entity set; and generating a knowledge graph based on the N×N×M-dimensional tensor, and querying, in the knowledge graph, the entity group in relation with the target object.

4. The method according to claim 3, wherein the generating the N×N×M-dimensional tensor for representing the relation and the relation probability value between entities in the preset entity set comprises: generating an initial tensor that is all-0 in N×N×M dimensions; obtaining a sentence library for generating the preset entity set, traversing each sentence in the sentence library, and taking the traversed sentence as a sentence to be recognized; taking two adjacent entities in the sentence to be recognized as an entity group, to obtain a plurality of entity groups; recognizing the relation between two entities in each entity group by using a relation recognition model, to obtain a plurality of M-dimensional relation vectors; for each of the plurality of M-dimensional relation vectors, in response to a maximum value in any of the plurality of M-dimensional relation vectors being greater than a second threshold, updating an element at a location, in the initial tensor, which corresponds to the maximum value, from 0 to 1, to update the initial tensor; and traversing and updating a next sentence in the sentence library, and after each sentence in the sentence library is traversed, outputting and optimizing a currently obtained tensor to obtain the N×N×M-dimensional tensor.

5. The method according to claim 4, wherein the recognizing the relation between two entities in each entity group by using the relation recognition model, to obtain the plurality of M-dimensional relation vectors comprises: for two entities in any entity group, replacing the two entities in the sentence to be recognized with different identifiers to obtain a replaced sentence, and inputting the replaced sentence to the relation recognition model, such that the relation recognition model outputs an M-dimensional relation vector corresponding to the two entities.

6. The method according to claim 5, wherein the optimizing the currently obtained tensor to obtain the N×N×M-dimensional tensor comprises: forming an initial three-dimensional matrix with the currently obtained tensor, and decomposing the initial three-dimensional matrix into M pieces of N×N-dimensional matrices $X_i$, wherein $i = 1, 2, \ldots, M$; decomposing a d×d×M-dimensional tensor O through initialization into M pieces of d×d-dimensional matrices $O_i$, wherein d represents an adjustable hyper-parameter; obtaining an N×d-dimensional matrix A through initialization, and calculating optimal $A'$ and M pieces of optimal $O_i'$ based on $X_i = A O_i A^{T}$ and a gradient descent method; obtaining a new three-dimensional matrix based on the optimal $A'$ and the M pieces of optimal $O_i'$; and comparing the initial three-dimensional matrix with the new three-dimensional matrix bit by bit based on a max function, and reserving a maximum value at each location, to obtain the N×N×M-dimensional tensor.

7. The method according to claim 5, wherein the relation recognition model comprises a sub-model of a transformer structure and a relation classification neural network; and the inputting the replaced sentence to the relation recognition model, such that the relation recognition model outputs the M-dimensional relation vector corresponding to the two entities comprises: inputting the replaced sentence to the sub-model of the transformer structure, to obtain a feature vector with the identifiers of the two entities; and inputting the feature vector with the identifiers of the two entities to the relation classification neural network, to obtain the M-dimensional relation vector corresponding to the two entities.

8. The method according to claim 1, wherein the determining the extended information for the entity comprises: taking the entity as a target object, and determining, in the preset entity set, an object entity in maximum correlation with the target object, wherein the object entity is another entity in the preset entity set except the target object; and generating extended information of the target object based on the object entity in maximum correlation with the target object.

9. The method according to claim 8, wherein the determining, in the preset entity set, the object entity in maximum correlation with the target object comprises: determining a maximum relation probability value of the target object relative to each object entity, to obtain N−1 pieces of maximum relation probability values, wherein N−1 represents a quantity of object entities, and N represents a quantity of entities comprised in the preset entity set; determining a correlation between each object entity and the target sentence, to obtain N−1 pieces of correlations; for each object entity, calculating a product of the correlation corresponding to the object entity and the maximum relation probability value corresponding to the object entity, to obtain a correlation score corresponding to the object entity to obtain N−1 pieces of correlation scores; and taking an object entity corresponding to a maximum correlation score in the N−1 pieces of correlation scores as the object entity in maximum correlation with the target object.

10. The method according to claim 9, wherein the determining the correlation between each object entity and the target sentence comprises: for each object entity, determining a sum of a correlation degree between each entity in the target sentence and the object entity as the correlation between the object entity and the target sentence.

11. The method according to claim 10, wherein the correlation degree between the entity in the target sentence and any object entity is a maximum relation probability value of the any entity in the target sentence relative to the object entity plus a maximum relation probability value of th
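The tensor optimization recited in claim 6 can be sketched as follows, as one possible reading rather than the applicant's implementation. The sketch fits $X_i \approx A O_i A^{T}$ by plain gradient descent on the squared reconstruction error, rebuilds the tensor from the fitted factors, and keeps the element-wise maximum of the initial and rebuilt tensors. The loss function, learning rate, step count, initialization scale, and the name `complete_relation_tensor` are assumptions not stated in the claim.

```python
import numpy as np

def complete_relation_tensor(X, d=2, steps=2000, lr=0.01, seed=0):
    """X: N x N x M tensor of 0/1 relation entries (layout as in the claims)."""
    N, _, M = X.shape
    slices = np.moveaxis(X, 2, 0).astype(float)   # M slices X_i, each N x N
    rng = np.random.default_rng(seed)
    A = rng.normal(scale=0.3, size=(N, d))        # N x d matrix A (random init, assumed)
    O = rng.normal(scale=0.3, size=(M, d, d))     # M matrices O_i, each d x d

    for _ in range(steps):
        R = A @ O @ A.T - slices                  # residuals A O_i A^T - X_i
        # Gradients of sum_i ||A O_i A^T - X_i||_F^2 (assumed loss)
        grad_O = 2.0 * (A.T @ R @ A)
        grad_A = 2.0 * (R @ A @ O.transpose(0, 2, 1)
                        + R.transpose(0, 2, 1) @ A @ O).sum(axis=0)
        A -= lr * grad_A
        O -= lr * grad_O

    rebuilt = np.moveaxis(A @ O @ A.T, 0, 2)      # back to N x N x M
    # Element-wise max keeps every originally observed 1 and adds inferred values.
    return np.maximum(X, rebuilt)

# Toy tensor: N = 4 entities, M = 2 relations (hypothetical data).
X = np.zeros((4, 4, 2))
X[0, 1, 0] = 1.0
X[1, 2, 0] = 1.0
X[2, 3, 0] = 1.0
X[0, 2, 1] = 1.0

completed = complete_relation_tensor(X)
```

Because of the final max step, entries that were 1 in the initial tensor stay at least 1, while entries that were 0 can take positive values suggested by the low-rank fit. The hyper-parameter `d` controls how much structure the factors can capture.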
Entity relationship models · CPC title
Machine learning · CPC title
Semantic analysis · CPC title
Ontology · CPC title
Named entity recognition · CPC title