US11630953B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11630953-B2 |
| Application number | US-201916960014-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 25, 2019 |
| Priority date | Jul 25, 2019 |
| Publication date | Apr 18, 2023 |
| Grant date | Apr 18, 2023 |
Official abstract text for this publication.
Described herein are embodiments for end-to-end reinforcement learning based coreference resolution models to directly optimize coreference evaluation metrics. Embodiments of a reinforced policy gradient model are disclosed to incorporate reward associated with a sequence of coreference linking actions. Furthermore, maximum entropy regularization may be used for adequate exploration to prevent a model embodiment from prematurely converging to a bad local optimum. Experiments on datasets compared with state-of-the-art methods verified the effectiveness of embodiments.
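The entropy-regularized policy gradient mentioned in the abstract can be illustrated for a single linking decision. The sketch below is a toy under stated assumptions, not the patent's formulation: it assumes a softmax policy over action logits, a scalar reward, and a hypothetical weight `beta` on the entropy term. All function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def regularized_gradient(logits, action, reward, beta):
    """Gradient estimate of (reward + beta * entropy) w.r.t. the logits.

    Assumption: one categorical decision with a softmax policy.
    - REINFORCE term: reward * (onehot(action) - probs)
    - Entropy term:   d H / d z_j = -p_j * (log p_j + H)
    """
    probs = softmax(logits)
    h = entropy(probs)
    grad = []
    for j, p in enumerate(probs):
        reinforce = reward * ((1.0 if j == action else 0.0) - p)
        ent = -p * (math.log(p) + h) if p > 0.0 else 0.0
        grad.append(reinforce + beta * ent)
    return grad
```

With `beta = 0` this reduces to the plain REINFORCE estimate. A positive `beta` pulls probability mass away from an over-confident action, which is the exploration effect the abstract attributes to maximum entropy regularization.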
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method for training a coreference resolution model comprising: [a] inputting a document comprising a set of text into a policy network to identify mentions in the document; [b] given a current identified mention in the document, using the policy network to obtain a probability distribution of a set of actions in which the set of actions comprise linking the current identified mention with a prior identified mention or not linking the current identified mention to any prior identified mention; [c] selecting an action from the set of actions using the probability distribution of actions; [d] based upon the selected action, updating a coreference graph for the document, in which the coreference graph comprises mentions as nodes and links representing coreference connections between mentions; [e] responsive to the document having another mention, selecting it as the current identified mention and returning to step [b]; [f] responsive to the document not having another mention, outputting the coreference graph for the document; [g] using the outputted coreference graph and ground truth coreference information for the document, computing a reward based upon one or more metrics; [h] using a trajectory of selected actions and the reward to compute a gradient; and [i] updating the policy network using the gradient.

2. The computer-implemented method of claim 1 wherein the policy network is pre-trained using training steps comprising: inputting a set of documents into the policy network that identifies mentions in the documents and generates a coreference graph for each document; using corresponding ground-truth coreference graphs for the document to compute a loss relative to the generated coreference graphs obtained from the policy network; using the loss to update the policy network; and iterating the above training steps until a stopping condition is reached, the step condition comprises one or more criteria from number of epochs, error level, or number of iterations.

3. The computer-implemented method of claim 1 further comprising: repeating the steps of [a]-[f] for the document to obtain a set of coreference graphs and a corresponding set of trajectories of actions for each document in an iterative operation; obtaining a sample set of coreference graphs from the set of coreference graphs; computing a reward for each coreference graph from the sample set of coreference graphs; and using the rewards and trajectories of actions in the sample set to compute a gradient.

4. The computer-implemented method of claim 1 wherein inputting a document comprising a set of text into a policy network to identify mentions in the document comprises: generating, using a character and word embeddings encoder, a plurality of embeddings with each embedding as a concatenation of fixed pretrained word embeddings and convolutional neural network (CNN) character embeddings; computing and concatenating, using a bidirectional Long short-term memory (LSTM) layer, contextualized representation of each word in the input document from two directions; performing iterative operations comprising: generating, with head-finding attention, span representation from the concatenated contextualized representations of each word; obtaining a mention score using a mention feed-forward neural network with a self-attention mechanism based on the generated span representation; obtaining an antecedent score using an antecedent feed-forward neural network with the self-attention mechanism based on the generated span representation; and obtaining a coreference score based on at least the obtained mention score and the generated antecedent score; and computing, using a masked softmax layer, a probability distribution for each mention based at least on the coreference score.

5. The computer-implemented method of claim 4 wherein the probability distribution is only over candidate antecedents for each mention, with probability distribution for mentions after the current mention in the document masked by the masked softmax layer.

6. The computer-implemented method of claim 4 wherein the self-attention mechanism averages over a previous iteration's representations weighted by a normalized coreference scores.

7. The computer-implemented method of claim 4 wherein the generated span representations with probability scores less than a predetermined threshold are pruned from coreference decisions.

8. A system for training a coreference resolution model, comprising at least one processor, and a memory storing instructions, wherein the instructions when executed by the at least one processor, cause the at least one processor to perform the computer-implemented method of claim 1.

9. A computer-implemented method for coreference resolution using a coreference resolution model comprising: receiving a document comprising a set of words; generating, using a character and word embeddings encoder, a plurality of embeddings with each embedding as a concatenation of fixed word embeddings and convolutional neural network (CNN) character embeddings; computing and concatenating, using a bidirectional Long short-term memory (LSTM) layer, contextualized representation of each word in the document from two directions; performing iterative operations comprising: generating, with head-finding attention, span representation from the concatenated contextualized representations for a current mention; obtaining a mention score using a mention feed-forward neural network with a self-attention mechanism based on the generated span representation; obtaining an antecedent score using an antecedent feed-forward neural network with the self-attention mechanism based on the generated span representation; and obtaining a coreference score for the current mention based on at least the obtained mention score and the obtained antecedent score; computing, using a masked softmax layer, probability distribution over a set of actions in which the set of actions comprise linking the current identified mention with a prior identified mention or not linking the current identified mention to any prior identified mention for the current mention based at least on the coreference score; selecting an action from the set of actions using the probability distribution of actions; and based upon the selected action, updating a coreference graph for the document, in which the coreference graph comprises mentions as nodes and links representing coreference connections between mentions.

10. The computer-implemented method of claim 9 wherein the coreference resolution model is pretrained using steps comprising: inputting a training document into the coreference resolution model to generate a set of coreference graphs and a corresponding set of trajectories of actions for each document in an iterative operation; obtaining a sample set of coreference graphs from the set of coreference graphs; computing a reward for each coreference graph from the sample set of coreference graphs; using the rewards and trajectories of actions in the sample set to compute a gradient; and using the gradient to update parameters of the coreference resolution model.

11. The computer-implemented method of claim 10 wherein the gradient further comprising an entropy regularization parameter to control exploration of the set of trajectories of actions.

12. The computer-implemented method of claim 11 wherein the set of trajectories of actions are sampled based on a current policy when the entropy regularization parameter is set as 0.

13. The computer-implemented method of claim 11 wherein
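
Steps [b] through [f] of claim 1, together with the masking described in claim 5, can be sketched as a sampling loop. This is a minimal illustration under stated assumptions rather than the claimed implementation: `score_fn` is a hypothetical stand-in for the policy network's coreference score, the "no link" action is assumed to have a fixed score of 0, and the graph is stored as a plain dictionary.

```python
import math
import random


def masked_softmax(scores, mask):
    """Softmax over entries whose mask is True; masked entries get probability 0."""
    kept = [s for s, keep in zip(scores, mask) if keep]
    m = max(kept)
    exps = [math.exp(s - m) if keep else 0.0 for s, keep in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def build_coreference_graph(score_fn, num_mentions, rng):
    """Sample one linking action per mention and record the resulting links.

    Returns (links, trajectory):
    - links maps each mention index to its chosen antecedent index,
      or None when the "no link" action was sampled.
    - trajectory lists the sampled action index per mention
      (0 means "no link", j + 1 means "link to mention j").
    """
    links = {}
    trajectory = []
    for i in range(num_mentions):
        # Action 0 is "no link"; only mentions before i are valid antecedents.
        scores = [0.0] + [score_fn(i, j) if j < i else 0.0
                          for j in range(num_mentions)]
        mask = [True] + [j < i for j in range(num_mentions)]
        probs = masked_softmax(scores, mask)
        action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        links[i] = None if action == 0 else action - 1
        trajectory.append(action)
    return links, trajectory
```

For example, `build_coreference_graph(lambda i, j: 1.0, 4, random.Random(0))` returns a link map in which the first mention is never linked and every other mention points only to an earlier one. The returned trajectory is what steps [g] through [i] would pair with a reward to form a gradient.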
Convolutional networks [CNN, ConvNet] · CPC title
Reinforcement learning · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Supervised learning · CPC title
Probabilistic or stochastic networks · CPC title