Sequence-based structured prediction for semantic parsing
US-9830315-B1 · Nov 28, 2017 · US
US10380236B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10380236-B1 |
| Application number | US-201715712933-A |
| Country | US |
| Kind code | B1 |
| Filing date | Sep 22, 2017 |
| Priority date | Sep 22, 2017 |
| Publication date | Aug 13, 2019 |
| Grant date | Aug 13, 2019 |
Official abstract text for this publication.
Systems and methods are disclosed to implement a machine learning system that is trained to assign annotations to text fragments in an unstructured sequence of text. The system employs a neural model that includes an encoder recurrent neural network (RNN) and a decoder RNN. The input text sequence is encoded by the encoder RNN into successive encoder hidden states. The encoder hidden states are then decoded by the decoder RNN to produce a sequence of annotations for text fragments within the text sequence. In embodiments, the system employs a fixed-attention window during the decoding phase to focus on a subset of encoder hidden states to generate the annotations. In embodiments, the system employs a beam search technique to track a set of candidate annotation sequences before the annotations are outputted. By using a decoder RNN, the neural model is better equipped to capture long-range annotation dependencies in the text sequence.
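The fixed-size attention window described in the abstract can be illustrated with a small sketch. This is not the patent's implementation: the function names, the dot-product scoring, the softmax weighting, and the choice to clip the window at the sequence boundaries are all assumptions made for illustration. It only shows the general idea of restricting attention to the encoder hidden states around the aligned position and combining them into a weighted context vector.

```python
import math


def attention_window(encoder_states, position, half_width):
    """Select the encoder hidden states within half_width of position.

    Assumption: the window is clipped at the sequence boundaries, so it is
    shorter near the first and last words. Padding would be an alternative.
    """
    start = max(0, position - half_width)
    end = min(len(encoder_states), position + half_width + 1)
    return encoder_states[start:end]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(window, decoder_state):
    """Weighted sum of the windowed encoder states.

    Assumption: weights come from a dot product between each encoder state
    and the current decoder state, normalized with softmax.
    """
    scores = [sum(h * d for h, d in zip(state, decoder_state)) for state in window]
    weights = softmax(scores)
    dim = len(window[0])
    return [sum(w * state[i] for w, state in zip(weights, window)) for i in range(dim)]


# Hypothetical 2-dimensional encoder hidden states for a 5-word sequence.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [0.0, 2.0]]
window = attention_window(states, position=2, half_width=1)
context = context_vector(window, decoder_state=[0.0, 0.0])
```

With a zero decoder state every score is equal, so the context vector here is simply the mean of the three windowed states. A trained model would produce non-uniform weights.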
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: receiving an unstructured text sequence comprising a plurality of words; encoding the plurality of words into a plurality of encoder hidden states of an encoder recurrent neural network (RNN), wherein the encoder RNN is configured to generate a given encoder hidden state based at least in part on a word and a previous encoder hidden state; decoding one or more of the plurality of encoder hidden states into a plurality of decoder hidden states of a decoder RNN, wherein the decoder RNN is configured to generate a given decoder hidden state based at least in part on a previous decoder hidden state and an output of the previous decoder hidden state; and generating a sequence of annotations for individual text fragments in the unstructured text from to the decoder hidden states, wherein the annotations are selected from a set of annotations used to train the encoder RNN and decoder RNN, and generating an annotation in the sequence comprises: determining a particular encoder hidden state associated with a particular word in the unstructured text sequence that is in a position aligned with the annotation; determining a fixed-size attention window of a subset of the plurality of encoder hidden states surrounding the position of the particular encoder state; and generating the annotation based at least in part on the subset of encoder hidden states in the attention window.
2. The computer-implemented method of claim 1, generating the annotation comprises: generating a context vector that concatenates the encoder hidden states in the attention window.
3. The computer-implemented method of claim 2, wherein generating the annotation comprises: generating the context vector using a weight vector applied to individual ones of the encoder hidden states in the attention window; and generating the annotation based at least in part on the output of the particular decoder hidden state and the context vector.
4. The computer-implemented method of claim 1, wherein encoding a particular word in an encoder hidden state comprises: generating an encoding of the word that indicates a character type for each character in the word, the character type comprising one or more of: a digit, an uppercase letter, a lowercase letter, or a symbol.
5. The computer-implemented method of claim 1, wherein decoding the encoder hidden states into decoder hidden states comprises: generating a beam search tree for a particular decoder hidden state, wherein each path of the beam search tree represents a potential sequence of subsequent annotations, and an annotation generated from the particular decoder hidden state is generated based at least in part on respective probabilities of the potential sequences.
6. The computer-implemented method of claim 5, further comprising: pruning one or more paths from the beam search tree based at least in part on their respective probabilities.
7. The computer-implemented method of claim 1, wherein: the unstructured text sequence comprising a description of an item; the annotations indicate features of the item reflected by different text fragments in the description; and further comprising determining metadata for the item based at least in part on the annotations.
8. The computer-implemented method of claim 7, wherein determining metadata for the item comprises one or more of: determining a title for the item, determining a category of the item, or verifying the description against the item's other metadata.
9. The computer-implemented method of claim 1, further comprising: performing a model complexity tuning process for a text annotation model including the encoder RNN and decoder RNN, the tuning process comprising, iteratively: training the text annotation model on a set of synthetic training data wherein truth labels for the synthetic training data are assigned in a pseudorandom fashion; determining that a model error of the text annotation model after the training is below an error level of a naïve model that randomly generates annotations based on relative proportions of the annotations; and reducing a complexity parameter of the text annotation model based at least in part on the determination that the model error is below the error level of the naïve model.
10. The computer-implemented method of claim 9, further comprising: performing a second model complexity tuning process for the text annotation model, the second tuning process comprising, iteratively: training the text annotation model on another set of training data; determining that a second model error of the text annotation model after the training on the other set of training data is below a threshold; and reducing the complexity parameter of the text annotation model based at least in part on the determination that the second model error is below the threshold.
11. A system, comprising: one or more hardware processors with associated memory, implementing a machine learning system configured to: receive an unstructured text sequence comprising a plurality of words; encode the plurality of words into a plurality of encoder hidden states of an encoder recurrent neural network (RNN), wherein the encoder RNN is configured to generate a given encoder hidden state based at least in part on a word and a previous encoder hidden state; decode one or more of the plurality of encoder hidden states into a plurality of decoder hidden states of a decoder RNN, wherein to generate a given decoder hidden state based at least in part on a previous decoder hidden state and an output of the previous decoder hidden state; and generate a sequence of annotations for individual text fragments in the unstructured text from to the decoder hidden states, wherein the annotations are selected from a set of annotations used to train the encoder RNN and decoder RNN, and to generate an annotation in the sequence, the decoder RNN is configured to: determine a particular encoder hidden state associated with a particular word in the unstructured text sequence that is in a position aligned with the annotation; determine a fixed-size attention window of a subset of the plurality of encoder hidden states surrounding the position of the particular encoder state; and generate the annotation based at least in part on the subset of encoder hidden states in the attention window.
12. The system of claim 11, wherein to generate the annotation, the system is configured to: generate a context vector based on the encoder hidden states in the attention window, wherein the context vector is generated using a weight vector applied to individual ones of the encoder hidden states.
13. The system of claim 11, wherein to encode a particular word in an encoder hidden state, the encoder RNN is configured to: generate an encoding of the word that indicates a character type for each character in the word, the character type comprising one or more of: a digit, an uppercase letter, a lowercase letter, or a symbol.
14. The system of claim 11, wherein to decode the encoder hidden states into decoder hidden states, the decoder RNN is configured to: generate a beam search tree for a particular decoder hidden state, wherein each path of the beam search tree represents a potential sequence of subsequent annotations, and an annotation generated from the particular decoder hidden state is generated based at least in part on respective probabilities of the potential sequences.
15. The system of claim 11, wherein: the unstructured text sequence comprising a description of an item; the annotat
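The beam search with pruning recited in claims 5, 6, and 14 can be sketched as follows. This is a generic illustration under stated assumptions, not the claimed implementation: `toy_next_log_probs` is a hypothetical stand-in for the decoder RNN's output distribution, the annotation labels `BRAND` and `COLOR` are invented, and pruning is done by keeping the `beam_width` highest-scoring partial sequences at each step.

```python
import math


def beam_search(next_log_probs, length, beam_width):
    """Track the most probable annotation sequences of a given length.

    next_log_probs(prefix) returns {annotation: log probability} for the
    next position given the annotations chosen so far (hypothetical API).
    """
    beams = [((), 0.0)]  # (annotation prefix, cumulative log probability)
    for _ in range(length):
        candidates = []
        for prefix, score in beams:
            for label, log_prob in next_log_probs(prefix).items():
                candidates.append((prefix + (label,), score + log_prob))
        # Prune: keep only the beam_width most probable paths.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


def toy_next_log_probs(prefix):
    """Hypothetical stand-in for the decoder's next-annotation distribution."""
    if not prefix:
        probs = {"BRAND": 0.6, "COLOR": 0.4}
    elif prefix[-1] == "BRAND":
        probs = {"BRAND": 0.5, "COLOR": 0.5}
    else:
        probs = {"COLOR": 0.9, "BRAND": 0.1}
    return {label: math.log(p) for label, p in probs.items()}


best_sequence, best_score = beam_search(toy_next_log_probs, length=2, beam_width=2)[0]
greedy_sequence, greedy_score = beam_search(toy_next_log_probs, length=2, beam_width=1)[0]
```

In this toy distribution, a beam of width 2 recovers the sequence `("COLOR", "COLOR")` with probability 0.36, while a width of 1 (greedy decoding) commits to `BRAND` first and ends with probability 0.30. That gap is the motivation for tracking several candidate sequences before emitting annotations.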
Combinations of networks · CPC title
Recurrent networks, e.g. Hopfield networks · CPC title
Semantic analysis · CPC title
Annotation, e.g. comment data or footnotes · CPC title
Character encoding · CPC title