Assignment of semantic labels to a sequence of words using neural network architectures
US-2015066496-A1 · Mar 5, 2015 · US
US9239828B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9239828-B2 |
| Application number | US-201414201670-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 7, 2014 |
| Priority date | Dec 5, 2013 |
| Publication date | Jan 19, 2016 |
| Grant date | Jan 19, 2016 |
Official abstract text for this publication.
Recurrent conditional random field (R-CRF) embodiments are described. In one embodiment, the R-CRF receives feature values corresponding to a sequence of words. Semantic labels for words in the sequence of words are then generated and each label is assigned to the appropriate one of the words in the sequence of words. The R-CRF used to accomplish these tasks includes a recurrent neural network (RNN) portion and a conditional random field (CRF) portion. The RNN portion receives feature values associated with a word in the sequence of words and outputs RNN activation layer activations data that is indicative of a semantic label. The CRF portion inputs the RNN activation layer activations data output from the RNN for one or more words in the sequence of words and outputs label data that is indicative of a separate semantic label that is to be assigned to each of the words.
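The two-stage flow in the abstract can be illustrated with a minimal Python sketch: an RNN portion that emits one unnormalized score vector per word, followed by a CRF portion that picks the best label path. The function names, the weight matrices `W_in`, `W_fb`, `W_out`, the transition table `trans`, the sigmoid hidden units, and the use of Viterbi decoding are all assumptions made for illustration; the publication text shown here does not specify them. The feedback of the previous activation-layer output into the hidden layer follows the wording of the claims.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn_activations(features, W_in, W_fb, W_out):
    """Return one unnormalized score vector per word.

    Hypothetical weights: W_in (hidden x input), W_fb (hidden x labels),
    W_out (labels x hidden). The previous word's activation-layer output
    is fed back into the hidden layer, and no softmax is applied.
    """
    prev = [0.0] * len(W_out)
    outputs = []
    for x in features:
        pre = [a + b for a, b in zip(matvec(W_in, x), matvec(W_fb, prev))]
        hidden = [1.0 / (1.0 + math.exp(-p)) for p in pre]  # assumed sigmoid
        prev = matvec(W_out, hidden)
        outputs.append(prev)
    return outputs

def viterbi(scores, trans):
    """Assumed CRF decoding step: best label path given per-word scores
    and trans[i][j], the score of moving from label i to label j."""
    n_labels = len(scores[0])
    best = list(scores[0])
    back = []
    for s in scores[1:]:
        step, ptrs = [], []
        for j in range(n_labels):
            i_best = max(range(n_labels), key=lambda i: best[i] + trans[i][j])
            ptrs.append(i_best)
            step.append(best[i_best] + trans[i_best][j] + s[j])
        best = step
        back.append(ptrs)
    # Trace back from the best final label to recover the full path.
    path = [max(range(n_labels), key=lambda j: best[j])]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    return path[::-1]

def label_sequence(features, W_in, W_fb, W_out, trans):
    """Run the RNN portion, then the CRF portion, and return label indices."""
    return viterbi(rnn_activations(features, W_in, W_fb, W_out), trans)
```

Calling `label_sequence` with one feature vector per word returns one label index per word. The transition scores let the CRF portion override a locally attractive label when the surrounding labels make it unlikely, which a per-word softmax could not do.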
Opening claim text (preview).
Wherefore, what is claimed is:
1. A language understanding (LU) system, comprising: a computing device; and a computer program having program modules executable by the computing device, the computing device being directed by the program modules of the computer program to, receive feature values corresponding a sequence of words, generate semantic labels for words in the sequence of words, said semantic label generation comprising using a recurrent conditional random field (R-CRF) comprising, a recurrent neural network (RNN) portion which generates RNN activation layer activations data that is indicative of a semantic label for a word, the RNN receiving feature values associated with a word in the sequence of words and outputting RNN activation layer activations data that is indicative of a semantic label, and a conditional random field (CRF) portion which takes as input the RNN activation layer activations data output from the RNN for one or more words in the sequence of words and outputs label data that is indicative of a separate semantic label that is to be assigned to each of the one or more words in the sequence of words associated with the RNN activation layer activations data, and assign each semantic label corresponding to the data output by the CRF portion of the R-CRF to the appropriate one said one or more words in the sequence of words.
2. The system of claim 1, wherein the RNN activation layer activations data comprises data output by the activation layer of the RNN prior to any softmax normalization.
3. The system of claim 1, wherein the RNN and CRF portions of the R-CRF are jointly trained using a set of training data pair sequences and a CRF sequence-level objective function, each of said training data pair sequences comprising a sequence of pairs of feature values corresponding to a word and label data that is indicative of a correct semantic label for that word.
4. The system of claim 1, wherein the RNN portion of the R-CRF comprises: an input layer of nodes wherein each feature value of the feature values associated with a word are input into a different one of the input layer nodes; a hidden layer comprising nodes that are connected to outputs of the input layer, each connection between the input layer and hidden layer being adjustably weighted; and an activation layer comprising nodes that are connected to outputs of the hidden layer, each connection between the hidden layer and activation layer being adjustably weighted, and wherein outputs of the activation layer are connected to inputs of the CRF portion of the R-CRF.
5. The system of claim 4, wherein the feature values associated with a word form a multi-dimensional input vector having a number of elements equal to or larger than a size of a vocabulary of words, and wherein the input layer of nodes comprises a different node for each element of the input vector.
6. The system of claim 4, wherein the label data output from the CRF portion of the R-CRF forms a multi-dimensional output vector having a number of elements equal to a number of possible semantic labels, and wherein the CRF portion of the R-CRF comprises output nodes equaling the number of output vector elements and a different output node of which is dedicated to each different element of the output vector.
7. The system of claim 4, wherein the RNN activation layer activations data output from the RNN portion of the R-CRF in response to the input of feature values associated with a word in the sequence of words is input into the nodes of the hidden layer along with the data output from the input layer upon input of feature values associated with a next word in the sequence of words input into the input layer.
8. The system of claim 7, wherein RNN activation layer activations data input into the nodes of the hidden layer is adjustably weighted prior to input.
9. The system of claim 4 wherein the hidden layer is fully-connected to the input layer and activation layer such that each node of the hidden layer is connected to each node of the input layer and each node of the activation layer.
10. The system of claim 4, wherein the RNN portion of the R-CRF further comprises a feature layer which is used to input ancillary information into the RNN portion, said feature layer being comprised of nodes which input ancillary information values and output representative ancillary data, wherein an output of each of said feature layer nodes is connected to an input of each hidden layer node via a weighted hidden layer connection and to an input of each activation layer node via a weighted activation layer connection.
11. The system of claim 4, wherein the RNN portion of the R-CRF further comprises one or more additional hidden layers, each additional hidden layer being fully connected to the layer preceding the additional hidden layer and the layer subsequent to the additional hidden layer such that each node of the additional hidden layer is connected to each node of the preceding layer and each node of the subsequent layer.
12. A recurrent conditional random field (R-CRF), comprising: a recurrent neural network (RNN) portion which generates RNN activation layer activations data that is indicative of a label for a word, the RNN receiving feature values associated with a word in the sequence of words and outputting RNN activation layer activations data that is indicative of a label, said RNN portion comprising, an input layer of nodes wherein each feature value of the feature values associated with a word are input into a different one of the input layer nodes, a hidden layer comprising nodes that receive outputs from the input layer, said outputs from the input layer being adjustably weighted, and an activation layer comprising nodes that receive outputs from the hidden layer, said outputs from the hidden layer being adjustably weighted; and a conditional random field (CRF) portion which takes as input the RNN activation layer activations data output from the activation layer of the RNN portion for words in the sequence of words and which outputs label data that is indicative of a separate label that is to be assigned to each of the words in the sequence of words associated with the RNN activation layer activations data.
13. The R-CRF of claim 12, wherein the RNN activation layer activations data comprises data output by the activation layer of the RNN prior to any softmax normalization.
14. The R-CRF of claim 12, wherein the RNN and CRF portions of the R-CRF are jointly trained using a set of training data pair sequences and a CRF sequence-level objective function, each of said training data pair sequences comprising a sequence of pairs of feature values corresponding to a word and label data that is indicative of a correct label for that word.
15. The R-CRF of claim 12, wherein the RNN activation layer activations data output from the RNN portion of the R-CRF in response to the input of feature values associated with a word in the sequence of words is input into the nodes of the hidden layer along with the data output from the input layer upon input of feature values associated with a next word in the sequence of words input into the input layer.
16. The R-CRF of claim 15, wherein RNN activation layer activations data input into the nodes of the hidden layer is adjustably weighted prior to input.
17. The R-CRF of claim 12, wherein the RNN portion of the R-CRF further comprises a feature layer which is used to input ancillary information into the RNN portion, said feature layer being comprised of nodes which input ancillary informatio
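Claims 3 and 14 refer to joint training with a CRF sequence-level objective function. As a rough illustration only, the sketch below computes a standard linear-chain CRF log-likelihood: the score of the correct label path minus the log of the summed exponentiated scores of all paths. This is an assumed stand-in, since the exact objective is not reproduced in this excerpt. The names `scores` (per-word activation-layer outputs before any softmax), `trans` (label-to-label transition scores), and all function names are hypothetical. Gradient computation and the backpropagation into the RNN weights that joint training would need are not shown.

```python
import math

def path_score(scores, trans, labels):
    """Unnormalized score of one label path: per-word scores plus transitions."""
    total = scores[0][labels[0]]
    for t in range(1, len(labels)):
        total += trans[labels[t - 1]][labels[t]] + scores[t][labels[t]]
    return total

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_partition(scores, trans):
    """Forward algorithm: log of the summed exp-scores over all label paths."""
    n = len(scores[0])
    alpha = list(scores[0])
    for s in scores[1:]:
        alpha = [s[j] + logsumexp([alpha[i] + trans[i][j] for i in range(n)])
                 for j in range(n)]
    return logsumexp(alpha)

def sequence_log_likelihood(scores, trans, labels):
    """Log-probability of the whole label sequence (assumed objective)."""
    return path_score(scores, trans, labels) - log_partition(scores, trans)
```

Because normalization runs over entire label sequences rather than over each word separately, the quantity being maximized depends on the unnormalized activation-layer scores, which is consistent with claims 2 and 13 describing data taken prior to any softmax normalization.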
Semantic analysis · CPC title
Probabilistic or stochastic networks · CPC title
Neural networks · CPC title
Supervised learning · CPC title
Physics · mapped topic