Alternating Positioning of Primary Text
US-2024419887-A1 · Dec 19, 2024 · US
US2016004690A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016004690-A1 |
| Application number | US-201514853053-A |
| Country | US |
| Kind code | A1 |
| Filing date | Sep 14, 2015 |
| Priority date | Dec 8, 2010 |
| Publication date | Jan 7, 2016 |
| Grant date | — |
Official abstract text for this publication.
Disclosed herein are systems, methods, and non-transitory computer-readable storage media for learning latent representations for natural language tasks. A system configured to practice the method analyzes, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first corpus. Then the system analyzes, for a second natural language processing task, a second natural language corpus having a target word, and predicts a label for the target word based on the latent representation. In one variation, the target word is one or more words such as a rare word and/or a word not encountered in the first natural language corpus. The system can optionally assign the label to the target word. The system can operate according to a connectionist model that includes a learnable linear mapping that maps each word in the first corpus to a low-dimensional latent space.
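The abstract's "learnable linear mapping" followed by a classifier can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the toy corpus, the label set, the latent dimension of 4, and the helper names `build_vocab`, `embed`, and `classify`. The weights are random and untrained, so the output probabilities carry no meaning; the point is only the shape of the computation.

```python
import math
import random

def build_vocab(corpus):
    """Assign each distinct word in the corpus an integer index."""
    vocab = {}
    for sentence in corpus:
        for word in sentence:
            vocab.setdefault(word, len(vocab))
    return vocab

def init_matrix(rows, cols, rng):
    """Small random weights; in a real model these would be learned."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def embed(word, vocab, mapping):
    """A linear map applied to a one-hot word vector reduces to a row lookup."""
    return mapping[vocab[word]]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(vector, weights):
    """Linear classifier over a latent vector; one weight row per label."""
    scores = [sum(w * x for w, x in zip(row, vector)) for row in weights]
    return softmax(scores)

# Hypothetical toy data, not taken from the patent.
rng = random.Random(0)
corpus = [["the", "cat", "sat"], ["the", "dog", "ran"]]
labels = ["DET", "NOUN", "VERB"]
latent_dim = 4

vocab = build_vocab(corpus)
mapping = init_matrix(len(vocab), latent_dim, rng)   # words -> latent space
weights = init_matrix(len(labels), latent_dim, rng)  # latent space -> labels
probs = classify(embed("cat", vocab, mapping), weights)
print(dict(zip(labels, probs)))
```

Representing the mapping as a matrix with one row per vocabulary word keeps the lookup cheap and lets both matrices be trained jointly in a fuller implementation.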
Opening claim text (preview).
We claim:

1. A method comprising: analyzing a first natural language corpus to generate a latent representation for words in the first natural language corpus; calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation; analyzing a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and predicting, via a processor, a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation.
2. The method of claim 1, wherein the target word is one of a rare word and a word not encountered in the first natural language corpus.
3. The method of claim 1, wherein predicting the label for the target word is further based on a connectionist model.
4. The method of claim 3, wherein the connectionist model comprises a learnable linear mapping which maps each word in the first natural language corpus to a low dimensional latent space.
5. The method of claim 3, wherein the connectionist model comprises a classifier that classifies low dimensional representations of words.
6. The method of claim 1, further comprising assigning the label to the target word.
7. The method of claim 1, wherein the second natural language corpus comprises an input sentence, and wherein the method further comprises performing the predicting of the label for each word in the input sentence in parallel.
8. The method of claim 1, wherein the target word is a collection of target words.
9. A system comprising: a processor; and a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising: analyzing a first natural language corpus to generate a latent representation for words in the first natural language corpus; calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation; analyzing a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and predicting a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation.
10. The system of claim 9, wherein the target word is one of a rare word and a word not encountered in the first natural language corpus.
11. The system of claim 9, wherein predicting the label for the target word is further based on a connectionist model.
12. The system of claim 11, wherein the connectionist model comprises a learnable linear mapping which maps each word in the first natural language corpus to a low dimensional latent space.
13. The system of claim 11, wherein the connectionist model comprises a classifier that classifies low dimensional representations of words.
14. The system of claim 9, the computer-readable storage medium having additional instructions stored which, when executed by the processor, cause the processor to perform operations comprising assigning the label to the target word.
15. The system of claim 9, wherein the second natural language corpus comprises an input sentence, and wherein the method further comprises performing the predicting of the label for each word in the input sentence in parallel.
16. The system of claim 9, wherein the target word is a collection of target words.
17. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising: analyzing a first natural language corpus to generate a latent representation for words in the first natural language corpus; calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation; analyzing a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and predicting a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation.
18. The computer-readable storage device of claim 17, wherein the target word is one of a rare word and a word not encountered in the first natural language corpus.
19. The computer-readable storage device of claim 17, wherein predicting the label for the target word is further based on a connectionist model.
20. The computer-readable storage device of claim 19, wherein the connectionist model comprises a learnable linear mapping which maps each word in the first natural language corpus to a low dimensional latent space.
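The independent claims combine latent vectors, context words, a Euclidean distance, and a centroid to label a word absent from the first corpus, but the claim language does not spell out how these pieces fit together. The sketch below is one hypothetical reading, not the patented method: it averages the latent vectors of a word's left and right neighbours into a context centroid, then swaps in a plain nearest-neighbour rule (by Euclidean distance between context centroids) to borrow a label from a known word. The toy vectors, labels, window size of 1, and all function names are assumptions.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def context_centroid(sentence, index, latent, window=1):
    """Average the latent vectors of known left and right neighbours."""
    left = sentence[max(0, index - window):index]
    right = sentence[index + 1:index + 1 + window]
    vectors = [latent[w] for w in left + right if w in latent]
    return centroid(vectors) if vectors else None

def context_profiles(corpus, latent):
    """For each word in the first corpus, average its context centroids."""
    collected = {}
    for sentence in corpus:
        for i, word in enumerate(sentence):
            c = context_centroid(sentence, i, latent)
            if c is not None:
                collected.setdefault(word, []).append(c)
    return {word: centroid(cs) for word, cs in collected.items()}

def predict_label(sentence, index, latent, profiles, word_labels):
    """Borrow the label of the known word with the closest context profile."""
    target = context_centroid(sentence, index, latent)
    if target is None:
        return None  # no known neighbours to compare against
    nearest = min(profiles, key=lambda w: math.dist(profiles[w], target))
    return word_labels[nearest]

# Hypothetical 2-D latent vectors and labels for a toy first corpus.
latent = {
    "the": [1.0, 0.0], "cat": [0.0, 1.0], "dog": [0.1, 0.9],
    "sat": [-1.0, 0.0], "ran": [-0.9, 0.1],
}
word_labels = {"the": "DET", "cat": "NOUN", "dog": "NOUN",
               "sat": "VERB", "ran": "VERB"}
first_corpus = [["the", "cat", "sat"], ["the", "dog", "ran"]]
profiles = context_profiles(first_corpus, latent)

# "wug" never occurs in the first corpus, so it has no latent vector.
second_sentence = ["the", "wug", "sat"]
predicted = predict_label(second_sentence, 1, latent, profiles, word_labels)
print(predicted)  # → NOUN
```

Here "wug" sits between "the" and "sat", the same context as "cat" in the first corpus, so its context centroid coincides with that of "cat" and it inherits the label NOUN. Because each position is scored independently of the others, the per-word predictions for a sentence could also be computed in parallel, in the spirit of claims 7 and 15.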