What technology area does this patent fall under?

Primary CPC classification G06F40/40. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 7, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

System and method for learning latent representations for natural language tasks

US2016004690A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016004690-A1
Application number	US-201514853053-A
Country	US
Kind code	A1
Filing date	Sep 14, 2015
Priority date	Dec 8, 2010
Publication date	Jan 7, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are systems, methods, and non-transitory computer-readable storage media for learning latent representations for natural language tasks. A system configured to practice the method analyzes, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first corpus. Then the system analyzes, for a second natural language processing task, a second natural language corpus having a target word, and predicts a label for the target word based on the latent representation. In one variation, the target word is one or more words such as a rare word and/or a word not encountered in the first natural language corpus. The system can optionally assign the label to the target word. The system can operate according to a connectionist model that includes a learnable linear mapping that maps each word in the first corpus to a low dimensional latent space.
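The "learnable linear mapping" named in the abstract can be pictured as a matrix with one row per vocabulary word, so that multiplying a one-hot word vector by the matrix selects that word's low-dimensional vector. The sketch below is only an illustration of that idea, not the patent's implementation: the function names (`build_vocab`, `init_mapping`, `one_hot`, `project`), the toy corpus, and the dimension of 3 are all assumptions, and no training step is shown.

```python
import random

def build_vocab(corpus):
    """Assign an integer index to each distinct word in a tokenized corpus."""
    vocab = {}
    for sentence in corpus:
        for word in sentence:
            vocab.setdefault(word, len(vocab))
    return vocab

def init_mapping(vocab_size, dim, seed=0):
    """Create a vocab_size x dim matrix of small random weights.

    In a trained model these weights would be learned; here they are
    only initialized (hypothetical setup for illustration).
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(vocab_size)]

def one_hot(index, size):
    """Return a one-hot vector of the given size with a 1.0 at index."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def project(one_hot_vec, mapping):
    """Multiply a one-hot row vector by the mapping matrix."""
    dim = len(mapping[0])
    return [sum(one_hot_vec[i] * mapping[i][d] for i in range(len(mapping)))
            for d in range(dim)]

# Toy first corpus (made up for this example).
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
vocab = build_vocab(corpus)
mapping = init_mapping(len(vocab), dim=3)

# Projecting the one-hot vector for "cat" selects the matching matrix row.
cat_vec = project(one_hot(vocab["cat"], len(vocab)), mapping)
assert cat_vec == mapping[vocab["cat"]]
```

Because the product with a one-hot vector reduces to a row lookup, implementations of this kind of mapping usually store the matrix and index into it directly rather than performing the multiplication.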

First claim

Opening claim text (preview).

We claim:
1. A method comprising: analyzing a first natural language corpus to generate a latent representation for words in the first natural language corpus; calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation; analyzing a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and predicting, via a processor, a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation.
2. The method of claim 1, wherein the target word is one of a rare word and a word not encountered in the first natural language corpus.
3. The method of claim 1, wherein predicting the label for the target word is further based on a connectionist model.
4. The method of claim 3, wherein the connectionist model comprises a learnable linear mapping which maps each word in the first natural language corpus to a low dimensional latent space.
5. The method of claim 3, wherein the connectionist model comprises a classifier that classifies low dimensional representations of words.
6. The method of claim 1, further comprising assigning the label to the target word.
7. The method of claim 1, wherein the second natural language corpus comprises an input sentence, and wherein the method further comprises performing the predicting of the label for each word in the input sentence in parallel.
8. The method of claim 1, wherein the target word is a collection of target words.
9. A system comprising: a processor; and a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising: analyzing a first natural language corpus to generate a latent representation for words in the first natural language corpus; calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation; analyzing a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and predicting a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation.
10. The system of claim 9, wherein the target word is one of a rare word and a word not encountered in the first natural language corpus.
11. The system of claim 9, wherein predicting the label for the target word is further based on a connectionist model.
12. The system of claim 11, wherein the connectionist model comprises a learnable linear mapping which maps each word in the first natural language corpus to a low dimensional latent space.
13. The system of claim 11, wherein the connectionist model comprises a classifier that classifies low dimensional representations of words.
14. The system of claim 9, the computer-readable storage medium having additional instructions stored which, when executed by the processor, cause the processor to perform operations comprising assigning the label to the target word.
15. The system of claim 9, wherein the second natural language corpus comprises an input sentence, and wherein the method further comprises performing the predicting of the label for each word in the input sentence in parallel.
16. The system of claim 9, wherein the target word is a collection of target words.
17. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising: analyzing a first natural language corpus to generate a latent representation for words in the first natural language corpus; calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation; analyzing a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and predicting a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation.
18. The computer-readable storage device of claim 17, wherein the target word is one of a rare word and a word not encountered in the first natural language corpus.
19. The computer-readable storage device of claim 17, wherein predicting the label for the target word is further based on a connectionist model.
20. The computer-readable storage device of claim 19, wherein the connectionist model comprises a learnable linear mapping which maps each word in the first natural language corpus to a low dimensional latent space.
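As a rough aid to reading the independent claims, the sketch below shows one simplified way a label might be predicted for a word absent from the first corpus: average the latent vectors of its known left and right neighbors into a centroid, then take the label of the known word closest to that centroid by Euclidean distance. This nearest-neighbor scheme is a stand-in chosen for illustration and is not the method as claimed or as described in the specification. The helper names (`centroid`, `euclidean`, `predict_label`), the `window` parameter, the hand-picked two-dimensional vectors, and the labels are all hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict_label(sentence, position, latent, labels, window=1):
    """Guess a label for sentence[position] from its neighbors' latent vectors.

    latent maps known words to vectors; labels maps known words to labels.
    Returns None when no neighbor within the window has a latent vector.
    """
    left = sentence[max(0, position - window):position]
    right = sentence[position + 1:position + 1 + window]
    context = [latent[w] for w in left + right if w in latent]
    if not context:
        return None
    c = centroid(context)
    nearest = min(labels, key=lambda w: euclidean(latent[w], c))
    return labels[nearest]

# Hand-picked toy vectors and labels (assumptions, not learned values).
latent = {"the": [0.0, 0.0], "cat": [1.0, 1.0], "dog": [1.0, 0.9],
          "sat": [2.0, 2.0], "ran": [2.0, 2.1]}
labels = {"the": "DET", "cat": "NOUN", "dog": "NOUN",
          "sat": "VERB", "ran": "VERB"}

# "zebra" has no latent vector, so only its neighbors inform the guess.
guess = predict_label(["the", "zebra", "sat"], 1, latent, labels)
```

With these toy values the centroid of "the" and "sat" is [1.0, 1.0], which coincides with the vector for "cat", so `guess` is "NOUN". Real latent vectors would come from training on the first corpus rather than being set by hand.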

Assignees

AT&T Intellectual Property I, L.P.

Inventors

Classifications

G06F40/40 · Primary
Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title
G06F17/28 · Primary
Physics · mapped topic

Patent family

Related publications grouped by family.

Patent family ID: 46200230

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016004690A1 cover?: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for learning latent representations for natural language tasks. A system configured to practice the method analyzes, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first corpus. Then the system analyzes, for a second natura…
Who is the assignee on this patent?: AT&T Intellectual Property I, L.P.