Method and System for Joint Representations of Related Concepts

US2016170982A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2016170982-A1
Application number: US-201414572579-A
Country: US
Kind code: A1
Filing date: Dec 16, 2014
Priority date: Dec 16, 2014
Publication date: Jun 16, 2016
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present teaching relates to joint representation of information. In one example, first and second pieces of information are received. Each of the first and second pieces of information relates to one word in a plurality of documents, one of the documents, or one of user to which the documents are given. A model for estimating feature vectors is obtained. The model includes a first neural network model based on a first order of words within one of the documents and a second neural network model based on a second order in which at least some of the documents are given. Based on the model, a first feature vector of the first piece of information and a second feature vector of the second piece of information are estimated. A similarity between the first and second pieces of information is determined based on a distance between the first and second feature vectors.
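The final step of the abstract, scoring similarity from the distance between two feature vectors, can be sketched as follows. This is a minimal illustration, not the patent's implementation: the abstract only says "a distance", so the choice of cosine distance is an assumption, and the three example vectors are made up. It relies on the word, document, and user vectors sharing one space with the same dimensionality.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Distance derived from cosine similarity (assumed metric, see lead-in)."""
    return 1.0 - cosine_similarity(u, v)


# Hypothetical feature vectors for a word, a document, and a user,
# all embedded in the same 3-dimensional space.
word_vec = [0.9, 0.1, 0.3]
doc_vec = [0.8, 0.2, 0.4]
user_vec = [-0.5, 0.7, 0.1]

# A smaller distance means the two pieces of information are more similar.
print(cosine_distance(word_vec, doc_vec))
print(cosine_distance(word_vec, user_vec))
```

Because all three kinds of entity live in one vector space, the same function can compare a word with a document, a document with a user, or any other pairing.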

First claim

Opening claim text (preview).

We claim:

1. A method implemented on at least one computing device each of which has at least one processor, storage, and a communication platform connected to a network for determining similarity between information, the method comprising: receiving a first piece of information and a second piece of information, wherein each of the first and second pieces of information relates to one word in a plurality of documents, one of the plurality of documents, or one of user to which the plurality of documents are given; obtaining a model for estimating feature vectors of the first and second pieces of information, wherein the model comprises a first neural network model based, at least in part, on a first order of words within one of the plurality of documents and a second neural network model based, at least in part, on a second order in which at least some of the plurality of documents are given; estimating, based on the model, a first feature vector of the first piece of information and a second feature vector of the second piece of information; and determining a similarity between the first and second pieces of information based on a distance between the first and second feature vectors.

2. The method of claim 1, further comprising: receiving a query that relates to the first piece of information; and providing the second piece of information as a result of the received query if the determined similarity between the first and second pieces of information is above a threshold.

3. The method of claim 1, further comprising: classifying the first and second pieces of information based on the determined similarity between the first and second pieces of information.

4. The method of claim 1, wherein the first neural network model is based, at least in part, on the document that contains the words in the first order; and the at least some of the plurality of documents given in the second order include the document that contains the words in the first order.

5. The method of claim 4, wherein the second neural network model is based, at least in part, on a user to which the at least some of the plurality of documents are given in the second order.

6. The method of claim 1, wherein the model further comprises a third neural network model based, at least in part, on relationship between at least some of the users to which the plurality of documents are given.

7. The method of claim 1, wherein the first and second feature vectors are estimated by automatically optimizing the model using a hierarchical softmax approach.

8. The method of claim 7, wherein the model is optimized by maximizing log-likelihood of the first order and/or the second order.

9. The method of claim 1, wherein dimensionalities of the first and second feature vectors are the same.

10. A system having at least one processor storage, and a communication platform for determining similarity between information, the system comprising: a data receiving module configured to receive a first piece of information and a second piece of information, wherein each of the first and second pieces of information relates to one word in a plurality of documents, one of the plurality of documents, or one of user to which the plurality of documents are given; a modeling module configured to obtain a model for estimating feature vectors of the first and second pieces of information, wherein the model comprises a first neural network model based, at least in part, on a first order of words within one of the plurality of documents and a second neural network model based, at least in part, on a second order in which at least some of the plurality of documents are given; an optimization module configured to estimate, based on the model, a first feature vector of the first piece of information and a second feature vector of the second piece of information; and a similarity measurement module configured to determine a similarity between the first and second pieces of information based on a distance between the first and second feature vectors.

11. The system of claim 10, further comprising: a hybrid query engine configured to receive a query that relates to the first piece of information, and provide the second piece of information as a result of the received query if the determined similarity between the first and second pieces of information is above a threshold.

12. The system of claim 10, further comprising: a classification engine configured to classify the first and second pieces of information based on the determined similarity between the first and second pieces of information.

13. The system of claim 10, wherein the first neural network model is based, at least in part, on the document that contains the words in the first order; and the at least some of the plurality of documents given in the second order include the document that contains the words in the first order.

14. The system of claim 13, wherein the second neural network model is based, at least in part, on a user to which the at least some of the plurality of documents are given in the second order.

15. The system of claim 10, wherein the model further comprises a third neural network model based, at least in part, on relationship between at least some of the users to which the plurality of documents are given.

16. The system of claim 10, wherein the first and second feature vectors are estimated by automatically optimizing the model using a hierarchical softmax approach.

17. A non-transitory computer-readable medium having data recorded thereon for determining similarity between information, wherein the data, when read by the machine, causes the machine to perform the following: receiving a first piece of information and a second piece of information, wherein each of the first and second pieces of information relates to one word in a plurality of documents, one of the plurality of documents, or one of user to which the plurality of documents are given; obtaining a model for estimating feature vectors of the first and second pieces of information, wherein the model comprises a first neural network model based, at least in part, on a first order of words within one of the plurality of documents and a second neural network model based, at least in part, on a second order in which at least some of the plurality of documents are given; estimating, based on the model, a first feature vector of the first piece of information and a second feature vector of the second piece of information; and determining a similarity between the first and second pieces of information based on a distance between the first and second feature vectors.

18. The medium of claim 17, wherein the first neural network model is based, at least in part, on the document that contains the words in the first order; and the at least some of the plurality of documents given in the second order include the document that contains the words in the first order.

19. The medium of claim 18, wherein the second neural network model is based, at least in part, on a user to which the at least some of the plurality of documents are given in the second order.

20. The medium of claim 17, wherein the model further comprises a third neural network model based, at least in part, on relationship between at least some of the users to which the plurality of documents are given.
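Claims 7, 8 and 16 refer to optimizing the model with a hierarchical softmax approach by maximizing log-likelihood. The claims shown here do not spell out the tree or the update rule, so the sketch below uses the commonly described formulation as an assumption: each output item is a leaf of a binary tree, and its probability is a product of sigmoid decisions along the path from the root. The vocabulary, tree paths, and all vector values are hypothetical, and no training step is shown.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical 4-item vocabulary placed at the leaves of a complete binary tree.
# Each path lists (inner_node_id, direction); +1 means "go left", -1 "go right".
PATHS = {
    "news":   [(0, +1), (1, +1)],
    "sports": [(0, +1), (1, -1)],
    "music":  [(0, -1), (2, +1)],
    "film":   [(0, -1), (2, -1)],
}

# Hypothetical parameter vectors attached to the three inner nodes.
NODE_VECTORS = {
    0: [0.2, -0.4],
    1: [0.5, 0.1],
    2: [-0.3, 0.6],
}


def hs_probability(item, context_vec):
    """P(item | context): product of binary decisions along the tree path."""
    p = 1.0
    for node_id, direction in PATHS[item]:
        p *= sigmoid(direction * dot(NODE_VECTORS[node_id], context_vec))
    return p


def log_likelihood(items, context_vec):
    """Sum of log-probabilities; training would adjust vectors to raise it."""
    return sum(math.log(hs_probability(i, context_vec)) for i in items)


context = [1.0, 0.5]  # hypothetical context feature vector
probs = {item: hs_probability(item, context) for item in PATHS}
print(probs)
print(log_likelihood(["news", "film"], context))
```

Since sigmoid(x) + sigmoid(-x) = 1 at every inner node, the leaf probabilities sum to one without computing a normalizer over the whole vocabulary. Each probability costs one sigmoid per tree level, which is the usual motivation for this approach.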

Assignees

Inventors

Classifications

  • Combinations of networks · CPC title

  • Feedforward networks · CPC title

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016170982A1 cover?
The present teaching relates to joint representation of information. In one example, first and second pieces of information are received. Each of the first and second pieces of information relates to one word in a plurality of documents, one of the documents, or one of user to which the documents are given. A model for estimating feature vectors is obtained. The model includes a first neural ne…
Who is the assignee on this patent?
Yahoo Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30011. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 16, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).