System and method for modeling and analyzing data via hierarchical random graphs

US9147273B1 · US · B1

Patent metadata
Field: Value
Publication number: US-9147273-B1
Application number: US-201113029073-A
Country: US
Kind code: B1
Filing date: Feb 16, 2011
Priority date: Feb 16, 2011
Publication date: Sep 29, 2015
Grant date: Sep 29, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present invention is directed to a data processing apparatus and a computer implemented method for modeling and analyzing relational data represented in a network that includes a plurality of nodes and a plurality of connections between the nodes. The method includes assigning at least one weight to a connection between two nodes in the network. A set of possible dendrograms is then generated for the network, and a likelihood of each dendrogram in the set is determined. The determination of the likelihood is based on at least the one weight of the connection. One of the dendrograms from the set is selected as an optimal dendrogram based on the determined likelihood. The selected dendrogram is then output via an output device. The dendrogram may be used to predict missing links or identify any possible false-positive (noisy) links within a relational dataset.
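The steps in the abstract (weight the connections, generate candidate dendrograms, score each one, keep the best) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented procedure: the page does not give the likelihood formula, so the sketch borrows the standard hierarchical-random-graph likelihood from the literature and substitutes the summed weight of crossing connections for the edge count. The function names (`all_dendrograms`, `log_likelihood`, `link_probability`), the nested-tuple tree encoding, and the four-node example network are all hypothetical.

```python
import math
from itertools import combinations

def leaves(tree):
    """Return the leaf labels under a dendrogram node (a leaf is any non-tuple)."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

def all_dendrograms(nodes):
    """Yield every rooted binary dendrogram over `nodes` as nested 2-tuples."""
    nodes = list(nodes)
    if len(nodes) == 1:
        yield nodes[0]
        return
    first, rest = nodes[0], nodes[1:]
    # Pin `first` to the left branch so each unordered split appears once.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = [first, *extra]
            right = [n for n in rest if n not in extra]
            for lt in all_dendrograms(left):
                for rt in all_dendrograms(right):
                    yield (lt, rt)

def split_stats(tree, weights):
    """Summed weight crossing the split at `tree`, and the number of crossing pairs."""
    left, right = leaves(tree[0]), leaves(tree[1])
    crossing = sum(weights.get(frozenset((u, v)), 0.0) for u in left for v in right)
    return crossing, len(left) * len(right)

def log_likelihood(tree, weights):
    """Assumed weighted HRG log-likelihood: sum over internal nodes r of
    E_r*log(p_r) + (L_r*R_r - E_r)*log(1 - p_r), with p_r = E_r / (L_r*R_r)."""
    if not isinstance(tree, tuple):
        return 0.0
    e, pairs = split_stats(tree, weights)
    p = e / pairs
    term = 0.0  # convention: 0*log(0) = 0 when p is exactly 0 or 1
    if 0.0 < p < 1.0:
        term = e * math.log(p) + (pairs - e) * math.log(1.0 - p)
    return term + log_likelihood(tree[0], weights) + log_likelihood(tree[1], weights)

def link_probability(tree, u, v, weights):
    """Model probability for the pair (u, v): p_r at the node that separates them."""
    for child in tree:
        members = leaves(child)
        if u in members and v in members:
            return link_probability(child, u, v, weights)
    e, pairs = split_stats(tree, weights)
    return e / pairs

# Hypothetical 4-node network; each weight lies in (0, 1].
weights = {
    frozenset(("a", "b")): 0.9,
    frozenset(("c", "d")): 0.8,
    frozenset(("b", "c")): 0.1,
}
candidates = list(all_dendrograms("abcd"))
best = max(candidates, key=lambda t: log_likelihood(t, weights))
print(best)
print(link_probability(best, "a", "c", weights))
```

Exhaustive enumeration is only workable for tiny networks, since the number of rooted binary dendrograms over n labeled nodes is (2n-3)!! (15 for four nodes); larger networks would need a sampling or search strategy instead. Under this sketch, an unobserved pair with a high `link_probability` would be a candidate missing link, and an observed connection with a low one a candidate noisy link.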

First claim

Opening claim text (preview).

What is claimed is:
1. A computer implemented method for modeling and analyzing relational data represented in a network including a plurality of nodes and a plurality of connections between the nodes, the method comprising: assigning a vector of at least two attributes to a connection between two nodes in the network, each of the at least two attributes having a corresponding weight, each of the corresponding weights being greater than 0 and less than or equal to 1; generating a set of possible dendrograms for the network; determining a first likelihood of each dendrogram in the set, wherein the first likelihood is based on the weight of a first attribute of the vector of at least two attributes of the connection; determining a second likelihood of each dendrogram of the set, wherein the second likelihood is based on the weight of a second attribute of the vector of at least two attributes of the connection, the second attribute being different from the first attribute; selecting one of the dendrograms from the set based on the product of the first likelihood and the second likelihood; and outputting the selected dendrogram via an output device, wherein there is no more than one connection between any two nodes of the nodes, and wherein a first weight of a first connection of the connections has a value different from a second weight of a second connection of the connections.
2. The method of claim 1, wherein the weight is reflective of a strength of the connection between the two nodes in the network.
3. The method of claim 1, wherein the first attribute is a dynamic attribute configured to change over time.
4. The method of claim 3, wherein the weight is a function of time.
5. The method of claim 1 further comprising: extracting the vector of attributes and the weights for the attributes from a dataset; and generating the network based on the plurality of attributes.
6. The method of claim 5 further comprising: identifying a missing or noisy attribute in the connection between the two nodes.
7. The method of claim 5 further comprising: detecting that a connection is missing between two nodes in the generated network, or that the connection between two nodes in the generated network is a noisy connection.
8. The method of claim 5 wherein the selected dendrogram provides a hierarchical community structure denoting connectivity in the generated network.
9. The method of claim 5, wherein the attributes are extracted via tensor decomposition.
10. A data processing apparatus adapted for modeling and analyzing relational data represented in a network including plurality of nodes and a plurality of connections between the nodes, the data processing apparatus comprising: a processor; and a memory operably coupled to the processor and having program instructions stored therein, the processor being operable to execute the program instructions, the program instructions including: assigning a vector of at least two attributes to a connection between two nodes in the network, each of the at least two attributes having a corresponding weight, each of the corresponding weights being greater than 0 and less than or equal to 1; generating a set of possible dendrograms for the network; determining a first likelihood of each dendrogram in the set, wherein the first likelihood is based on the weight of a first attribute of the vector of at least two attributes of the connection; determining a second likelihood of each dendrogram of the set, wherein the second likelihood is based on the weight of a second attribute of the vector of at least two attributes of the connection, the second attribute being different from the first attribute; selecting one of the dendrograms from the set based on the product of the first likelihood and the second likelihood; and outputting the selected dendrogram via an output device, wherein there is no more than one connection between any two nodes of the nodes, and wherein a first weight of a first connection of the connections has a value different from a second weight of a second connection of the connections.
11. The data processing apparatus of claim 10, wherein the weight is reflective of a strength of the connection between the two nodes in the network.
12. The data processing apparatus of claim 10, wherein the first attribute is a dynamic attribute changing over time.
13. The data processing apparatus of claim 12, wherein the weight is a function of time.
14. The data processing apparatus of claim 10, wherein the program instructions further comprise: extracting the vector of attributes and the weights for the attributes from a dataset; and generating the network based on the plurality of attributes.
15. The data processing apparatus of claim 14, wherein the program instructions further comprise: identifying a missing or noisy attribute in the single connection between the two nodes.
16. The data processing apparatus of claim 14, wherein the program instructions further comprise: detecting that a connection is missing between two nodes in the generated network, or that the connection between two nodes in the generated network is a noisy connection.
17. The data processing apparatus of claim 10, wherein the selected dendrogram provides a hierarchical community structure denoting connectivity in the generated network.
18. A non-transitory computer readable medium embodying program instructions for execution by a data processing apparatus, the program instructions adapting a data processing apparatus for modeling and analyzing relational data represented in a network including a plurality of nodes and a plurality of connections between the nodes, the program instructions comprising: assigning a vector of at least two attributes to at least one connection of the plurality of connections, each of the at least two attributes having a corresponding weight, each of the corresponding weights being greater than 0 and less than or equal to 1; generating a set of possible dendrograms for the network; determining a first likelihood of each dendrogram in the set, wherein the first likelihood is based on the weight of a first attribute of the vector of at least two attributes of the connection; determining a second likelihood of each dendrogram of the set, wherein the second likelihood is based on the weight of a second attribute of the vector of at least two attributes of the connection, the second attribute being different from the first attribute; selecting one of the dendrograms from the set based on the product of the first likelihood and the second likelihood; and outputting the selected dendrogram via an output device, wherein there is no more than one connection between any two nodes of the nodes, and wherein a first weight of a first connection of the connections has a value different from a second weight of a second connection of the connections.
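The independent claims share two mechanical pieces that are easy to illustrate: the constraints on the connection data (a vector of at least two attribute weights per connection, each weight greater than 0 and at most 1, and no more than one connection between any two nodes) and the selection of a dendrogram by the product of two per-attribute likelihoods. The following Python sketch shows one possible reading of both. The names `validate_network` and `select_by_product`, the edge-tuple layout, and the sample likelihood values are hypothetical, and the sketch works in log space (adding log-likelihoods), which picks the same dendrogram as maximizing the product.

```python
def validate_network(edges):
    """Check the claimed data constraints on a list of (u, v, weights) tuples.

    `weights` is the attribute vector of one connection. Raises ValueError on
    the first violation; returns the number of connections otherwise.
    """
    seen = set()
    for u, v, attr_weights in edges:
        pair = frozenset((u, v))
        if pair in seen:
            raise ValueError(f"more than one connection between {u!r} and {v!r}")
        seen.add(pair)
        if len(attr_weights) < 2:
            raise ValueError(f"connection {u!r}-{v!r} needs at least two attributes")
        if not all(0.0 < w <= 1.0 for w in attr_weights):
            raise ValueError(f"connection {u!r}-{v!r} has a weight outside (0, 1]")
    return len(seen)

def select_by_product(log_l1, log_l2):
    """Pick the dendrogram maximizing L1 * L2, given per-attribute log-likelihoods.

    `log_l1` and `log_l2` map a dendrogram identifier to its log-likelihood under
    the first and second attribute. Summing logs is equivalent to multiplying
    likelihoods and avoids floating-point underflow.
    """
    return max(log_l1, key=lambda d: log_l1[d] + log_l2[d])

# Hypothetical data: two attributes per connection, e.g. (strength, recency).
edges = [
    ("a", "b", (0.9, 0.4)),
    ("c", "d", (0.8, 1.0)),
    ("b", "c", (0.1, 0.3)),
]
n_connections = validate_network(edges)

# Hypothetical precomputed log-likelihoods for two candidate dendrograms.
log_l1 = {"D1": -1.0, "D2": -2.0}
log_l2 = {"D1": -3.0, "D2": -1.5}
chosen = select_by_product(log_l1, log_l2)
print(n_connections, chosen)
```

The sketch does not check the final claim limitation that two connections carry different weights, because the page does not say which attribute that comparison applies to.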

Assignees

Inventors

Classifications

  • G06T11/26 · Primary

    Drawing of charts or graphs · CPC title

  • G06T11/206 · Primary

    Physics · mapped topic

  • Physics · mapped topic

  • Probabilistic models · CPC title

  • ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9147273B1 cover?
The present invention is directed to a data processing apparatus and a computer implemented method for modeling and analyzing relational data represented in a network that includes a plurality of nodes and a plurality of connections between the nodes. The method includes assigning at least one weight to a connection between two nodes in the network. A set of possible dendrograms is then generat…
Who is the assignee on this patent?
Allen David L, Lu Tsai-Ching, Huber David J, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06T11/26. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 29, 2015 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).