Method and system for network modeling to enlarge the search space of candidate genes for diseases

US10347359B2 · US · B2

Patent metadata
Publication number: US-10347359-B2
Application number: US-201213526317-A
Country: US
Kind code: B2
Filing date: Jun 18, 2012
Priority date: Jun 16, 2011
Publication date: Jul 9, 2019
Grant date: Jul 9, 2019

Abstract

Official abstract text for this publication.

With the advent of low cost, high-throughput whole genome sequencing (“next generation sequencing”), tools are available to assay human genetic variation contributing to inherited disease syndromes. A method is disclosed for prioritization of genetic variants, and identification of disease genes, using network modeling of gene associations.

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method for generating a gene specific diagnostic platform comprising: obtaining a first set of genetic data using a computing device, wherein the first set of genetic data identifies genes and gene expression associated with the genes; generating a co-expression network of the set of gene expression data using the computing device, wherein the co-expression network identifies a set of modules of genes with similar patterns of gene expression; obtaining a set of known disease associated genes using the computing device; identifying at least one module from the set of modules associated with at least one known disease associated gene from the set of known disease associated genes using the computing device; obtaining a set of genetic variants, wherein the set of genetic variants identifies known genetic variants associated with genes using the computing device; identifying at least one gene from the set of genetic variants associated with the at least one module associated with at least one known disease associated gene using the computing device; and creating a gene specific diagnostic platform based on the at least on gene within the at least one module associated with the at least one known disease associated gene, wherein the gene specific diagnostic platform is used for molecular diagnosis of an individual for inherited disease syndromes.

2. The method of claim 1, wherein the generating a co-expression network step comprises the steps of: querying the first set of genetic data using a set of terms of interest to retrieve a subset of the first set of genetic data associated with the terms of interest using the computing device; performing data combination and normalization on the subset of the first set of genetic data using the computing device to produce a set of normalized genetic data; generating a first gene co-expression network using the set of normalized genetic data using the computing device, wherein: the gene co-expression network comprises a plurality of vertices and a plurality of edges; each of the plurality of edges represents a link between two of the plurality of vertices; the plurality of vertices contain data representing genes; and the plurality of edges comprise a length value representing the degree of connection strength between two vertices that the edge links; identifying a set of gene co-expression modules within the first gene co-expression network based upon agglomerative hierarchical clustering using the computing device, wherein each gene co-expression module of the set of gene co-expression modules comprises a subset of the plurality of vertices containing data predicted to describe a plurality of trait-associated genes.

3. The method of claim 2, wherein identifying the set of gene co-expression modules comprises: identifying at least one hierarchical cluster of edges of the first gene co-expression network graph data structure using the computing device; and identifying at least one modular sub-network from the at least one hierarchical cluster using the computing device, wherein the plurality of trait-associated genes are located within the at least one modular sub-network.

4. The method of claim 2, wherein each gene co-expression module of the set of gene co-expression modules is identified based on path length of the subset of the plurality of vertex data structures to at least one known disease gene.

5. The method of claim 2, wherein the generating a co-expression network step further comprises the steps of: obtaining a second set of genetic data, wherein the second set of genetic data identifies genes and gene expression associated with the genes using the computing device; querying the second set of genetic data using the set of terms of interest to retrieve a subset of the second set of genetic data associated with the terms of interest using the computing device using the computing device; performing data combination and normalization on the second set of genetic data to produce a second set of normalized genetic data using the computing device; generating a second gene co-expression network using the second set of normalized genetic data using the computing device, wherein: the second gene co-expression network graph comprises a plurality of vertices and a plurality of edges; each of the plurality of edges represents a link between two of the plurality of vertices; the plurality of vertices contain data representing genes; and the plurality of edges comprise a length value representing the degree of connection strength between two vertices that the edge links; identifying a second set of gene co-expression modules within the second gene co-expression network based upon agglomerative hierarchical clustering using the computing device, wherein each gene co-expression module of the second set of gene co-expression modules comprises a subset of the plurality of vertices containing data predicted to describe a plurality of trait-associated genes; and determining reproducibility of each gene co-expression module in the set of gene co-expression modules by calculating intramodular connectivity preservation between each gene in each gene co-expression module in the set of gene co-expression modules and the same gene in each gene co-expression module in the second set of gene co-expression modules using the computing device, wherein the reproducibility indicates module preservation.

6. The method of claim 5, wherein the second set of genetic data comprises microarray data from diseased model system tissues.

7. The method of claim 1, wherein the first set of genetic data comprises microarray data from fetal model system tissues.

8. The method of claim 1, wherein the first set of genetic data comprises genome wide genotyping data.

9. The method of claim 1, wherein the gene specific diagnostic platform is selected from the group consisting of a capture-based sequencing and hybridization-based genotyping platform.

10. A system for generating a gene specific diagnostic platform comprising: a processor; and a memory connected to the processor and storing an application, wherein the application directs the processor to: obtain a first set of genetic data, wherein the first set of genetic data identifies genes and gene expression associated with the genes; generate a co-expression network of the set of gene expression data, wherein the co-expression network identifies a set of modules of genes with similar patterns of gene expression; obtain a set of known disease associated genes; identify at least one module from the set of modules associated with at least one known disease associated gene from the set of known disease associated genes; obtain a set of genetic variants, wherein the set of genetic variants identifies known genetic variants associated with genes; identify at least one gene from the set of genetic variants associated with the at least one module associated with at least one known disease associated gene; and create a gene specific diagnostic platform based on the at least one gene within the at least one module associated with the at least one known disease associated gene, wherein the gene specific diagnostic platform is used for molecular diagnosis of an individual for inherited disease syndromes.

11. The system of claim 10, wherein the generation of the first gene co-expression network is accomplished by the application directing the processor to: query the first set of genetic data using a set of terms of interest to retrieve a subset of the first set of genetic data associated with the terms of interest; perform data combination and normalization
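
The core steps recited in claims 1 and 2 (build a co-expression network, find modules by agglomerative hierarchical clustering, keep modules that contain a known disease gene, then select variant-bearing genes from those modules) can be illustrated with a small sketch. This is not the patented implementation. The claims do not name a correlation measure, a linkage rule, or a cut-off, so the use of Pearson correlation, an edge length of 1 - |r|, average linkage, and the max_length threshold are all assumptions made for illustration. The gene names and expression values are hypothetical toy data.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def edge_lengths(expr):
    """Edge length between every gene pair, assumed here to be 1 - |r|.

    Short edges mean strong co-expression. expr maps gene -> expression
    values across samples.
    """
    return {frozenset(pair): 1.0 - abs(pearson(expr[pair[0]], expr[pair[1]]))
            for pair in combinations(sorted(expr), 2)}


def find_modules(expr, max_length=0.3):
    """Average-linkage agglomerative clustering, stopped at max_length.

    max_length is an illustrative cut-off, not a value from the patent.
    """
    dist = edge_lengths(expr)
    clusters = [frozenset([gene]) for gene in sorted(expr)]

    def linkage(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Merge the closest pair of clusters until none is close enough.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        if linkage(c1, c2) > max_length:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


def candidate_genes(modules, known_disease_genes, variant_genes):
    """Variant-bearing genes that share a module with a known disease gene."""
    hits = set()
    for module in modules:
        if module & known_disease_genes:
            hits |= module & variant_genes
    return hits


# Hypothetical toy data: five genes measured across five samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "GENE_C": [1.1, 2.2, 2.9, 4.1, 5.2],
    "GENE_D": [5.0, 1.0, 4.0, 2.0, 3.0],
    "GENE_E": [4.8, 1.3, 4.1, 1.9, 3.2],
}
modules = find_modules(expression)
candidates = candidate_genes(
    modules,
    known_disease_genes={"GENE_A"},
    variant_genes={"GENE_C", "GENE_E"},
)
print(sorted(sorted(m) for m in modules))
print(sorted(candidates))
```

In this toy run GENE_C is selected because it carries a variant and clusters with the known disease gene GENE_A, while GENE_E carries a variant but falls in a module with no known disease gene. A production pipeline would more likely rely on an established clustering library and on the normalization and module preservation steps described in claims 2 and 5, which are omitted here.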

Assignees

Inventors

Classifications

  • G16B5/00 (Primary)

    CPC title: ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10347359B2 cover?
It covers a method for prioritizing genetic variants and identifying disease genes using network modeling of gene associations, as summarized in the abstract above.
Who is the assignee on this patent?
Frederick E. Dewey; Euan A. Ashley; Leland Stanford Junior University
What technology area does this patent fall under?
Primary CPC classification G16B5/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 9, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).