Network-based deep learning technology for target identification and drug repurposing
US-2021142173-A1 · May 13, 2021 · US
US12176074B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12176074-B2 |
| Application number | US-202118278411-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 25, 2021 |
| Priority date | Feb 23, 2021 |
| Publication date | Dec 24, 2024 |
| Grant date | Dec 24, 2024 |
Official abstract text for this publication.
The present disclosure provides a compound function prediction method based on a neural network and a connectivity map (CMAP) algorithm. The compound function prediction method is used to predict an efficacy of a compound, and the compound function prediction method includes the following steps: constructing a compound molecule-encoding vector neural network; constructing and training an encoding vector-marker gene expression variation deep neural network; constructing and training a marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation neural network; constructing upregulated and downregulated gene sets of a disease or a phenotype; and evaluating a correlation between the compound and the disease or the phenotype.
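The abstract describes a chain of models whose outputs feed one another. A minimal Python sketch of that composition is given below, only to make the data flow concrete. Every name in it (`CompoundEvaluator`, `encode`, `to_marker`, `to_genome`, `score`) is hypothetical, and the toy stand-in functions are placeholders rather than trained networks or the claimed scoring formula.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Profile = Dict[str, float]  # gene name -> predicted expression variation


@dataclass
class CompoundEvaluator:
    """Chains the stages named in the abstract (all callables are stand-ins)."""
    encode: Callable[[str], Vector]        # molecule -> encoding vector
    to_marker: Callable[[Vector], Vector]  # encoding vector -> marker gene variation
    to_genome: Callable[[Vector], Profile] # marker genes -> whole-genome variation
    score: Callable[[Profile, Sequence[str], Sequence[str]], float]

    def evaluate(self, molecule: str, up: Sequence[str], down: Sequence[str]) -> float:
        vector = self.encode(molecule)
        marker = self.to_marker(vector)
        genome = self.to_genome(marker)
        return self.score(genome, up, down)


def toy_score(profile: Profile, up: Sequence[str], down: Sequence[str]) -> float:
    """Placeholder: mean change of the up set minus mean change of the down set."""
    mean = lambda genes: sum(profile[g] for g in genes) / len(genes)
    return mean(up) - mean(down)


# Toy stand-ins so the sketch runs; real stages would be trained models.
evaluator = CompoundEvaluator(
    encode=lambda s: [float(len(s)), float(s.count("C"))],
    to_marker=lambda v: [v[0] - v[1], v[1]],
    to_genome=lambda m: {"G1": m[0], "G2": m[1], "G3": -m[0]},
    score=toy_score,
)

result = evaluator.evaluate("CCO", up=["G1"], down=["G3"])
```

Keeping each stage behind a plain callable means that any one stage can be replaced, for example by a trained encoder, without touching the others.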
Opening claim text (preview).
What is claimed is: 1. A compound function prediction method based on a neural network and a connectivity map algorithm, wherein the compound function prediction method is used to predict an efficacy of a compound, and the compound function prediction method comprising the following steps:

(1) acquiring, by a processor, a molecular formula of the compound from a public database; constructing a compound molecule-encoding vector neural network according to a molecular fingerprint generated based on a molecular structure or a molecular fingerprint acquired from another representation of the molecular structure; and training an autoencoder or a variational autoencoder based on the molecular formula of the compound for outputting an encoding vector of the compound molecule or outputting an encoding vector and an encoding radius vector of the compound molecule;

(2) acquiring, by the processor, compound-gene expression variation data from a public database; connecting a multi-layer deep learning network to the compound molecule-encoding vector neural network constructed in the step (1); and constructing and training an encoding vector-marker gene expression variation deep neural network for outputting a marker gene expression variation;

(3) acquiring, by the processor, marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation data from a public database; connecting a multi-layer deep learning network to the encoding vector-marker gene expression variation deep neural network constructed in the step (2); fitting with a linear equation or a nonlinear equation; constructing and training a marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation neural network; and calculating a linear or nonlinear relationship between a marker gene expression level or gene expression variation and a whole genome gene expression level or gene expression variation;

(4) acquiring, by the processor, a variation in a disease expression profile from a public database; defining upregulated and downregulated gene sets of a disease or a phenotype; and constructing, based on the compound molecule-encoding vector neural network, the encoding vector-marker gene expression variation deep neural network, the marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation neural network, and the upregulated and downregulated gene sets of the disease or the phenotype, a compound and disease or phenotype correlation evaluation system, comprising the compound molecule-encoding vector neural network, the encoding vector-marker gene expression variation deep neural network, the marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation neural network, and an evaluation system for a correlation between the whole genome gene expression level or gene expression variation corresponding to the compound and the upregulated and downregulated gene sets of the disease or the phenotype; and

(5) inputting, by the processor, a molecular formula of a virtual molecule to be evaluated into the compound and disease or phenotype correlation evaluation system to evaluate a correlation between the compound and the disease or the phenotype, specifically comprising, inputting the molecular formula of the virtual molecule to be evaluated into the compound molecule-encoding vector neural network, and outputting an encoding vector; inputting the encoding vector into the encoding vector-marker gene expression variation deep neural network, and outputting the marker gene expression variation; inputting the marker gene expression variation into the marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation neural network, and outputting the whole genome gene expression level or gene expression variation; and inputting the whole genome gene expression level or gene expression variation and the upregulated and downregulated gene sets of the disease or the phenotype into the evaluation system for the correlation between the whole genome gene expression level or gene expression variation corresponding to the compound and the upregulated and downregulated gene sets of the disease or the phenotype, and outputting a score based on a scoring formula for a probability of the compound in treating or exacerbating the disease; wherein the scoring formula is as follows:

$$a = \max_{j=1}^{t}\left[\frac{j}{t} - \frac{V(j)}{n}\right]$$

$$b = \max_{j=1}^{t}\left[\frac{j}{t} - \frac{V(j)}{n}\right]$$

$$\text{score} = \begin{cases} a - b & \text{when } b \cdot a < 0 \\ 0 & \text{when } b \cdot a > 0 \end{cases}$$

wherein t denotes a number of upregulated (or downregulated) genes, n denotes a total number of genes in a whole genome expression variation prediction system, and V(j) denotes a sequence number of a j-th upregulated (or downregulated) gene in a descending order of gene expression variations among the n genes; and

(6) screening out compounds
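The scoring in step (5) resembles the Kolmogorov-Smirnov style statistic used by the original Connectivity Map. A minimal Python sketch of that published form follows, as an illustration rather than the patent's own implementation. It assumes the statistic is computed once for the upregulated set and once for the downregulated set, and it uses the Connectivity Map definition b = max[V(j)/n - (j-1)/t], which differs from the expression for b as reproduced in the claim text here. The function names, the dictionary input, and the handling of ties and zero products are hypothetical choices.

```python
from typing import Dict, Iterable


def rank_positions(changes: Dict[str, float]) -> Dict[str, int]:
    """1-based position of each gene when sorted by descending expression variation."""
    ordered = sorted(changes, key=changes.get, reverse=True)
    return {gene: i for i, gene in enumerate(ordered, start=1)}


def ks_statistic(gene_set: Iterable[str], positions: Dict[str, int]) -> float:
    """Signed Kolmogorov-Smirnov style statistic for one gene set (CMap form)."""
    n = len(positions)
    # V(j): position of the j-th set member in the ranked list, ascending.
    v = sorted(positions[g] for g in gene_set if g in positions)
    t = len(v)
    if t == 0:
        raise ValueError("gene set has no genes in the expression profile")
    a = max(j / t - v_j / n for j, v_j in enumerate(v, start=1))
    b = max(v_j / n - (j - 1) / t for j, v_j in enumerate(v, start=1))
    # Assumption: ties (a == b) resolve toward a.
    return a if a >= b else -b


def connectivity_score(changes: Dict[str, float],
                       up: Iterable[str],
                       down: Iterable[str]) -> float:
    """Difference of the two statistics when their signs differ, else 0."""
    positions = rank_positions(changes)
    ks_up = ks_statistic(up, positions)
    ks_down = ks_statistic(down, positions)
    # Assumption: a zero product is treated like the same-sign case.
    return ks_up - ks_down if ks_up * ks_down < 0 else 0.0


# Toy profile: g1 has the largest predicted change, g10 the smallest.
profile = {f"g{i}": float(11 - i) for i in range(1, 11)}
mimic = connectivity_score(profile, up=["g1", "g2"], down=["g9", "g10"])
reverse = connectivity_score(profile, up=["g9", "g10"], down=["g1", "g2"])
```

In the Connectivity Map convention, a positive score suggests that the compound shifts expression in the same direction as the disease signature, and a negative score suggests that it shifts expression in the opposite direction.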
Machine learning, data mining or chemometrics · CPC title
Programming languages; Computing architectures; Database systems; Data warehousing · CPC title
Analysis or design of chemical reactions, syntheses or processes · CPC title
Prediction of properties of chemical compounds, compositions or mixtures · CPC title
Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation · CPC title