Compound function prediction method based on neural network and connectivity map algorithm

US12176074B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12176074-B2
Application number  US-202118278411-A
Country             US
Kind code           B2
Filing date         Apr 25, 2021
Priority date       Feb 23, 2021
Publication date    Dec 24, 2024
Grant date          Dec 24, 2024

Abstract

Official abstract text for this publication.

The present disclosure provides a compound function prediction method based on a neural network and a connectivity map (CMAP) algorithm. The compound function prediction method is used to predict an efficacy of a compound, and the compound function prediction method includes the following steps: constructing a compound molecule-encoding vector neural network; constructing and training an encoding vector-marker gene expression variation deep neural network; constructing and training a marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation neural network; constructing upregulated and downregulated gene sets of a disease or a phenotype; and evaluating a correlation between the compound and the disease or the phenotype.
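The data flow described in the abstract can be sketched as a chain of three models followed by a ranking step. This is only a shape-level illustration under stated assumptions: the layer sizes, the `dense` helper, and the function names are hypothetical, and each stage is an untrained single linear layer standing in for the trained autoencoder and deep networks that the disclosure describes.

```python
import random

def dense(n_in, n_out, rng):
    """Return an untrained linear layer (random weights) as a function.
    Hypothetical stand-in for a trained network stage."""
    w = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return lambda x: [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

rng = random.Random(0)
# Toy sizes chosen for illustration only; they are not taken from the patent.
FP_BITS, CODE_DIM, N_MARKER, N_GENOME = 16, 4, 6, 20

encoder = dense(FP_BITS, CODE_DIM, rng)       # fingerprint -> encoding vector
marker_net = dense(CODE_DIM, N_MARKER, rng)   # encoding vector -> marker gene variation
genome_net = dense(N_MARKER, N_GENOME, rng)   # marker genes -> whole-genome variation

def predict_genome_profile(fingerprint):
    """Chain the three stages in the order given in the abstract."""
    return genome_net(marker_net(encoder(fingerprint)))

def rank_genes(profile):
    """Gene indices in descending order of predicted expression variation."""
    return sorted(range(len(profile)), key=lambda g: profile[g], reverse=True)

fingerprint = [rng.randint(0, 1) for _ in range(FP_BITS)]  # fake molecular fingerprint
profile = predict_genome_profile(fingerprint)
ranking = rank_genes(profile)
```

The ranked gene list is what a connectivity-map style comparison against the upregulated and downregulated gene sets of a disease would consume.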

First claim

Opening claim text (preview).

What is claimed is:

1. A compound function prediction method based on a neural network and a connectivity map algorithm, wherein the compound function prediction method is used to predict an efficacy of a compound, and the compound function prediction method comprising the following steps:

(1) acquiring, by a processor, a molecular formula of the compound from a public database; constructing a compound molecule-encoding vector neural network according to a molecular fingerprint generated based on a molecular structure or a molecular fingerprint acquired from another representation of the molecular structure; and training an autoencoder or a variational autoencoder based on the molecular formula of the compound for outputting an encoding vector of the compound molecule or outputting an encoding vector and an encoding radius vector of the compound molecule;

(2) acquiring, by the processor, compound-gene expression variation data from a public database; connecting a multi-layer deep learning network to the compound molecule-encoding vector neural network constructed in the step (1); and constructing and training an encoding vector-marker gene expression variation deep neural network for outputting a marker gene expression variation;

(3) acquiring, by the processor, marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation data from a public database; connecting a multi-layer deep learning network to the encoding vector-marker gene expression variation deep neural network constructed in the step (2); fitting with a linear equation or a nonlinear equation; constructing and training a marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation neural network; and calculating a linear or nonlinear relationship between a marker gene expression level or gene expression variation and a whole genome gene expression level or gene expression variation;

(4) acquiring, by the processor, a variation in a disease expression profile from a public database; defining upregulated and downregulated gene sets of a disease or a phenotype; and constructing, based on the compound molecule-encoding vector neural network, the encoding vector-marker gene expression variation deep neural network, the marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation neural network, and the upregulated and downregulated gene sets of the disease or the phenotype, a compound and disease or phenotype correlation evaluation system, comprising the compound molecule-encoding vector neural network, the encoding vector-marker gene expression variation deep neural network, the marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation neural network, and an evaluation system for a correlation between the whole genome gene expression level or gene expression variation corresponding to the compound and the upregulated and downregulated gene sets of the disease or the phenotype; and

(5) inputting, by the processor, a molecular formula of a virtual molecule to be evaluated into the compound and disease or phenotype correlation evaluation system to evaluate a correlation between the compound and the disease or the phenotype, specifically comprising, inputting the molecular formula of the virtual molecule to be evaluated into the compound molecule-encoding vector neural network, and outputting an encoding vector; inputting the encoding vector into the encoding vector-marker gene expression variation deep neural network, and outputting the marker gene expression variation; inputting the marker gene expression variation into the marker gene expression level or gene expression variation-whole genome gene expression level or gene expression variation neural network, and outputting the whole genome gene expression level or gene expression variation; and inputting the whole genome gene expression level or gene expression variation and the upregulated and downregulated gene sets of the disease or the phenotype into the evaluation system for the correlation between the whole genome gene expression level or gene expression variation corresponding to the compound and the upregulated and downregulated gene sets of the disease or the phenotype, and outputting a score based on a scoring formula for a probability of the compound in treating or exacerbating the disease; wherein the scoring formula is as follows:

$$ a = \max_{j=1}^{t} \left[ \frac{j}{t} - \frac{V(j)}{n} \right] $$

$$ b = \max_{j=1}^{t} \left[ \frac{j}{t} - \frac{V(j)}{n} \right] $$

$$ \text{score} = \begin{cases} a - b & \text{when } b \cdot a < 0 \\ 0 & \text{when } b \cdot a > 0 \end{cases} $$

wherein t denotes a number of upregulated (or downregulated) genes, n denotes a total number of genes in a whole genome expression variation prediction system, and V(j) denotes a sequence number of a j-th upregulated (or downregulated) gene in a descending order of gene expression variations among the n genes; and

(6) screening out compounds
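The scoring rule in step (5) can be illustrated with a short sketch. The claim preview prints the same expression for a and b, presumably evaluated once on the upregulated set and once on the downregulated set; that expression can never be negative (its j = t term is 1 - V(t)/n, which is at least 0), so the b * a < 0 branch could not trigger as printed. The sketch below therefore does not follow the printed formula literally. It assumes the signed Kolmogorov-Smirnov-style enrichment statistic used in the original Connectivity Map approach, in which a second term, V(j)/n - (j-1)/t, supplies the negative direction. The function names and the toy gene identifiers are hypothetical.

```python
def ks_enrichment(positions, n):
    """Signed KS-like enrichment of a gene set in a ranked list of n genes.

    positions: 1-based ranks V(j) of the set's genes, where rank 1 is the
    gene with the largest predicted expression variation.
    Assumption: Connectivity Map form, not the formula as printed in the claim.
    """
    v = sorted(positions)
    t = len(v)
    if t == 0:
        return 0.0
    a = max((j + 1) / t - v[j] / n for j in range(t))  # j/t - V(j)/n
    b = max(v[j] / n - j / t for j in range(t))        # V(j)/n - (j-1)/t
    return a if a > b else -b

def connectivity_score(ranked_genes, up_set, down_set):
    """Return ks_up - ks_down when the two enrichments have opposite signs, else 0."""
    n = len(ranked_genes)
    rank = {g: i + 1 for i, g in enumerate(ranked_genes)}
    ks_up = ks_enrichment([rank[g] for g in up_set if g in rank], n)
    ks_down = ks_enrichment([rank[g] for g in down_set if g in rank], n)
    if ks_up * ks_down >= 0:
        return 0.0
    return ks_up - ks_down

# Toy example: ten genes ranked by predicted expression variation, descending.
genes = [f"g{i}" for i in range(1, 11)]
mimic = connectivity_score(genes, {"g1", "g2"}, {"g9", "g10"})          # approximately 1.7
reverse = connectivity_score(genes[::-1], {"g1", "g2"}, {"g9", "g10"})  # approximately -1.7
```

Under the Connectivity Map convention, a positive score means the predicted profile resembles the disease signature and a negative score means it opposes it. How the patent maps the sign onto "treating or exacerbating" is not stated in this preview.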

Assignees

Beijing Gigaceuticals Tech Co Ltd, Univ Beijing

Inventors

Classifications

  • Machine learning, data mining or chemometrics · CPC title

  • G16C20/90 · Primary

    Programming languages; Computing architectures; Database systems; Data warehousing · CPC title

  • Analysis or design of chemical reactions, syntheses or processes · CPC title

  • G16C20/30 · Primary

    Prediction of properties of chemical compounds, compositions or mixtures · CPC title

  • Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Beijing Gigaceuticals Tech Co Ltd, Univ Beijing
What technology area does this patent fall under?
Primary CPC classification G16C20/90. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 24, 2024 (B2). Legal status and post-grant events are not shown on this page.