Machine Learning Platform for Polygenic Models
US-2025266129-A1 · Aug 21, 2025 · US
US2022383982A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022383982-A1 |
| Application number | US-202217804416-A |
| Country | US |
| Kind code | A1 |
| Filing date | May 27, 2022 |
| Priority date | May 28, 2021 |
| Publication date | Dec 1, 2022 |
| Grant date | — |
Official abstract text for this publication.
Various embodiments of the present invention describe techniques for generating a polygenic risk score generation machine learning framework that integrates an optimal genetic variant refinement model without requiring brute-force traversal of potential parameter spaces defined by various distinct genetic variant sets. In response, various embodiments of the present invention use holistic Bayesian sampling routines to efficiently generate Bayesian evidence numerical estimates for various genetic variant refinement models and select an optimal genetic variant refinement model accordingly. This enables enhancing the accuracy of polygenic risk score generation machine learning frameworks without resorting to computationally resource-intensive traversals of potential parameter spaces defined by various distinct genetic variant sets. In doing so, various embodiments of the present invention enhance the computational efficiency of generating a polygenic risk score generation machine learning framework that integrates an optimal genetic variant refinement model in contrast to computationally-inefficient techniques that require brute-force traversal of potential parameter spaces.
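The selection step summarized in the abstract can be illustrated with a toy sketch: estimate a Bayesian evidence value for each candidate variant set, then keep the candidate with the highest value. Everything concrete below is an assumption made for illustration and is not taken from the publication: the synthetic individual-level genotypes and phenotypes (the publication refers to genome-wide association data), the Gaussian likelihood with known noise, the uniform prior on effect weights, the random-walk replacement step, and the helper names `log_add`, `nested_sampling`, and `make_log_like`. It uses basic nested sampling only and does not implement dynamic nested sampling.

```python
import math
import random


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflow."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def nested_sampling(log_like, dim, bounds=(-2.0, 2.0), n_live=60,
                    n_iter=900, n_steps=15, rng=None):
    """Basic nested sampling under a uniform prior on a box.

    Returns (log_evidence, highest-likelihood live point).
    """
    rng = rng or random.Random(0)
    lo, hi = bounds
    live = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_live)]
    live_ll = [log_like(p) for p in live]
    log_z = -math.inf
    # Prior mass shrinks as X_i = exp(-i / n_live); first shell is X_0 - X_1.
    log_width = math.log(1.0 - math.exp(-1.0 / n_live))
    for _ in range(n_iter):
        worst = min(range(n_live), key=live_ll.__getitem__)
        l_min = live_ll[worst]
        log_z = log_add(log_z, l_min + log_width)
        log_width -= 1.0 / n_live
        # Proposal scale per dimension: spread of the current live points.
        scales = []
        for d in range(dim):
            vals = [p[d] for p in live]
            mean = sum(vals) / n_live
            var = sum((v - mean) ** 2 for v in vals) / n_live
            scales.append(math.sqrt(var) + 1e-12)
        # Replace the worst point: random walk from another live point,
        # accepting only moves inside the prior box with likelihood > l_min.
        start = rng.choice([i for i in range(n_live) if i != worst])
        point, ll = list(live[start]), live_ll[start]
        for _ in range(n_steps):
            cand = [x + rng.gauss(0.0, s) for x, s in zip(point, scales)]
            if all(lo <= c <= hi for c in cand):
                cand_ll = log_like(cand)
                if cand_ll > l_min:
                    point, ll = cand, cand_ll
        live[worst], live_ll[worst] = point, ll
    # Remaining live points share the leftover prior mass equally.
    log_rest = -n_iter / n_live - math.log(n_live)
    for ll in live_ll:
        log_z = log_add(log_z, ll + log_rest)
    best = max(range(n_live), key=live_ll.__getitem__)
    return log_z, live[best]


def make_log_like(genotypes, phenotypes, variant_idx, sigma=1.0):
    """Gaussian log-likelihood of a linear score over one variant subset."""
    def log_like(weights):
        total = 0.0
        for g, y in zip(genotypes, phenotypes):
            score = sum(w * g[j] for w, j in zip(weights, variant_idx))
            total += -0.5 * ((y - score) / sigma) ** 2
        return total - len(phenotypes) * math.log(sigma * math.sqrt(2 * math.pi))
    return log_like


# Synthetic data (assumption): 30 individuals, 3 variants coded 0/1/2.
data_rng = random.Random(7)
genotypes = [[data_rng.choice((0, 1, 2)) for _ in range(3)] for _ in range(30)]
phenotypes = [0.8 * g[0] - 0.5 * g[1] + data_rng.gauss(0.0, 1.0) for g in genotypes]

# Hypothetical candidate models, each with its own distinct variant set.
candidate_sets = {"A": (0,), "B": (0, 1), "C": (0, 1, 2)}
results = {}
for name, idx in candidate_sets.items():
    log_like = make_log_like(genotypes, phenotypes, idx)
    results[name] = nested_sampling(log_like, len(idx), rng=random.Random(1))

# Keep the candidate with the highest estimated log-evidence.
best_model = max(results, key=lambda k: results[k][0])
print(best_model, {k: round(v[0], 2) for k, v in results.items()})
```

Each candidate is sampled once, so the cost grows with the number of candidate variant sets, not with a grid over their parameter spaces. The highest-likelihood live point returned here is only a crude stand-in for a parameter estimate; a posterior-weighted mean over the discarded points would be the more usual choice.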
Opening claim text (preview).
1. A computer-implemented method for generating a polygenic risk score for a target phenotype using a comparatively-refined polygenic risk score generation model, the computer-implemented method comprising: identifying, using one or more processors, the comparatively-refined polygenic risk score generation machine learning framework, wherein: the comparatively-refined polygenic risk score generation machine learning framework comprises an optimal genetic variant refinement model that is selected from a plurality of defined genetic variant refinement models, each defined genetic variant refinement model: (i) is associated with: (a) a distinct per-model genetic variant set of a group of genetic variants, and (b) a per-model parameter set comprising a per-model effect weight parameter set for the distinct per-model genetic variant set that is associated with the defined genetic variant refinement model, and (ii) is configured to generate a per-model polygenic risk score based at least in part on a per-model input feature vector corresponding to the distinct per-model genetic variant set for the defined genetic variant refinement model and the per-model parameter set for the defined genetic variant refinement model, and generating the optimal genetic variant refinement model comprises: (i) for each defined genetic variant refinement model, sampling from a per-model posterior probability distribution for the defined genetic variant refinement model given target genome-wide association data for the target phenotype and by using a holistic Bayesian sampling routine that is configured to generate: (a) a per-model parameter numerical estimate set for the per-model parameter set that is associated with the defined genetic variant refinement model, and (b) a Bayesian evidence numerical estimate for the defined genetic variant refinement model, and (ii) selecting the optimal genetic variant refinement model as the defined genetic variant refinement model with an optimal Bayesian evidence numerical estimate as generated by the holistic Bayesian sampling routine, generating, using the one or more processors, the polygenic risk score based at least in part on the per-model polygenic risk score for the optimal genetic variant refinement model; and performing, using the one or more processors, one or more prediction-based actions based at least in part on the polygenic risk score.
2. The computer-implemented method of claim 1, wherein the holistic Bayesian sampling routine comprises a nested sampling routine.
3. The computer-implemented method of claim 1, wherein the holistic Bayesian sampling routine comprises a dynamic nested sampling routine.
4. The computer-implemented method of claim 1, wherein: the holistic Bayesian sampling routine comprises a nested sampling sub-routine and a dynamic nested sampling sub-routine, and the Bayesian evidence numerical estimate for a particular defined genetic variant refinement model is generated based at least in part on a first Bayesian evidence numerical estimate for the particular defined genetic variant refinement model as generated by the nested sampling sub-routine and a second Bayesian evidence numerical estimate for the particular defined genetic variant refinement model as generated by the dynamic nested sampling sub-routine.
5. The computer-implemented method of claim 4, wherein: the Bayesian evidence numerical estimate for the particular defined genetic variant refinement model is generated based at least in part on a cross-estimate weighted combination of the first Bayesian evidence numerical estimate and the second Bayesian evidence numerical estimate, and the cross-estimate weighted combination is generated based at least in part on a first historical model performance quality weight for the nested sampling routine and a second historical model performance quality weight for the dynamic nested sampling routine.
6. The computer-implemented method of claim 1, wherein: the comparatively-refined polygenic risk score generation machine learning framework further comprises a cross-model refinement model that is configured to generate a cross-model weighted combination of each per-model polygenic risk score for the plurality of defined genetic variant refinement models, the cross-model weighted combination is generated based at least in part on a plurality of probabilistic model quality weights for the plurality of defined genetic variant refinement models, and each probabilistic model quality weight for a respective defined genetic variant refinement model is generated based at least in part on the Bayesian evidence numerical estimate for the respective defined genetic variant refinement model as generated by the holistic Bayesian sampling routine.
7. The computer-implemented method of claim 6, wherein generating the polygenic risk score comprises: adopting the cross-model weighted combination as the polygenic risk score.
8. An apparatus for generating a polygenic risk score for a target phenotype using a comparatively-refined polygenic risk score generation model, the apparatus comprising at least one processor and at least one memory including program code, the at least one memory and the program code configured to, with the processor, cause the apparatus to at least: identify the comparatively-refined polygenic risk score generation machine learning framework, wherein: the comparatively-refined polygenic risk score generation machine learning framework comprises an optimal genetic variant refinement model that is selected from a plurality of defined genetic variant refinement models, each defined genetic variant refinement model: (i) is associated with: (a) a distinct per-model genetic variant set of a group of genetic variants, and (b) a per-model parameter set comprising a per-model effect weight parameter set for the distinct per-model genetic variant set that is associated with the defined genetic variant refinement model, and (ii) is configured to generate a per-model polygenic risk score based at least in part on a per-model input feature vector corresponding to the distinct per-model genetic variant set for the defined genetic variant refinement model and the per-model parameter set for the defined genetic variant refinement model, and generating the optimal genetic variant refinement model comprises: (i) for each defined genetic variant refinement model, sampling from a per-model posterior probability distribution for the defined genetic variant refinement model given target genome-wide association data for the target phenotype and by using a holistic Bayesian sampling routine that is configured to generate: (a) a per-model parameter numerical estimate set for the per-model parameter set that is associated with the defined genetic variant refinement model, and (b) a Bayesian evidence numerical estimate for the defined genetic variant refinement model, and (ii) selecting the optimal genetic variant refinement model as the defined genetic variant refinement model with an optimal Bayesian evidence numerical estimate as generated by the holistic Bayesian sampling routine, generate the polygenic risk score based at least in part on the per-model polygenic risk score for the optimal genetic variant refinement model; and perform one or more prediction-based actions based at least in part on the polygenic risk score.
9. The apparatus of claim 8, wherein the holistic Bayesian sampling routine comprises a nested sampling routine.
10. The apparatus of claim 8, wherein the holistic Bayesian sampling routine comprises a dynamic nested sampling routine.
11. The apparatus of claim 8, wherein: the holistic Bayesian sampling routine comprises
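The weighted combinations recited in claims 5 to 7 can be sketched as follows. This is a hedged illustration under stated assumptions: the claims do not specify the formulas, so averaging the two evidence estimates in log space, and turning log-evidence values into model weights by normalizing their exponentials (which corresponds to posterior model probabilities under equal prior model probabilities), are choices made here. The function names `combine_evidence`, `model_weights`, and `cross_model_prs` are hypothetical.

```python
import math


def combine_evidence(log_z_nested, log_z_dynamic, w_nested, w_dynamic):
    """Weighted average of two log-evidence estimates (assumed log-space)."""
    total = w_nested + w_dynamic
    return (w_nested * log_z_nested + w_dynamic * log_z_dynamic) / total


def model_weights(log_evidences):
    """Normalize exp(log-evidence) into weights that sum to one."""
    top = max(log_evidences)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lz - top) for lz in log_evidences]
    norm = sum(unnorm)
    return [u / norm for u in unnorm]


def cross_model_prs(per_model_scores, log_evidences):
    """Evidence-weighted combination of per-model polygenic risk scores."""
    weights = model_weights(log_evidences)
    return sum(w * s for w, s in zip(weights, per_model_scores))


# Made-up numbers for one individual scored by three candidate models.
log_z = [
    combine_evidence(-52.0, -51.0, w_nested=1.0, w_dynamic=1.0),
    combine_evidence(-48.0, -48.4, w_nested=1.0, w_dynamic=3.0),
    combine_evidence(-49.5, -49.5, w_nested=2.0, w_dynamic=1.0),
]
scores = [0.40, 0.65, 0.70]
print(model_weights(log_z), cross_model_prs(scores, log_z))
```

Selecting only the top-weighted model recovers the single-model selection of claim 1, while adopting the weighted sum as the final score corresponds to claim 7.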
for mining of medical data, e.g. analysing previous cases of other patients · CPC title
for calculating health indices; for individual health risk assessment · CPC title
Supervised data analysis · CPC title
Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection · CPC title
ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding · CPC title