Selection device for candidate sequence information for similarity determination, selection method, and use for such device and method
US-2015379197-A1 · Dec 31, 2015 · US
US2016357903A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016357903-A1 |
| Application number | US-201415023355-A |
| Country | US |
| Kind code | A1 |
| Filing date | Sep 20, 2014 |
| Priority date | Sep 20, 2013 |
| Publication date | Dec 8, 2016 |
| Grant date | — |
Official abstract text for this publication.
Current methods for annotating and interpreting human genetic variation typically exploit only a single information type (e.g., conservation) and/or are restricted in scope (e.g., to missense changes). Here, a method for objectively integrating many diverse annotations into a single measure (integrated deleteriousness score, or C-score) for each variant is described. The method may be implemented as a support vector machine (SVM) trained to differentiate high-frequency human-derived alleles from simulated variants. C-scores were precomputed for all 8.6 billion possible human single-nucleotide variants and allow scoring of short insertions-deletions. C-scores correlate with allelic diversity, annotations of functionality, pathogenicity, disease severity, experimentally measured regulatory effects and complex trait associations, and they highly rank known pathogenic variants within individual genomes. The ability of CADD to prioritize functional, deleterious and pathogenic variants across many functional categories, effect sizes and genetic architectures is unmatched by any current single-annotation method.
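The training setup described in the abstract (a linear classifier separating simulated variants from observed ones) can be illustrated with a minimal sketch. It is not the patent's implementation. It fits a linear SVM-style model by subgradient descent on the regularised hinge loss, using made-up two-column annotation rows. The function names, the toy data, the hyperparameters, and the choice to label simulated variants as +1 are all assumptions made for illustration.

```python
def decision(w, b, x):
    """Signed score w.x + b; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(rows, labels, epochs=200, lr=0.1, lam=0.01):
    """Fit (w, b) by subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            # Check the margin before touching the weights.
            violated = y * decision(w, b, x) < 1
            # The regularisation term shrinks w on every step.
            w = [wi * (1 - lr * lam) for wi in w]
            if violated:
                # Hinge-loss subgradient step toward the correct side.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


# Hypothetical annotation rows (two made-up features per variant).
simulated = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95]]   # assumed label +1
observed = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.15]]    # assumed label -1

rows = simulated + observed
labels = [1] * len(simulated) + [-1] * len(observed)
w, b = train_linear_svm(rows, labels)

for x in rows:
    print(x, round(decision(w, b, x), 3))
```

In this toy setup a higher score means a variant looks more like the simulated set than the observed set. A real model would use the full annotation matrix rather than two hand-picked columns.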
Opening claim text (preview).
1. A method performed by a computing system for determining the relative effect of a genetic variant comprising: applying a machine learning model to a dataset, wherein the dataset comprises one or more genetic variants, each of which is associated with values or states of each of a set of annotations; and calculating an integrated deleteriousness score for each of the one or more genetic variants; wherein the integrated deleteriousness score of each genetic variant is used to determine the relative effect of said genetic variant when compared to a set of reference deleteriousness scores.
2. The method of claim 1, wherein the machine learning model is a support vector machine (SVM) model.
3. The method of claim 2, wherein the SVM model is trained to distinguish between a set of simulated variants and a set of observed variants.
4. The method of claim 2, wherein the SVM model is trained using a linear kernel on features derived from an annotation matrix.
5. The method of claim 4, wherein the SVM model fits a hyperplane defined by:

$$
0 = \beta_0 + \sum_{i=1}^{166} \beta_i X_i
+ \sum_{i=1}^{5} \sum_{j=1}^{5} \gamma_{ij}\, \mathbf{1}\{i\text{th Ref category and } j\text{th Alt category}\}
+ \sum_{i=1}^{21} \sum_{j=1}^{21} \delta_{ij}\, \mathbf{1}\{i\text{th oAA category and } j\text{th nAA category}\}
+ \sum_{i=1}^{11} \tau_i W_i
+ \sum_{i=1}^{17} \sum \ldots
$$
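As a rough illustration of claims 1 and 5, the sketch below evaluates a simplified version of the linear score and then places it relative to a set of reference scores. It keeps only the intercept, the linear annotation terms, and the Ref/Alt indicator term, and omits the amino-acid and remaining terms of the hyperplane. The function names, the coefficient values, and the use of "fraction of reference scores below" as the comparison are assumptions for illustration, not details taken from the claims.

```python
from bisect import bisect_left


def raw_score(beta0, beta, x, gamma, ref, alt):
    """beta0 + sum_i beta_i * x_i + gamma[(ref, alt)].

    The indicator term in the hyperplane selects exactly one gamma
    coefficient: the one for this variant's Ref/Alt pair.
    """
    linear = sum(b * xi for b, xi in zip(beta, x))
    return beta0 + linear + gamma.get((ref, alt), 0.0)


def relative_rank(score, reference_scores):
    """Fraction of reference scores strictly below `score`."""
    ordered = sorted(reference_scores)
    return bisect_left(ordered, score) / len(ordered)


# Hypothetical coefficients and annotation values.
beta0 = -0.5
beta = [1.0, 2.0]
gamma = {("C", "T"): 0.25}
x = [0.5, 0.25]

score = raw_score(beta0, beta, x, gamma, "C", "T")
print(score)  # 0.75

# Hypothetical reference deleteriousness scores.
reference = [-1.0, 0.0, 0.5, 1.0, 2.0]
rank = relative_rank(score, reference)
print(rank)  # 0.6
```

Storing the indicator coefficients in a dictionary keyed by the (Ref, Alt) pair avoids building one-hot columns, since only one pair is active for any given variant.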
ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding · CPC title
Physics · mapped topic
with arrangements for mixing one gas and one liquid · CPC title
Unsupervised data analysis · CPC title