Automated design of primer sets for nucleic acid amplification
US-2024336954-A1 · Oct 10, 2024 · US
US10558915B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10558915-B2 |
| Application number | US-201916413476-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 15, 2019 |
| Priority date | Oct 16, 2017 |
| Publication date | Feb 11, 2020 |
| Grant date | Feb 11, 2020 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
The technology disclosed relates to constructing a convolutional neural network-based classifier for variant classification. In particular, it relates to training a convolutional neural network-based classifier on training data using a backpropagation-based gradient update technique that progressively match outputs of the convolutional neutral network-based classifier with corresponding ground truth labels. The convolutional neural network-based classifier comprises groups of residual blocks, each group of residual blocks is parameterized by a number of convolution filters in the residual blocks, a convolution window size of the residual blocks, and an atrous convolution rate of the residual blocks, the size of convolution window varies between groups of residual blocks, the atrous convolution rate varies between groups of residual blocks. The training data includes benign training examples and pathogenic training examples of translated sequence pairs generated from benign variants and pathogenic variants.
Opening claim text (preview).
What is claimed is: 1. A method of constructing a variant pathogenicity classifier, the method including: training a convolutional neural network-based variant pathogenicity classifier, which runs on numerous processors coupled to memory, using as input benign training example pairs and pathogenic training example pairs of reference protein sequences and alternative protein sequences, wherein the alternative protein sequences are generated from benign variants and pathogenic variants; and wherein the benign variants include common human missense variants and non-human primate missense variants occurring on alternative non-human primate codon sequences that share matching reference codon sequences with humans. 2. The method of claim 1 , wherein the common human missense variants have a minor allele frequency (abbreviated MAF) greater than 0.1% across a human population variant dataset sampled from at least 100000 humans. 3. The method of claim 2 , wherein the sampled humans belong to different human subpopulations and the common human missense variants have a MAF greater than 0.1% within respective human subpopulation variant datasets. 4. The method of claim 3 , wherein the human subpopulations include African/African American (abbreviated AFR), American (abbreviated AMR), Ashkenazi Jewish (abbreviated ASJ), East Asian (abbreviated EAS), Finnish (abbreviated FIN), Non-Finnish European (abbreviated NFE), South Asian (abbreviated SAS), and Others (abbreviated OTH). 5. The method of claim 1 , wherein the non-human primate missense variants include missense variants from a plurality of non-human primate species, including Chimpanzee, Bonobo, Gorilla, B. Orangutan, S. Orangutan, Rhesus, and Marmoset. 6. The method of claim 1 , further including, based on an enrichment analysis, accepting a particular non-human primate species for inclusion of missense variants of the particular non-human primate species among the benign variants, wherein the enrichment analysis includes, for the particular non-human primate species, comparing a first enrichment score of synonymous variants of the particular non-human primate species to a second enrichment score of missense identical variants of the particular non-human primate species; wherein missense identical variants are missense variants that share matching reference and alternative codon sequences with humans; wherein the first enrichment score is produced by determining a ratio of rare synonymous variants with a MAF less than 0.1% over common synonymous variants with a MAF greater than 0.1%; and wherein the second enrichment score is produced by determining a ratio of rare missense identical variants with a MAF less than 0.1% over common missense identical variants with a MAF greater than 0.1%. 7. The method of claim 6 , wherein rare synonymous variants include singleton variants. 8. The method of claim 6 , wherein a difference between the first enrichment score and the second enrichment score is within a predetermined range, further including accepting the particular non-human primate species for inclusion of missense variants of the particular non-human primate among the benign variants. 9. The method of claim 6 , wherein the difference being in the predetermined range indicates that the missense identical variants are under a same degree of natural selection as the synonymous variants and therefore benign as the synonymous variants. 10. The method of claim 6 , further including repeatedly applying the enrichment analysis to accept a plurality of non-human primate species for inclusion of missense variants of the non-human primate species among the benign variants. 11. The method of claim 1 , further including using a chi-squared test of homogeneity to compare a first enrichment score of synonymous variants and a second enrichment score of missense identical variants for each of the non-human primate species. 12. The method of claim 1 , wherein a count of the non-human primate missense variants is at least 100000. 13. The method of claim 12 , wherein the count of the non-human primate missense variants is 385236. 14. The method of claim 1 , wherein a count of the common human missense variants is at least 50000. 15. The method of claim 14 , wherein the count of the common human missense variants is 83546. 16. A non-transitory computer readable storage medium impressed with computer program instructions to construct a variant pathogenicity classifier, when executed on a processor, implement a method comprising: training a convolutional neural network-based variant pathogenicity classifier, which runs on numerous processors coupled to memory, using as input benign training example pairs and pathogenic training example pairs of reference protein sequences and alternative protein sequences, wherein the alternative protein sequences are generated from benign variants and pathogenic variants; and wherein the benign variants include common human missense variants and non-human primate missense variants occurring on alternative non-human primate codon sequences that share matching reference codon sequences with humans. 17. The non-transitory computer readable storage medium of claim 16 , implementing the method further comprising, based on an enrichment analysis, accepting a particular non-human primate species for inclusion of missense variants of the particular non-human primate species among the benign variants, wherein the enrichment analysis includes, for the particular non-human primate species, comparing a first enrichment score of synonymous variants of the particular non-human primate species to a second enrichment score of missense identical variants of the particular non-human primate species; wherein missense identical variants are missense variants that share matching reference and alternative codon sequences with humans; wherein the first enrichment score is produced by determining a ratio of rare synonymous variants with a MAF less than 0.1% over common synonymous variants with a MAF greater than 0.1%; and wherein the second enrichment score is produced by determining a ratio of rare missense identical variants with a MAF less than 0.1% over common missense identical variants with a MAF greater than 0.1%. 18. The non-transitory computer readable storage medium of claim 16 , implementing the method further comprising using a chi-squared test of homogeneity to compare a first enrichment score of synonymous variants and a second enrichment score of missense identical variants for each of the non-human primate species. 19. A system including one or more processors coupled to memory, the memory loaded with computer instructions to construct a variant pathogenicity classifier, the instructions, when executed on the processors, implement actions comprising: training a convolutional neural network-based variant pathogenicity classifier, which runs on numerous processors coupled to memory, using as input benign training example pairs and pathogenic training example pairs of reference protein sequences and alternative protein sequences, wherein the alternative protein sequences are generated from benign variants and pathogenic variants; and wherein the benign variants include common human missense variants and non-human primate missense variants occurring on alternative non-human primate codon sequences that share matching reference codon sequences with humans. 20. The system of claim 19 , further implementing actions, based on an enrichment analysis, accepting a particular non-human
relating to pathologies · CPC title
ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding · CPC title
Combinations of networks · CPC title
Smoothing the distance, e.g. radial basis function networks [RBFN] · CPC title
Activation functions · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.