US10490299B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10490299-B2 |
| Application number | US-201514732244-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 5, 2015 |
| Priority date | Jun 6, 2014 |
| Publication date | Nov 26, 2019 |
| Grant date | Nov 26, 2019 |
Official abstract text for this publication.
Illustrative embodiments of systems and methods for the identification of traits associated with DNA samples using epigenetic-based patterns detected via massively parallel sequencing (MPS) are disclosed. Illustrative embodiments may involve digesting a DNA sample with a methylation-dependent endonuclease, amplifying loci of the digested DNA sample (including a positive control locus that does not contain a restriction site for the methylation-dependent endonuclease) using a multiplex PCR to produce amplicons, sequencing the amplicons using an MPS instrument to generate sequence reads, determining a sequence count for each of the loci by comparing each of the sequence reads to reference sequences, normalizing the sequence count for each of the loci to the sequence count of the positive control locus, and identifying a trait associated with the DNA sample by applying a classification algorithm to the normalized sequence counts.
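The counting and normalization steps in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the exact-match comparison variant, and the function names (`count_reads`, `normalize_counts`), locus names, and toy sequences are all hypothetical.

```python
from collections import Counter


def count_reads(reads, references):
    """Count, per locus, the reads that exactly match that locus's reference.

    `references` maps a locus name to its reference sequence (hypothetical layout).
    Reads that match no reference are ignored.
    """
    locus_by_sequence = {seq: locus for locus, seq in references.items()}
    counts = Counter({locus: 0 for locus in references})
    for read in reads:
        locus = locus_by_sequence.get(read)
        if locus is not None:
            counts[locus] += 1
    return dict(counts)


def normalize_counts(counts, control_locus):
    """Divide every locus count by the count of the positive control locus."""
    control = counts[control_locus]
    if control == 0:
        raise ValueError("positive control locus has no reads")
    return {locus: n / control for locus, n in counts.items()}


# Toy data: made-up sequences, not the SEQ ID NOs of the patent.
references = {
    "control": "ACGTACGT",
    "locus_a": "TTGGCCAA",
    "locus_b": "GGGTTTAA",
}
reads = ["ACGTACGT"] * 4 + ["TTGGCCAA"] * 2 + ["GGGTTTAA"] + ["NNNNNNNN"]

counts = count_reads(reads, references)
normalized = normalize_counts(counts, "control")
print(counts)
print(normalized)
```

Because the positive control locus lacks a restriction site for the endonuclease, its count should not depend on methylation, which is why it serves as the denominator here.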
Opening claim text (preview).
The invention claimed is:
1. A method comprising: digesting a deoxyribonucleic acid (DNA) sample with a methylation-dependent endonuclease; amplifying a plurality of loci of the digested DNA sample using a multiplex polymerase chain reaction (PCR) to produce a plurality of amplicons, at least one of the plurality of loci being a positive control locus that does not contain a restriction site for the methylation-dependent endonuclease, the positive control locus being a 97 base pair amplicon located on chromosome 11 between SEQ ID NO: 19 and SEQ ID NO: 20; sequencing the plurality of amplicons using a massively parallel sequencing (MPS) instrument to generate a plurality of sequence reads; determining a sequence count for each of the plurality of loci by comparing each of the plurality of sequence reads to a plurality of reference sequences, each of the plurality of reference sequences being associated with one of the plurality of loci; normalizing the sequence count for each of the plurality of loci to the sequence count of the positive control locus; and identifying a trait associated with the DNA sample by applying a classification algorithm to the normalized sequence counts.
2. The method of claim 1, wherein comparing each of the plurality of sequence reads to the plurality of reference sequences comprises determining whether each of the plurality of sequence reads sufficiently aligns with any of the plurality of reference sequences.
3. The method of claim 1, wherein comparing each of the plurality of sequence reads to the plurality of reference sequences comprises determining whether each of the plurality of sequence reads exactly matches any of the plurality of reference sequences.
4. The method of claim 1, wherein the multiplex PCR uses unlabeled primers.
5. The method of claim 1, further comprising removing amplicons having a length that is outside a predetermined range prior to sequencing the plurality of amplicons.
6. The method of claim 5, wherein the predetermined range is about 50 base pairs to about 500 base pairs.
7. The method of claim 1, wherein applying the classification algorithm comprises applying a k-Nearest Neighbor (k-NN) algorithm.
8. The method of claim 7, wherein applying the k-NN algorithm comprises computing un-weighted Euclidean distances of normalized sequence counts between the DNA sample and a plurality of reference samples associated with different traits.
9. The method of claim 1, further comprising: labeling each of the plurality of amplicons with a unique nucleotide index; mixing the plurality of amplicons with additional amplicons that have each been labeled with a unique nucleotide index; and sequencing the additional amplicons using the MPS instrument at the same time as sequencing the plurality of amplicons.
10. The method of claim 9, wherein the additional amplicons contain short tandem repeats that are used to allelotype the DNA sample.
11. The method of claim 9, wherein the additional amplicons contain single nucleotide polymorphisms that are used to allelotype the DNA sample.
12. The method of claim 1, wherein at least one of the plurality of loci is a negative control locus that is substantially digested by the methylation-dependent endonuclease irrespective of the trait associated with the DNA sample.
13. The method of claim 12, wherein the negative control locus is a 94 base pair amplicon located on chromosome 7 between SEQ ID NO: 17 and SEQ ID NO: 18.
14. The method of claim 1, wherein the methylation-dependent endonuclease is Hha1.
15. The method of claim 1, wherein the plurality of loci comprise one or more of: a 66 base pair amplicon located on chromosome 12 between SEQ ID NO: 1 and SEQ ID NO: 2, a 70 base pair amplicon located on chromosome 3 between SEQ ID NO: 3 and SEQ ID NO: 4, a 75 base pair amplicon located on chromosome 3 between SEQ ID NO: 5 and SEQ ID NO: 6, a 75 base pair amplicon located on chromosome 22 between SEQ ID NO: 7 and SEQ ID NO: 8, an 80 base pair amplicon located on chromosome 1 between SEQ ID NO: 9 and SEQ ID NO: 10, an 89 base pair amplicon located on chromosome 19 between SEQ ID NO: 11 and SEQ ID NO: 12, a 100 base pair amplicon located on chromosome 19 between SEQ ID NO: 13 and SEQ ID NO: 14, and a 130 base pair amplicon located on chromosome 19 between SEQ ID NO: 15 and SEQ ID NO: 16.
16. The method of claim 1, wherein the plurality of loci consist of: a 66 base pair amplicon located on chromosome 12 between SEQ ID NO: 1 and SEQ ID NO: 2, a 70 base pair amplicon located on chromosome 3 between SEQ ID NO: 3 and SEQ ID NO: 4, a 75 base pair amplicon located on chromosome 3 between SEQ ID NO: 5 and SEQ ID NO: 6, a 75 base pair amplicon located on chromosome 22 between SEQ ID NO: 7 and SEQ ID NO: 8, an 80 base pair amplicon located on chromosome 1 between SEQ ID NO: 9 and SEQ ID NO: 10, an 89 base pair amplicon located on chromosome 19 between SEQ ID NO: 11 and SEQ ID NO: 12, a 100 base pair amplicon located on chromosome 19 between SEQ ID NO: 13 and SEQ ID NO: 14, a 130 base pair amplicon located on chromosome 19 between SEQ ID NO: 15 and SEQ ID NO: 16, and a 94 base pair region located on chromosome 7 between SEQ ID NO: 17 and SEQ ID NO: 18.
17. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a tissue source from which the DNA sample was derived.
18. The method of claim 17, wherein identifying the tissue source from which the DNA sample was derived comprises determining whether the tissue source is blood, skin, saliva, or semen.
19. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a cell type from which the DNA sample was derived.
20. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying an age of an organism from which the DNA sample was derived.
21. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a disease state or risk of disease of an organism from which the DNA sample was derived.
22. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a response to environmental signals of an organism from which the DNA sample was derived.
23. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a body mass index or an obesity state of an organism from which the DNA sample was derived.
24. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying an expression level of one or more genes in an organism from which the DNA sample was derived.
25. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a physical characteristic of an organism from which the DNA sample was derived.
26. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a drug response of an organism from which the DNA sample was derived.
27. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying an epigenetic inheritance of an organism from which the DNA sample was derived.
28. The method of claim 1, wherein identifying the trait associated with
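Claims 7 and 8 recite a k-NN classifier that uses un-weighted Euclidean distances between normalized sequence counts. A minimal sketch of that idea follows, using tissue labels from claim 18. The function names, the value of `k`, the majority-vote rule, the locus names, and every reference value are assumptions made for illustration; the claims do not specify them.

```python
import math
from collections import Counter


def euclidean(a, b, loci):
    """Un-weighted Euclidean distance between two normalized-count profiles."""
    return math.sqrt(sum((a[locus] - b[locus]) ** 2 for locus in loci))


def knn_classify(sample, reference_samples, k=3):
    """Return the most common label among the k nearest reference samples.

    `reference_samples` is a list of (profile, label) pairs (hypothetical layout).
    Majority voting is an assumption; the claims only recite the distance measure.
    """
    loci = sorted(sample)
    ranked = sorted(
        reference_samples,
        key=lambda ref: euclidean(sample, ref[0], loci),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up normalized counts for two hypothetical loci; not data from the patent.
reference_samples = [
    ({"locus_a": 0.9, "locus_b": 0.1}, "blood"),
    ({"locus_a": 0.8, "locus_b": 0.2}, "blood"),
    ({"locus_a": 0.1, "locus_b": 0.9}, "semen"),
    ({"locus_a": 0.2, "locus_b": 0.8}, "semen"),
    ({"locus_a": 0.5, "locus_b": 0.5}, "saliva"),
]
sample = {"locus_a": 0.85, "locus_b": 0.15}

predicted = knn_classify(sample, reference_samples, k=3)
print(predicted)
```

In practice the profiles would hold one normalized count per amplified locus, and the reference set would come from samples of known tissue source.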
Methods for sequencing · CPC title
ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding · CPC title
ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations · CPC title
Signal processing, e.g. from mass spectrometry [MS] or from PCR · CPC title
Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection · CPC title