Identification of traits associated with DNA samples using epigenetic-based patterns detected via massively parallel sequencing

US10490299B2 · US · B2

Patent metadata
Publication number: US-10490299-B2
Application number: US-201514732244-A
Country: US
Kind code: B2
Filing date: Jun 5, 2015
Priority date: Jun 6, 2014
Publication date: Nov 26, 2019
Grant date: Nov 26, 2019

Abstract

Official abstract text for this publication.

Illustrative embodiments of systems and methods for the identification of traits associated with DNA samples using epigenetic-based patterns detected via massively parallel sequencing (MPS) are disclosed. Illustrative embodiments may involve digesting a DNA sample with a methylation-dependent endonuclease, amplifying loci of the digested DNA sample (including a positive control locus that does not contain a restriction site for the methylation-dependent endonuclease) using a multiplex PCR to produce amplicons, sequencing the amplicons using an MPS instrument to generate sequence reads, determining a sequence count for each of the loci by comparing each of the sequence reads to reference sequences, normalizing the sequence count for each of the loci to the sequence count of the positive control locus, and identifying a trait associated with the DNA sample by applying a classification algorithm to the normalized sequence counts.
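The counting and normalization steps described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes exact-match comparison of reads to reference sequences (one of the options the claims mention), and the locus names, sequences, and the helper names `count_reads` and `normalize_counts` are hypothetical.

```python
def count_reads(reads, references):
    """Count the reads that exactly match each locus reference sequence.

    `references` maps a locus name to its reference sequence (hypothetical data).
    Reads that match no reference are ignored.
    """
    seq_to_locus = {seq: locus for locus, seq in references.items()}
    counts = {locus: 0 for locus in references}
    for read in reads:
        locus = seq_to_locus.get(read)
        if locus is not None:
            counts[locus] += 1
    return counts


def normalize_counts(counts, control_locus):
    """Divide every locus count by the count of the positive control locus."""
    control = counts[control_locus]
    if control == 0:
        raise ValueError("positive control locus has no reads")
    return {locus: n / control for locus, n in counts.items()}


# Toy data, invented for illustration only.
references = {
    "locus_A": "ACGTAC",
    "locus_B": "TTGACA",
    "control": "GGCATC",
}
reads = ["ACGTAC", "GGCATC", "GGCATC", "TTGACA",
         "ACGTAC", "GGCATC", "NNNNNN", "GGCATC"]

counts = count_reads(reads, references)
normalized = normalize_counts(counts, "control")
print(counts)
print(normalized)
```

Because the positive control locus lacks a restriction site for the endonuclease, its count is unaffected by digestion, so dividing by it puts samples with different read depths on a comparable scale. In this sketch the control itself normalizes to 1.0 and could be dropped before classification.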

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: digesting a deoxyribonucleic acid (DNA) sample with a methylation-dependent endonuclease; amplifying a plurality of loci of the digested DNA sample using a multiplex polymerase chain reaction (PCR) to produce a plurality of amplicons, at least one of the plurality of loci being a positive control locus that does not contain a restriction site for the methylation-dependent endonuclease, the positive control locus being a 97 base pair amplicon located on chromosome 11 between SEQ ID NO: 19 and SEQ ID NO: 20; sequencing the plurality of amplicons using a massively parallel sequencing (MPS) instrument to generate a plurality of sequence reads; determining a sequence count for each of the plurality of loci by comparing each of the plurality of sequence reads to a plurality of reference sequences, each of the plurality of reference sequences being associated with one of the plurality of loci; normalizing the sequence count for each of the plurality of loci to the sequence count of the positive control locus; and identifying a trait associated with the DNA sample by applying a classification algorithm to the normalized sequence counts.

2. The method of claim 1, wherein comparing each of the plurality of sequence reads to the plurality of reference sequences comprises determining whether each of the plurality of sequence reads sufficiently aligns with any of the plurality of reference sequences.

3. The method of claim 1, wherein comparing each of the plurality of sequence reads to the plurality of reference sequences comprises determining whether each of the plurality of sequence reads exactly matches any of the plurality of reference sequences.

4. The method of claim 1, wherein the multiplex PCR uses unlabeled primers.

5. The method of claim 1, further comprising removing amplicons having a length that is outside a predetermined range prior to sequencing the plurality of amplicons.

6. The method of claim 5, wherein the predetermined range is about 50 base pairs to about 500 base pairs.

7. The method of claim 1, wherein applying the classification algorithm comprises applying a k-Nearest Neighbor (k-NN) algorithm.

8. The method of claim 7, wherein applying the k-NN algorithm comprises computing un-weighted Euclidean distances of normalized sequence counts between the DNA sample and a plurality of reference samples associated with different traits.

9. The method of claim 1, further comprising: labeling each of the plurality of amplicons with a unique nucleotide index; mixing the plurality of amplicons with additional amplicons that have each been labeled with a unique nucleotide index; and sequencing the additional amplicons using the MPS instrument at the same time as sequencing the plurality of amplicons.

10. The method of claim 9, wherein the additional amplicons contain short tandem repeats that are used to allelotype the DNA sample.

11. The method of claim 9, wherein the additional amplicons contain single nucleotide polymorphisms that are used to allelotype the DNA sample.

12. The method of claim 1, wherein at least one of the plurality of loci is a negative control locus that is substantially digested by the methylation-dependent endonuclease irrespective of the trait associated with the DNA sample.

13. The method of claim 12, wherein the negative control locus is a 94 base pair amplicon located on chromosome 7 between SEQ ID NO: 17 and SEQ ID NO: 18.

14. The method of claim 1, wherein the methylation-dependent endonuclease is Hha1.

15. The method of claim 1, wherein the plurality of loci comprise one or more of: a 66 base pair amplicon located on chromosome 12 between SEQ ID NO: 1 and SEQ ID NO: 2, a 70 base pair amplicon located on chromosome 3 between SEQ ID NO: 3 and SEQ ID NO: 4, a 75 base pair amplicon located on chromosome 3 between SEQ ID NO: 5 and SEQ ID NO: 6, a 75 base pair amplicon located on chromosome 22 between SEQ ID NO: 7 and SEQ ID NO: 8, an 80 base pair amplicon located on chromosome 1 between SEQ ID NO: 9 and SEQ ID NO: 10, an 89 base pair amplicon located on chromosome 19 between SEQ ID NO: 11 and SEQ ID NO: 12, a 100 base pair amplicon located on chromosome 19 between SEQ ID NO: 13 and SEQ ID NO: 14, and a 130 base pair amplicon located on chromosome 19 between SEQ ID NO: 15 and SEQ ID NO: 16.

16. The method of claim 1, wherein the plurality of loci consist of: a 66 base pair amplicon located on chromosome 12 between SEQ ID NO: 1 and SEQ ID NO: 2, a 70 base pair amplicon located on chromosome 3 between SEQ ID NO: 3 and SEQ ID NO: 4, a 75 base pair amplicon located on chromosome 3 between SEQ ID NO: 5 and SEQ ID NO: 6, a 75 base pair amplicon located on chromosome 22 between SEQ ID NO: 7 and SEQ ID NO: 8, an 80 base pair amplicon located on chromosome 1 between SEQ ID NO: 9 and SEQ ID NO: 10, an 89 base pair amplicon located on chromosome 19 between SEQ ID NO: 11 and SEQ ID NO: 12, a 100 base pair amplicon located on chromosome 19 between SEQ ID NO: 13 and SEQ ID NO: 14, a 130 base pair amplicon located on chromosome 19 between SEQ ID NO: 15 and SEQ ID NO: 16, and a 94 base pair region located on chromosome 7 between SEQ ID NO: 17 and SEQ ID NO: 18.

17. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a tissue source from which the DNA sample was derived.

18. The method of claim 17, wherein identifying the tissue source from which the DNA sample was derived comprises determining whether the tissue source is blood, skin, saliva, or semen.

19. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a cell type from which the DNA sample was derived.

20. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying an age of an organism from which the DNA sample was derived.

21. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a disease state or risk of disease of an organism from which the DNA sample was derived.

22. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a response to environmental signals of an organism from which the DNA sample was derived.

23. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a body mass index or an obesity state of an organism from which the DNA sample was derived.

24. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying an expression level of one or more genes in an organism from which the DNA sample was derived.

25. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a physical characteristic of an organism from which the DNA sample was derived.

26. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying a drug response of an organism from which the DNA sample was derived.

27. The method of claim 1, wherein identifying the trait associated with the DNA sample comprises identifying an epigenetic inheritance of an organism from which the DNA sample was derived.

28. The method of claim 1, wherein identifying the trait associated with
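Claims 7 and 8 describe classification by k-Nearest Neighbor using un-weighted Euclidean distances between normalized sequence counts. The sketch below shows one plain way such a classifier could look. It is an assumption-laden illustration rather than the patentee's code: the value of `k`, the majority-vote rule, the function names, and all reference values are invented here.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Un-weighted Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(sample, reference_samples, k=3):
    """Return the majority trait label among the k nearest reference samples.

    `reference_samples` is a list of (normalized_counts, trait_label) pairs,
    where normalized_counts lists the loci in the same order as `sample`.
    """
    nearest = sorted(reference_samples,
                     key=lambda ref: euclidean(sample, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical normalized counts for two loci; values are made up.
reference_samples = [
    ((0.9, 0.1), "blood"),
    ((0.8, 0.2), "blood"),
    ((0.1, 0.9), "saliva"),
    ((0.2, 0.8), "saliva"),
    ((0.5, 0.5), "semen"),
]
unknown = (0.85, 0.15)

predicted = knn_classify(unknown, reference_samples, k=3)
print(predicted)
```

In this toy data the three nearest references to `unknown` are the two blood samples and the semen sample, so the majority vote is blood. A real panel would use one dimension per locus and a reference set large enough to make the choice of `k` meaningful.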

Assignees

Battelle Memorial Institute

Inventors

Classifications

  • Methods for sequencing

  • G16B40/00 (Primary): ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

  • G16B20/00 (Primary): ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations

  • Signal processing, e.g. from mass spectrometry [MS] or from PCR

  • Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10490299B2 cover?
Illustrative embodiments of systems and methods for the identification of traits associated with DNA samples using epigenetic-based patterns detected via massively parallel sequencing (MPS) are disclosed. Illustrative embodiments may involve digesting a DNA sample with a methylation-dependent endonuclease, amplifying loci of the digested DNA sample (including a positive control locus that does …
Who is the assignee on this patent?
Battelle Memorial Institute
What technology area does this patent fall under?
Primary CPC classification G16B40/00. Mapped technology areas include Physics.
When was this patent published?
Published Nov 26, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).