US9708653B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9708653-B2 |
| Application number | US-201314379128-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 15, 2013 |
| Priority date | Feb 15, 2012 |
| Publication date | Jul 18, 2017 |
| Grant date | Jul 18, 2017 |
Official abstract text for this publication.
Short tandem repeats (STRs) are currently used by law enforcement and others, for example, to identify individuals by DNA matching. A method is described herein that uses wavelet packet decomposition (WPD) to classify and identify repeating sequences in nucleotide sequences from the position and frequency information those sequences contain. This decomposition allows sequencer reads to be classified quickly into two classes: for example, one class of reads that contain a repeat motif with non-repeat sequence on either flank, and another class of reads that contain no repeat sequence.
Opening claim text (preview).
What is claimed is:
1. A method for identifying repeating sequences in a target nucleic acid comprising repeating sequences and non-repeating sequences, the method comprising the steps of sequencing the target nucleic acid to obtain sequence data; digitizing, with one or more processors, the sequence data; applying, with the one or more processors, wavelet packet decomposition (WPD) to decompose the digitized sequence data into non-periodic signal data and periodic signal data comprising coefficients; classifying, with the one or more processors, the non-periodic signal data into a non-repeat bin and the periodic signal data into a repeat bin based upon the coefficients; and identifying the repeating sequences in the target nucleic acid by matching, with the one or more processors, only the coefficients from the periodic signal data in the repeat bin to reference coefficients generated from WPD of a reference sequence.
2. The method of claim 1 wherein the repeating sequences comprise different loci.
3. The method of claim 2 wherein different repeating sequence alleles at a locus have distinguishable coefficients and the distinguishable coefficients allow the different repeating sequence alleles to be distinguished from each other and from other short tandem repeat (STR) loci.
4. The method of claim 1 wherein the target nucleic acid comprises more than one allele for each of the repeating sequences in the target nucleic acid.
5. The method of claim 4 wherein the different alleles have distinguishable coefficients and wherein the distinguishable coefficients allow the different alleles to be distinguished from each other.
6. The method of claim 1 wherein the method provides information selected from the group consisting of the location, the frequency, and the length of the repeating sequences, or a combination thereof, in the target nucleic acid.
7. The method of claim 1 wherein one or more repeating sequences identified are compared to sequence data in a DNA database.
8. The method of claim 7 wherein the database is a national government DNA database.
9. The method of claim 8 wherein the database is selected from the group consisting of CODIS, NDNAD, and FNAEG.
10. The method of claim 9 wherein the database is CODIS.
11. The method of claim 9 wherein the database is NDNAD.
12. The method of claim 7 wherein the database is GenBank, EMBL, DDBJ or a similar government database, or a custom sequence database.
13. The method of claim 7 wherein the repeating sequences are short tandem repeats (STRs) and the sequence data in the DNA database comprises short tandem repeat (STR) loci.
14. The method of claim 7 wherein the repeating sequences are short tandem repeats (STRs) and the sequence data in the DNA database comprises about 7 to about 16 short tandem repeat (STR) loci.
15. The method of claim 1 wherein the reference sequence comprises short tandem repeats (STRs).
16. The method of claim 15 wherein the reference sequence further comprises an allele for each short tandem repeat (STR).
17. The method of claim 1 wherein the target nucleic acid is a DNA or RNA.
18. The method of claim 1 wherein the repeating sequences are tandem repeats.
19. The method of claim 1 wherein applying WPD comprises recursively applying, with the one or more processors, low-pass and high-pass quadrature mirror filters to the digitized sequence data.
20. The method of claim 1 wherein classifying the non-periodic signal data into a non-repeat bin and the periodic signal data into a repeat bin based upon the coefficients comprises determining whether particular data is the non-periodic signal data or periodic signal data by comparing, with the one or more processors, a maximum coefficient from among the coefficients to a threshold value.
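The processing steps recited in claims 1, 19, and 20 (digitize, recursively filter, compare a maximum coefficient to a threshold, match against reference coefficients) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and does not come from the claims: the numeric base encoding, the choice of Haar filters as the quadrature mirror pair, the energy-normalized score, the threshold of 0.8, the truncation to a power-of-two length, and all function names.

```python
import math

# Assumption: a simple numeric encoding. The claims only say the data is "digitized".
BASE_VALUES = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}

def digitize(read):
    """Map a read to a zero-mean numeric signal (raises KeyError on other symbols)."""
    values = [BASE_VALUES[base] for base in read.upper()]
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def haar_split(signal):
    """One filtering step with Haar low-pass and high-pass filters, downsampled by 2.
    A trailing unpaired sample is dropped."""
    root2 = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    low = [(signal[i] + signal[i + 1]) / root2 for i in pairs]
    high = [(signal[i] - signal[i + 1]) / root2 for i in pairs]
    return low, high

def wpd(signal, levels):
    """Wavelet packet decomposition: split every node, low and high, at each level.
    Returns the leaf coefficients as one flat list."""
    nodes = [signal]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return [c for node in nodes for c in node]

def periodicity_score(coeffs):
    """Largest coefficient magnitude relative to total signal energy (0 to 1)."""
    energy = sum(c * c for c in coeffs)
    if energy == 0.0:
        return 0.0
    return max(abs(c) for c in coeffs) / math.sqrt(energy)

def classify(read, threshold=0.8):
    """Bin a read as 'repeat' or 'non-repeat' from its maximum coefficient.
    Simplification: the read is truncated to the largest power-of-two length."""
    n = 1 << (len(read).bit_length() - 1)
    coeffs = wpd(digitize(read[:n]), levels=n.bit_length() - 1)
    label = "repeat" if periodicity_score(coeffs) >= threshold else "non-repeat"
    return label, coeffs

def match_reference(coeffs, references):
    """Return the name of the reference whose coefficients are closest
    (squared Euclidean distance); assumes equal-length coefficient lists."""
    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(coeffs, references[name]))
    return min(references, key=distance)

if __name__ == "__main__":
    # Hypothetical reference alleles, decomposed the same way as the reads.
    references = {name: classify(seq)[1]
                  for name, seq in {"GATA_x4": "GATA" * 4, "AATG_x4": "AATG" * 4}.items()}
    for read in ("GATAGATAGATAGATA", "AGTCCATGGCATTCAG"):
        label, coeffs = classify(read)
        best = match_reference(coeffs, references) if label == "repeat" else None
        print(read, label, best)
```

In this sketch a perfectly periodic read concentrates its energy in a few packet coefficients, so its normalized maximum is high, while an irregular read spreads energy across many coefficients. The sketch is not shift-invariant (a read that starts at a different phase of the motif yields different coefficients), and a homopolymer read has zero energy after mean removal and falls into the non-repeat bin; a practical implementation would need to handle both cases, as well as repeats flanked by non-repeat sequence.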
Methods for sequencing · CPC title
Methods for determination or identification of nucleic acids involving differential detection · CPC title
Physics · mapped topic
Mathematical modelling, e.g. logarithm, ratio · CPC title