Methods of lowering the error rate of massively parallel DNA sequencing using duplex consensus sequencing

US12258629B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12258629-B2
Application number: US-201916503398-A
Country: US
Kind code: B2
Filing date: Jul 3, 2019
Priority date: Mar 20, 2012
Publication date: Mar 25, 2025
Grant date: Mar 25, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Next Generation DNA sequencing promises to revolutionize clinical medicine and basic research. However, while this technology has the capacity to generate hundreds of billions of nucleotides of DNA sequence in a single experiment, the error rate of approximately 1% results in hundreds of millions of sequencing mistakes. These scattered errors can be tolerated in some applications but become extremely problematic when “deep sequencing” genetically heterogeneous mixtures, such as tumors or mixed microbial populations. To overcome limitations in sequencing accuracy, a method Duplex Consensus Sequencing (DCS) is provided. This approach greatly reduces errors by independently tagging and sequencing each of the two strands of a DNA duplex. As the two strands are complementary, true mutations are found at the same position in both strands. In contrast, PCR or sequencing errors will result in errors in only one strand. This method uniquely capitalizes on the redundant information stored in double-stranded DNA, thus overcoming technical limitations of prior methods utilizing data from only one of the two strands.
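
The strand-comparison idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the use of "N" for unresolved positions, and the assumption that the second strand read arrives in its own 5'->3' orientation are all illustrative choices, not details taken from the patent.

```python
# Illustrative sketch only; names and conventions here are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def duplex_consensus(first_strand_read: str, second_strand_read: str) -> str:
    """Keep a base only where both strand reads agree; otherwise emit 'N'.

    Assumption: second_strand_read is given 5'->3' on the complementary
    strand, so it is reverse-complemented before the comparison.
    """
    second_as_first = reverse_complement(second_strand_read)
    if len(first_strand_read) != len(second_as_first):
        raise ValueError("reads must cover the same positions")
    return "".join(
        a if a == b else "N"
        for a, b in zip(first_strand_read, second_as_first)
    )


# A change seen on one strand only is masked as unresolved.
print(duplex_consensus("ACTGTA", "TACCGT"))
# A change seen on both strands (complementary bases) is kept.
print(duplex_consensus("ACTGTA", "TACAGT"))
```

In this toy model a substitution present in only one read is treated as a likely PCR or sequencing error, while a substitution supported by the complementary base on the other strand survives into the consensus.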

First claim

Opening claim text (preview).

What is claimed is:

1. A method of sequencing DNA comprising: preparing a sequencing library from a sample comprising a plurality of double-stranded DNA fragments, wherein preparing the sequence library comprises ligating adapter molecules to the plurality of double-stranded DNA fragments to generate a plurality of double-stranded adapter-DNA molecules; sequencing first and second strands of the adapter-DNA molecules to provide a first strand sequence read derived from an original first strand and a distinct, yet related second strand sequence read derived from an original second strand for individual adapter-DNA molecules in the plurality; for at least some of the adapter-DNA molecules: comparing the first strand sequence read with the distinct, yet related second strand sequence read; and generating a consensus sequence of the double-stranded DNA fragment wherein each position in the consensus sequence is identified as correct if the particular position in the first strand sequence read and the particular position in the distinct, yet related second strand sequence read is in agreement.

2. The method of claim 1, wherein nucleotide bases that are in agreement between the first strand sequence read and the distinct, yet related second strand sequence read are not present in a reference sequence, and wherein said nucleotide bases are identified as a true variant.

3. The method of claim 1, wherein a nucleotide base that is not in agreement between the first strand sequence read and the distinct, yet related second strand sequence read is identified as a potential artifact.

4. The method of claim 1, wherein prior to sequencing, the method further comprises amplifying each original strand of the adapter-DNA molecules, resulting in a set of copies of the original first strand of each adapter-DNA molecule and a set of copies of the original second strand of each adapter-DNA molecule, and wherein the sequencing step further comprises sequencing the set of copies of the original first strand and the set of copies of the original second strand to generate a set of first strand sequence reads and a set of distinct, yet related second strand sequence reads.

5. The method of claim 4, wherein a nucleotide base is not in agreement between the first strand sequence read and the distinct, yet related second strand sequence read, and wherein the method further comprises identifying the internal consistency of base calls among the set of sequence reads derived from each single original strand, and wherein a non-consistent nucleotide base is characterized as (i) a processing error if the base call is not consistent among the set of sequence reads derived from the single original strand, and the non-consistent nucleotide base is characterized as (ii) a site of DNA damage if the base call is consistent among the set of sequence reads derived from the single original strand.

6. The method of claim 1, wherein prior to comparing the first strand sequence read and the second strand sequence read, the method comprises associating the first strand sequence read with the second strand sequence read using one or more of an adapter sequence, sequence read length, and original strand information.

7. The method of claim 1, further comprising identifying a mutation occurring at a particular position or region in the consensus sequence as a true mutation.

8. The method of claim 1, wherein the sample is or comprises double-stranded DNA fragments from a cancer tissue, healthy tissue, or a combination thereof.

9. The method of claim 1, wherein the plurality of double-stranded adapter-DNA molecules comprise a nucleotide sequence asymmetry such that the original second strand generates the distinct, yet related second strand sequence read when compared to the first strand sequence read.

10. The method of claim 1, wherein the adapter molecules further comprise a nucleotide sequence that differentiates each original strand relative to its original complementary strand.

11. The method of claim 1, further comprising comparing the consensus sequence with a reference sequence and identifying a variant occurring at a particular position in the consensus sequence.

12. The method of claim 11, further comprising identifying the double-stranded DNA fragments as being derived from a neoplastic cell by the variant.

13. The method of claim 1, further comprising aligning the consensus sequence with a reference sequence and identifying a nucleotide sequence occurring at a particular position or region in the consensus sequence as a true nucleotide sequence.

14. The method of claim 1, wherein the plurality of double-stranded DNA fragments are circulating nucleic acid molecules.

15. The method of claim 14, wherein the circulating nucleic acid molecules comprise nucleic acid-based biomarkers from serum or plasma.

16. The method of claim 1, wherein the sample is a biological sample from a subject, and wherein the method further comprises detecting a cancer, a cancer risk, a cancer metabolic state, or a mutator phenotype in the subject by detecting a cancer-associated nucleic acid-based serum biomarker in the subject.

17. The method of claim 1, wherein the sample comprises a blood sample or a tissue sample from a subject.

18. A method of sequencing DNA comprising: ligating adapter molecules comprising a region of non-complementarity to a plurality of target double-stranded DNA fragments to generate a plurality of double-stranded adapter-DNA molecules; amplifying an original first strand and an original second strand of at least some of the double-stranded adapter-DNA molecules, resulting in a set of copies of the original first strand and a set of copies of the original second strand; for the at least some of the adapter-DNA molecules: sequencing the set of copies of the original first strand and the set of copies of the original second strand to generate a set of first strand sequence reads and a set of second strand sequence reads that are distinguishable from the set of first strand sequence reads by the region of non-complementarity; and confirming the presence of at least one sequence read derived from each of the original first strand and the original second strand of the adapter-DNA molecule; comparing the first strand sequence read with the second strand sequence read; and generating a consensus sequence of the target double-stranded DNA fragment wherein each position in the consensus sequence is identified as correct if the particular position in the first strand sequence read and the second strand sequence read is in agreement.

19. The method of claim 18, wherein prior to amplifying an original first strand and an original second strand of at least some of the double-stranded adapter-DNA molecules, the method further comprises selectively enriching the double-stranded adapter-DNA molecules to generate a subset of double-stranded adapter-DNA molecules for sequencing.

20. The method of claim 19, wherein selectively enriching the adapter-DNA molecules comprises selecting the adapter-DNA molecules based on a strand size.

21. The method of claim 19, wherein selectively enriching the adapter-DNA molecules occurs via primer extension capture or target DNA amplification.

22. The method of claim 19, wherein selectively enriching the adapter-DNA molecules comprises amplification of select adapter-DNA molecules with one or more primers specific to each genetic locus in a set of genetic loci among a larger set o
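
The distinction drawn in claims 4 and 5, between a processing error and a site of DNA damage, can be sketched as follows. This is a hedged illustration rather than the claimed procedure: the function names, the strict-unanimity test for "consistent", and the assumption that all reads in both families have already been oriented to the same strand and cover the same positions are introduced here for the example only.

```python
# Illustrative sketch only; the unanimity rule and names are assumptions.
from typing import List, Set


def calls_at(family: List[str], position: int) -> Set[str]:
    """Collect the distinct base calls a read family makes at one position."""
    return {read[position] for read in family}


def classify_position(
    first_family: List[str], second_family: List[str], position: int
) -> str:
    """Classify one position from the two per-strand read families.

    Assumption: every read is already expressed in first-strand
    orientation, so bases can be compared directly.
    """
    first_calls = calls_at(first_family, position)
    second_calls = calls_at(second_family, position)
    internally_consistent = len(first_calls) == 1 and len(second_calls) == 1

    if internally_consistent and first_calls == second_calls:
        return "agreement"
    if not internally_consistent:
        # Copies of one original strand disagree among themselves.
        return "processing error"
    # Each strand family is unanimous, but the two strands differ.
    return "possible DNA damage"


first = ["ACGT", "ACGT", "ACGT"]
print(classify_position(first, ["ACGT", "ACTT", "ACGT"], 2))
print(classify_position(first, ["ACTT", "ACTT", "ACTT"], 2))
```

A production pipeline would more likely use a majority threshold and base-quality filtering instead of strict unanimity; the strict rule is used here only to keep the two cases easy to tell apart.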

Assignees

Inventors

Classifications

  • C12Q1/6869 (Primary)

    Methods for sequencing · CPC title

  • Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay (C12Q1/6804 takes precedence) · CPC title

  • C12Q1/6876 (Primary)

    Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes · CPC title

  • characterised by the use of the arrayed oligonucleotides as identifier tags, e.g. universal addressable array, anti-tag or tag complement array · CPC title

  • the label being a nucleic acid · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12258629B2 cover?
Next Generation DNA sequencing promises to revolutionize clinical medicine and basic research. However, while this technology has the capacity to generate hundreds of billions of nucleotides of DNA sequence in a single experiment, the error rate of approximately 1% results in hundreds of millions of sequencing mistakes. These scattered errors can be tolerated in some applications but become ext…
Who is the assignee on this patent?
Univ Washington Through Its Center For Commercialization
What technology area does this patent fall under?
Primary CPC classification C12Q1/6869. Mapped technology areas include Chemistry & Metallurgy.
When was this patent published?
Publication date: Mar 25, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).