Methods of lowering the error rate of massively parallel DNA sequencing using duplex consensus sequencing
US-9752188-B2 · Sep 5, 2017 · US
US12006545B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12006545-B2 |
| Application number | US-202117392180-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 2, 2021 |
| Priority date | Mar 20, 2012 |
| Publication date | Jun 11, 2024 |
| Grant date | Jun 11, 2024 |
Official abstract text for this publication.
Next Generation DNA sequencing promises to revolutionize clinical medicine and basic research. However, while this technology has the capacity to generate hundreds of billions of nucleotides of DNA sequence in a single experiment, the error rate of approximately 1% results in hundreds of millions of sequencing mistakes. These scattered errors can be tolerated in some applications but become extremely problematic when “deep sequencing” genetically heterogeneous mixtures, such as tumors or mixed microbial populations. To overcome limitations in sequencing accuracy, a method termed Duplex Consensus Sequencing (DCS) is provided. This approach greatly reduces errors by independently tagging and sequencing each of the two strands of a DNA duplex. As the two strands are complementary, true mutations are found at the same position in both strands. In contrast, PCR or sequencing errors will result in errors in only one strand. This method uniquely capitalizes on the redundant information stored in double-stranded DNA, thus overcoming technical limitations of prior methods utilizing data from only one of the two strands.
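The complementarity rule in the abstract can be illustrated with a minimal sketch. It assumes each strand has already been reduced to a single sequence of equal length, read 5' to 3'. The function names and the use of `N` for unconfirmed positions are illustrative assumptions, not details taken from the patent.

```python
# Illustrative sketch only: names and the "N" masking convention are assumptions.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_consensus(strand1: str, strand2: str) -> str:
    """Keep a base only where both strands of the duplex agree.

    strand1 and strand2 are the two strands of one duplex, each written
    5'->3'. strand2 is converted to strand1's orientation before comparing.
    A change seen on both strands is kept; a change seen on only one strand
    (a likely PCR or sequencing error) is masked with "N".
    """
    strand2_oriented = reverse_complement(strand2)
    if len(strand1) != len(strand2_oriented):
        raise ValueError("strand sequences must cover the same positions")
    return "".join(
        a if a == b else "N" for a, b in zip(strand1, strand2_oriented)
    )
```

For example, if one strand reads `ACGATAGC` while the other still supports `ACGTTAGC`, the fourth base is masked. If both strands carry the change, it is kept as a candidate true mutation.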
Opening claim text (preview).
What is claimed is:
1. A method for sequencing a double-stranded nucleic acid molecule, comprising:
(a) preparing a sequencing library, wherein preparing a sequencing library comprises: providing a set of hairpin adapters having a double-stranded region and a linker region; ligating the hairpin adapters to a plurality of double-stranded nucleic acid molecules to generate a sequencing library comprising a plurality of adapter-nucleic acid molecule complexes; and transitioning the adapter-nucleic acid molecule complexes from a double-stranded form to a linear single-stranded form, wherein each linear single-stranded adapter-nucleic acid molecule complex comprises at a first strand of a double-stranded nucleic acid molecule and a second strand of the same double-stranded nucleic acid molecule, separated by an adapter sequence;
(b) sequencing at least a portion of the linear single-stranded adapter-nucleic acid molecule complexes to obtain a plurality of sequence reads, wherein sequencing comprises cluster amplifying the portion of the linear single-stranded adapter-nucleic acid molecule complexes on a sequencing substrate;
(c) grouping the plurality of sequence reads into a plurality of families based at least in part by cluster on the sequencing substrate; and
(d) comparing sequence reads within a family to generate a consensus sequence for that family.
2. The method of claim 1, wherein ligating the hairpin adapters to a plurality of double-stranded nucleic acid molecules comprises ligating hairpin adapters to both ends of the double-stranded nucleic acid molecules, and wherein at least one of the hairpin adapters on at least a portion of the adapter-nucleic acid molecule complexes comprises a cleavage site.
3. The method of claim 2, wherein the cleavage site is an endonuclease target sequence, and wherein transitioning the adapter-nucleic acid molecule complexes from a double-stranded form to a linear single-stranded form comprises cleaving the endonuclease target sequence with an endonuclease.
4. The method of claim 1, wherein preparing the sequencing library further comprises providing a set of adapters comprising a Y-shape, and wherein the ligating step further comprises ligating a mix of adapters comprising the Y-shape and the hairpin adapters to the plurality of double-stranded nucleic acid molecules to generate the sequencing library.
5. The method of claim 1, wherein at least a portion of the adapter-nucleic acid molecule complexes comprises the nucleic acid molecules having an adapter comprising a Y-shape on a first end and a hairpin adapter on a second end.
6. The method of claim 1, wherein one or more nucleotides in the adapter sequence is an RNA nucleotide, a uracil nucleotide, a modified nucleotide or a non-natural nucleotide.
7. The method of claim 6, the RNA nucleotide, the uracil nucleotide, the modified nucleotide or the non-natural nucleotide provides a cleavage site, and wherein the method further comprises cleaving the adapter sequence at the cleavage site using a nucleotide-specific nuclease.
8. The method of claim 1, further comprising amplifying the linear single-stranded adapter-nucleic acid molecule complexes prior to sequencing.
9. The method of claim 1, further comprising modifying the plurality of double-stranded nucleic acid molecules by performing an end-repairing procedure.
10. The method of claim 9, further comprising performing an A-tailing or T-tailing procedure prior to ligating the hairpin adapters to the double-stranded nucleic acid molecules.
11. The method of claim 9, further comprising generating a ligatable end on the double-stranded nucleic acid molecules, wherein the ligatable end comprises a T-overhang, an A-overhang, a CG overhang, a blunt end or a single-stranded sequence complementary to an adapter ligation domain.
12. The method of claim 1, wherein the hairpin adapters further comprise one or more amplification primer binding sites.
13. The method of claim 1, wherein the hairpin adapters further comprise one or more sequencing primer binding sites.
14. The method of claim 1, wherein the consensus sequence comprises a sequence of nucleotide bases, and wherein each nucleotide base is identified at a given position in the consensus sequence when a specific nucleotide is complementary between at least one sequence read of the first strand and at least one sequence read of the second strand.
15. The method of claim 14, wherein generating a consensus sequence for each of the families further comprises identifying nucleotide positions where the compared sequence read of the first strand and the sequence read of the second strand are non-complementary and scoring the identified non-complementary nucleotide positions as potential artifacts.
16. The method of claim 1, further comprising loading at least a portion of the sequencing library into a sequencing flow cell and generating a plurality of sequencing clusters on the flow cell, wherein each of the sequencing clusters comprises the first and second strands of an original double-stranded nucleic acid molecule.
17. The method of claim 1, wherein at least a portion of the hairpin adapters comprise a single molecule identifier (SMI) sequence, and wherein step (c) comprises grouping the plurality of sequence reads into a plurality of families based at least in part on the SMI sequences.
18. The method of claim 1, wherein prior to step (b), the method further comprises generating amplicons of the adapter-nucleic acid molecule complexes.
19. The method of claim 1, wherein at least a portion of the hairpin adapters comprise a single molecule identifier (SMI) sequence, and wherein step (c) comprises grouping the plurality of sequence reads into a plurality of families based at least in part on the SMI sequences, and wherein one or more of the families comprise sequence reads from multiple clusters.
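The data-handling steps recited in claims 1(c), 1(d), 14, 15 and 17 can be sketched roughly as follows. This is a hedged illustration, not the patented implementation. It assumes reads arrive as `(smi, sequence)` pairs, that reads in a family have equal length, that the linker sequence occurs only once in a read, and that a simple majority vote builds the family consensus. All function names, the `min_fraction` threshold, and the linker used in the usage note are hypothetical.

```python
# Illustrative sketch only: every name and threshold here is an assumption.
from collections import Counter, defaultdict

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def split_hairpin_read(read, linker):
    """Split a read of a linearized hairpin complex at the adapter sequence.

    Returns (first_strand, second_strand). Assumes the linker occurs
    exactly once, between the two strand sequences.
    """
    first, sep, second = read.partition(linker)
    if not sep:
        raise ValueError("linker sequence not found in read")
    return first, second


def group_by_smi(tagged_reads):
    """Group (smi, sequence) pairs into families keyed by SMI (claim 17)."""
    families = defaultdict(list)
    for smi, sequence in tagged_reads:
        families[smi].append(sequence)
    return dict(families)


def family_consensus(reads, min_fraction=0.6):
    """Per-position majority vote over equal-length reads (claim 1, step d).

    A position is called "N" when the most common base falls below
    min_fraction of the reads. The threshold is a hypothetical choice.
    """
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(reads) >= min_fraction else "N")
    return "".join(consensus)


def non_complementary_positions(first_strand, second_strand):
    """List positions where the two strands are not complementary (claim 15)."""
    expected = reverse_complement(second_strand)
    return [i for i, (a, b) in enumerate(zip(first_strand, expected)) if a != b]
```

As a usage example, `split_hairpin_read("ACGTTAGC" + "TTGGCCAATTGG" + "GCTAACGT", "TTGGCCAATTGG")` separates the two strand sequences, and `non_complementary_positions("ACGATAGC", "GCTAACGT")` reports position 3 as a potential artifact.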
Methods for sequencing · CPC title
Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay (C12Q1/6804 takes precedence) · CPC title
Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes · CPC title
incorporating arbitrary or random nucleotide sequences · CPC title
incorporating bases where the precise position of the bases in the nucleic acid string is important · CPC title