Error suppression in sequenced DNA fragments using redundant reads with unique molecular indices (UMIs)
US-2016319345-A1 · Nov 3, 2016 · US
US11708574B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11708574-B2 |
| Application number | US-201715619078-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 9, 2017 |
| Priority date | Jun 10, 2016 |
| Publication date | Jul 25, 2023 |
| Grant date | Jul 25, 2023 |
Official abstract text for this publication.
High-fidelity, high-throughput nucleic acid sequencing enables healthcare practitioners and patients to gain insight into genetic variants and potential health risks. However, previous methods of nucleic acid sequencing often introduce sequencing errors (for example, mutations that arise during the preparation of a nucleic acid library, during amplification, or during sequencing). Provided herein are sequencing adapters comprising a nondegenerate or variable length molecular barcode and compositions comprising a plurality of sequencing adapters, which can be useful for sequencing nucleic acids. Further provided are methods of using the sequencing adapters, including methods of sequencing nucleic acids, methods of identifying an error in a nucleic acid sequence, and methods of determining the number of nucleic acid molecules in a library.
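As a rough illustration of how redundant reads that share a molecular barcode can expose the errors the abstract mentions, the sketch below groups reads by barcode and takes a per-position majority vote. It is a generic majority-vote sketch under stated assumptions, not the method claimed in the patent: reads are assumed to be already trimmed and aligned to equal length, and the names `consensus`, `group_by_barcode`, `family_consensuses`, and `mismatches` are hypothetical helpers introduced here.

```python
from collections import Counter, defaultdict


def consensus(reads):
    """Per-position majority vote over equal-length reads.

    Assumption: reads are pre-aligned and have equal length.
    Positions without a strict majority are reported as 'N'.
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("reads must have equal length in this sketch")
    calls = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        calls.append(base if count * 2 > len(reads) else "N")
    return "".join(calls)


def group_by_barcode(tagged_reads):
    """Collect (barcode, sequence) pairs into read families keyed by barcode."""
    families = defaultdict(list)
    for barcode, seq in tagged_reads:
        families[barcode].append(seq)
    return dict(families)


def family_consensuses(tagged_reads):
    """Build one consensus sequence per barcode family."""
    return {bc: consensus(seqs) for bc, seqs in group_by_barcode(tagged_reads).items()}


def mismatches(read, cons):
    """Positions where a read disagrees with a called (non-'N') consensus base."""
    return [i for i, (a, b) in enumerate(zip(read, cons)) if a != b and b != "N"]


if __name__ == "__main__":
    # Hypothetical toy data: three reads from one family, one read from another.
    tagged = [
        ("ACGTACGT", "TTGACCA"),
        ("ACGTACGT", "TTGACCA"),
        ("ACGTACGT", "TTGTCCA"),  # differs from its family at position 3
        ("GGCATTAC", "CCATGGA"),
    ]
    for barcode, cons in family_consensuses(tagged).items():
        print(barcode, cons)
```

In this toy setup the number of distinct barcode families is only a crude proxy for the number of original molecules; a real pipeline would also have to handle barcode errors, alignment position, and strand.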
Opening claim text (preview).
What is claimed is:
1. A method of sequencing a nucleic acid molecule, comprising:
(i) mixing a plurality of sequencing adaptors comprising at least a first sequence adaptor and a second sequence adaptor with a duplex nucleic acid molecule;
(ii) ligating the first sequence adapter and the second sequence adapter to the duplex nucleic acid molecule prior to amplification of the duplex nucleic acid molecule, wherein the first sequence adapter is a U-shaped sequence adapter or a Y-shaped sequence adapter, and wherein the first sequence adapter comprises a first duplex molecular barcode consisting of n base positions and a predetermined base fraction at one or more base positions across a plurality of duplex molecular barcodes, wherein n is between 8 and 16, wherein the second sequence adapter is a U-shaped sequence adapter or a Y-shaped sequence adapter, and wherein the second sequence adapter comprises a second duplex molecular barcode consisting of n+x base positions and a predetermined base fraction at one or more base positions across the plurality of duplex molecular barcodes, wherein n is between 8 and 16 and x is 1, wherein the second duplex molecular barcode has more base positions than the first duplex molecular barcode, and wherein the sequence adapters in the plurality of sequence adapters each comprise a first constant 3′-overhang comprising a thymine residue directly adjacent to the duplex molecular barcode, and wherein for each duplex molecular barcode other than the first duplex molecular barcode a thymine residue does not immediately precede the first constant 3′-overhang of each sequence adapter;
(iii) amplifying a first strand of the duplex nucleic acid molecule;
(iv) sequencing a set of amplified first strands formed from the first strand of the duplex nucleic acid molecule, resulting in a set of first strand reads; and
(v) constructing a first strand consensus sequence using the set of first strand reads.
2. The method of claim 1, wherein the plurality of sequencing adapters further comprises a third sequencing adapter comprising a third duplex molecular barcode consisting of n+y base positions, wherein n is between 8 and 16 and y is not zero or x, and wherein the third duplex molecular barcode has more base positions than the first duplex molecular barcode.
3. The method of claim 1, further comprising compiling the set of first strand reads.
4. The method of claim 3, wherein the set of first strand reads is compiled based on sequence distance or alignment to a reference sequence.
5. The method of claim 1, wherein constructing the first strand consensus sequence comprises: comparing the first strand reads in the set of first strand reads; identifying and removing errors in the set of first strand reads; and constructing an error-corrected first-strand consensus sequence.
6. The method of claim 1, wherein constructing the first strand consensus sequence comprises: constructing a first strand consensus sequence from the set of first strand reads; comparing the first strand consensus sequence to the set of first strand reads; identifying and removing errors in the set of first strand reads; and constructing an error-corrected first strand consensus sequence.
7. The method of claim 1, wherein the first strand is sequenced in a first direction and a second direction.
8. The method of claim 1, further comprising: amplifying a second strand of the duplex nucleic acid molecule in the amplification step to form amplified second strands; sequencing a set of amplified second strands formed from a second strand of the duplex nucleic acid molecule, resulting in a set of second strand reads; and constructing a second strand consensus sequence using the set of second strand reads.
9. The method of claim 8, wherein the second strand is sequenced in a first direction and a second direction.
10. The method of claim 8, further comprising compiling the set of second strand reads.
11. The method of claim 8, further comprising compiling the set of second strand reads based on sequence distance or alignment to a reference sequence.
12. The method of claim 8, further comprising: comparing the first strand consensus sequence and the second strand consensus sequence; identifying and removing errors in the set of first strand reads and the set of second strand reads; and constructing an error-corrected duplex consensus sequence.
13. The method of claim 1, wherein the duplex nucleic acid molecule comprises a second constant 3′-overhang complementary to the first constant 3′-overhang.
14. The method of claim 1, wherein the ratio of A/C to G/T nucleotides at any given position of each duplex molecular barcode is between about 2:1 and about 1:2 at the corresponding position relative to the length of the shortest duplex molecular barcode in the plurality of sequencing adapters.
15. The method of claim 1, wherein the predetermined base fraction comprises between 0.2 and 0.4 for each of adenine, cytosine, thymine, and guanine.
16. The method of claim 1, wherein each duplex molecular barcode comprises an edit distance of 2 or more from another duplex molecular barcode and wherein said edit distance is the minimum number of single-base changes that two or more sequences must undergo to result in identity between the sequences.
17. The method of claim 1, wherein each sequencing adapter comprises a sample index nucleic acid sequence, and where: the proportion of adenine within the sample index nucleic acid sequence is between 0.2 and 0.4; the proportion of cytosine within the sample index nucleic acid sequence is between 0.2 and 0.4; the proportion of thymidine within the sample index nucleic acid sequence is between 0.2 and 0.4; and the proportion of guanine within the sample index nucleic acid sequence is between 0.2 and 0.4.
18. The method of claim 1, wherein the duplex nucleic acid molecule is a cell-free DNA molecule.
19. The method according to claim 1, wherein the duplex nucleic acid molecule is enriched from a nucleic acid library using a set of capture probes for a region of interest, and wherein the set of capture probes is balanced to provide a reduced sequencing depth variance relative to a set of capture probes comprising a plurality of approximately equally represented capture probes.
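The barcode-set properties recited in claims 15 and 16 (a per-base fraction between 0.2 and 0.4, and an edit distance of 2 or more between barcodes) can be checked with a few lines of code. The sketch below is an illustration only, under two assumptions that the claims do not settle: "single-base changes" is taken to include insertions and deletions (Levenshtein distance), which is one way to compare barcodes of lengths n and n+1, and base fractions are computed per position over the length of the shortest barcode. All function names and the example barcodes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def levenshtein(a, b):
    """Edit distance counting substitutions, insertions, and deletions.

    Assumption: insertions and deletions count as single-base changes,
    so barcodes of different lengths can be compared.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def min_pairwise_distance(barcodes):
    """Smallest edit distance between any two barcodes in the set."""
    return min(levenshtein(a, b) for a, b in combinations(barcodes, 2))


def position_fractions(barcodes):
    """Fraction of A, C, G, T at each position, up to the shortest barcode length."""
    shortest = min(len(bc) for bc in barcodes)
    fractions = []
    for i in range(shortest):
        counts = Counter(bc[i] for bc in barcodes)
        fractions.append({base: counts[base] / len(barcodes) for base in "ACGT"})
    return fractions


def fractions_in_range(barcodes, low=0.2, high=0.4):
    """True if every base fraction at every checked position lies in [low, high]."""
    return all(low <= frac <= high
               for position in position_fractions(barcodes)
               for frac in position.values())


if __name__ == "__main__":
    # Hypothetical set: three 8-base barcodes and one 9-base barcode (n = 8, x = 1).
    barcodes = ["ACGTACGT", "CGTACGTA", "GTACGTAC", "TACGTACGC"]
    print("min edit distance:", min_pairwise_distance(barcodes))
    print("fractions within 0.2-0.4:", fractions_in_range(barcodes))
```

The toy set is built from cyclic shifts so that each base appears exactly once per position; it ignores the other limitations of claim 1, such as the thymine rule at the 3′-overhang.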
General methods of preparing gene libraries, not provided for in other subgroups · CPC title
DNA-directed DNA polymerase (2.7.7.7), i.e. DNA replicase · CPC title
Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay (C12Q1/6804 takes precedence) · CPC title
Enhancement of hybridisation reaction · CPC title
Methods for sequencing · CPC title