Universal short adapters with variable length non-random unique molecular identifiers
US-2019085384-A1 · Mar 21, 2019 · US
US10844428B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10844428-B2 |
| Application number | US-201615130668-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 15, 2016 |
| Priority date | Apr 28, 2015 |
| Publication date | Nov 24, 2020 |
| Grant date | Nov 24, 2020 |
Official abstract text for this publication.
The disclosed embodiments concern methods, apparatus, systems and computer program products for determining sequences of interest using unique molecular index (UMI) sequences that are uniquely associable with individual polynucleotide fragments, including sequences with low allele frequencies and long sequence length. In some implementations, the UMIs include both physical UMIs and virtual UMIs. In some implementations, the unique molecular index sequences include non-random sequences. System, apparatus, and computer program products are also provided for determining a sequence of interest implementing the methods disclosed.
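As an illustration of how physical and virtual UMIs can work together, the following is a minimal sketch, not the patented implementation. It assumes each read is a plain string laid out as a physical UMI followed by the fragment sequence, that the virtual UMI is the first few bases of the fragment, and that a simple per-position majority vote stands in for consensus calling. The function names `group_reads` and `consensus` and the default lengths are hypothetical.

```python
from collections import Counter, defaultdict


def group_reads(reads, physical_len=4, virtual_len=6):
    """Group reads by (physical UMI, virtual UMI).

    Assumed layout (hypothetical): physical UMI + fragment sequence,
    with the virtual UMI taken from the start of the fragment.
    """
    groups = defaultdict(list)
    for read in reads:
        physical = read[:physical_len]
        fragment = read[physical_len:]
        virtual = fragment[:virtual_len]
        groups[(physical, virtual)].append(fragment)
    return dict(groups)


def consensus(fragments):
    """Per-position majority vote over the shared prefix length."""
    length = min(len(f) for f in fragments)
    return "".join(
        Counter(f[i] for f in fragments).most_common(1)[0][0]
        for i in range(length)
    )


reads = [
    "ACGT" + "TTGACCAGGA",
    "ACGT" + "TTGACCAGGA",
    "ACGT" + "TTGACCAGCA",  # one base differs, e.g. an amplification error
    "GGCA" + "TTGACCAGGA",  # same virtual UMI, different physical UMI
]
groups = group_reads(reads)
consensus_by_group = {key: consensus(frags) for key, frags in groups.items()}
```

In this toy input, the three reads that share a physical and virtual UMI collapse to one consensus sequence, and the fourth read forms its own group. A mismatch inside the virtual UMI itself would split a group under this exact-match scheme, which a real pipeline would need to handle.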
Opening claim text (preview).
What is claimed is:
1. A method for sequencing nucleic acid molecules from a sample using unique molecular indices (UMIs), wherein each unique molecular index (UMI) is an oligonucleotide sequence that can be used to identify an individual molecule of a double-stranded DNA fragment in the sample, comprising (a) applying adapters to both ends of a plurality of double-stranded DNA fragments in the sample to obtain DNA-adapter products, wherein each adapter comprises a double-stranded hybridized region, a single-stranded 5′ arm, a single-stranded 3′ arm, and a physical UMI on one strand or each strand of the adapter, the physical UMI being selected from a plurality of physical UMIs, each double-stranded DNA fragment in the sample comprises a virtual UMI on one strand or each strand of the double-stranded DNA fragment, the virtual UMI is a sequence of nucleotides shorter than the double-stranded DNA fragment, the position of the virtual UMI is defined at or with respect to an end of the double-stranded DNA fragment, and the plurality of double-stranded DNA fragments is not obtained by restriction endonuclease digestion; (b) amplifying both strands of the DNA-adapter products to obtain a plurality of amplified polynucleotides; (c) sequencing, using a nucleic acid sequencer, the plurality of amplified polynucleotides, thereby obtaining a plurality of reads each comprising a physical UMI corresponding to a physical UMI on an adapter and a virtual UMI corresponding to a virtual UMI on a double-stranded DNA fragment in the sample; (d) identifying a plurality of physical UMI sequences for the plurality of reads; (e) identifying a plurality of virtual UMI sequences for the plurality of reads; and (f) determining sequences of the plurality of double-stranded DNA fragments in the sample by: (i) grouping the plurality of reads based at least on the plurality of virtual UMI sequences to obtain a plurality of groups of reads, (ii) determining a plurality of consensus nucleotide sequences using the plurality of groups of reads, and (iii) determining the sequences of the plurality of double-stranded DNA fragments using the plurality of consensus nucleotide sequences.
2. The method of claim 1, wherein (f)(i) comprises: grouping the plurality of reads based at least on the plurality of virtual UMI sequences and the plurality of physical UMI sequences in the reads to obtain the plurality of groups of reads, each group having a unique combination of a virtual UMI sequence and a physical UMI sequence.
3. The method of claim 1, wherein the plurality of physical UMIs comprises random UMIs.
4. The method of claim 1, wherein the plurality of physical UMIs comprises nonrandom UMIs.
5. The method of claim 4, wherein every nonrandom UMI differs from every other nonrandom UMI of the adapters by at least two nucleotides at corresponding sequence positions of the nonrandom UMIs.
6. The method of claim 5, wherein the plurality of physical UMIs includes no more than about 10,000 unique nonrandom UMIs.
7. The method of claim 6, wherein the plurality of physical UMIs includes no more than about 1,000 unique nonrandom UMIs.
8. The method of claim 7, wherein the plurality of physical UMIs includes no more than about 500 unique nonrandom UMIs.
9. The method of claim 8, wherein the plurality of physical UMIs includes no more than about 100 unique nonrandom UMIs.
10. The method of claim 9, wherein the plurality of physical UMIs includes about 96 unique nonrandom UMIs.
11. The method of claim 1, wherein applying adapters to both ends of double-stranded DNA fragments comprises ligating the adapters to both ends of the double-stranded DNA fragments.
12. The method of claim 1, wherein the plurality of physical UMIs includes fewer than 12 nucleotides.
13. The method of claim 12, wherein the plurality of physical UMIs includes no more than 6 nucleotides.
14. The method of claim 12, wherein the plurality of physical UMIs includes no more than 4 nucleotides.
15. The method of claim 1, wherein the adapters each comprise a physical UMI on each strand of the adapters in the double-stranded hybridized region.
16. The method of claim 15, wherein the physical UMI is at or near an end of the double-stranded hybridized region, said end of the double-stranded hybridized region being opposite from the 3′ arm or the 5′ arm.
17. The method of claim 16, wherein the physical UMI is at said end of the double-stranded hybridized region, or is one nucleotide away from said end of the double-stranded hybridized region.
18. The method of claim 17, wherein the adapters each comprise a 5′-TGG-3′ trinucleotide or a 3′-ACC-5′ trinucleotide on the double-stranded hybridized region adjacent to a physical UMI.
19. The method of claim 18, wherein the adapters each comprise a read primer sequence on each strand of the double-stranded hybridized region.
20. The method of claim 1, wherein the adapters each comprise a physical UMI on only one strand of the adapters on the single-stranded 5′ arm or the single-stranded 3′ arm.
21. The method of claim 20, wherein (f) comprises: (i) collapsing reads having a same first physical UMI sequence into a first group to obtain a first consensus nucleotide sequence; (ii) collapsing reads having a same second physical UMI sequence into a second group to obtain a second consensus nucleotide sequence; and (iii) determining, using the first and second consensus nucleotide sequences, a sequence of one of the double-stranded DNA fragments in the sample.
22. The method of claim 21, wherein (iii) comprises: (1) obtaining, using localization information and sequence information of the first and second consensus nucleotide sequences, a third consensus nucleotide sequence, and (2) determining, using the third consensus nucleotide sequence, the sequence of one of the double-stranded DNA fragments.
23. The method of claim 20, wherein (e) comprises identifying the plurality of virtual UMI sequences, while the adapters each comprise the physical UMI on only the single-stranded 5′ arm or the single-stranded 3′ arm.
24. The method of claim 23, wherein (f) comprises: (i) combining reads having a first physical UMI sequence and at least one virtual UMI sequence in a read direction and reads having a second physical UMI sequence and the at least one virtual UMI sequence in the read direction to determine a consensus nucleotide sequence; and (ii) determining a sequence of one of the double-stranded DNA fragments in the sample using the consensus nucleotide sequence.
25. The method of claim 1, wherein the adapters each comprise a physical UMI on each strand of the adapters in a double-stranded region of the adapters, wherein the physical UMI on one strand is complementary to the physical UMI on the other strand.
26. The method of claim 25, wherein (f) comprises: (i) combining reads having a first physical UMI sequence, at least one virtual UMI sequence, and a second physical UMI sequence in the 5′ to 3′ direction and reads having the second physical UMI sequence, the at least one virtual UMI sequence, and the first physical UMI sequence in the 5′ to 3′ direction to determine a consensus nucleotide sequence; and (ii) determining a sequence of one of the double-stranded DNA fragments in the sample using the consensus nucleotide sequence.
27. The method of claim 1, wherein the adapters eac
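The requirement in claim 5 that every nonrandom UMI differ from every other by at least two nucleotides at corresponding positions can be checked with a pairwise Hamming distance. The sketch below is illustrative only: it assumes all UMIs in the set have equal length, and the helper names `hamming` and `min_pairwise_distance` and the example sequences are hypothetical, not taken from the patent.

```python
from itertools import combinations


def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(umis):
    """Smallest Hamming distance over all pairs in the UMI set."""
    return min(hamming(a, b) for a, b in combinations(umis, 2))


# Hypothetical UMI sets for illustration only.
well_separated = ["AACC", "CCAA", "GGTT", "TTGG"]
too_close = ["AACC", "AAGG", "AACG"]

ok = min_pairwise_distance(well_separated) >= 2   # satisfies the two-nucleotide condition
bad = min_pairwise_distance(too_close) >= 2       # fails: AACC and AACG differ at one position
```

With a minimum distance of two, a single substitution error in a UMI cannot turn it into another valid UMI in the set, so such errors can at least be detected.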
Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags · CPC title
Sequence alignment; Homology search · CPC title
ICT specially adapted for sequence analysis involving nucleotides or amino acids · CPC title
Methods for sequencing · CPC title
Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay (C12Q1/6804 takes precedence) · CPC title