Methods and systems for generation and error-correction of unique molecular index sets with heterogeneous molecular lengths

US10844429B2 · US · B2

Patent metadata

Publication number: US-10844429-B2
Application number: US-201815863737-A
Country: US
Kind code: B2
Filing date: Jan 5, 2018
Priority date: Jan 18, 2017
Publication date: Nov 24, 2020
Grant date: Nov 24, 2020

Abstract

Official abstract text for this publication.

The disclosed embodiments concern methods, apparatus, systems and computer program products for determining sequences of interest using unique molecular index sequences that are uniquely associable with individual polynucleotide fragments, including sequences with low allele frequencies and long sequence length. In some implementations, the unique molecular index sequences include variable-length nonrandom sequences. In some implementations, the unique molecular index sequences are associated with the individual polynucleotide fragments based on alignment scores indicating similarity between the unique molecular index sequences and subsequences of sequence reads obtained from the individual polynucleotide fragments. System, apparatus, and computer program products are also provided for determining a sequence of interest implementing the methods disclosed.
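The association step described in the abstract can be illustrated with a minimal Python sketch. It is not the scoring scheme of the patent: as an assumption, plain Levenshtein edit distance between each candidate index and the equally long prefix of the read stands in for the alignment score (lower is better). The names `edit_distance`, `assign_umi`, and the `max_dist` cutoff are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def assign_umi(read: str, umis: List[str], max_dist: int = 1) -> Optional[str]:
    """Pick the index that best matches the start of the read.

    Assumption: each index is compared with a read prefix of its own
    length, so indices of different lengths can compete for one read.
    Returns None when the best match is too distant or is tied.
    """
    scored = sorted((edit_distance(u, read[:len(u)]), u) for u in umis)
    best_dist, best_umi = scored[0]
    if best_dist > max_dist:
        return None  # nothing is close enough to trust
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None  # ambiguous: two indices match equally well
    return best_umi
```

Called as `assign_umi("TTGCTAGCCCC", ["ACGTAC", "TTGCAAG"])`, the sketch tolerates a single sequencing error in the 7-nucleotide index and still resolves the read to it, while a read whose prefix resembles neither index is left unassigned.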

First claim

Opening claim text (preview).

What is claimed is:

1. A method for sequencing nucleic acid molecules from a sample, comprising
  (a) applying adapters to DNA fragments in the sample to obtain DNA-adapter products, wherein each adapter comprises a nonrandom unique molecular index, wherein nonrandom unique molecular indices of the adapters have at least two different molecular lengths and form a set of variable-length, nonrandom unique molecular indices (vNRUMIs), and wherein the adapters are obtained by:
    (i) providing a set of oligonucleotide sequences having at least two different molecular lengths,
    (ii) selecting a subset of oligonucleotide sequences from the set of oligonucleotide sequences, all edit distances between oligonucleotide sequences of the subset of oligonucleotide sequences meeting a threshold value, the subset of oligonucleotide sequences forming the set of vNRUMIs, and
    (iii) synthesizing the adapters each comprising a double-stranded hybridized region, a single-stranded 5′ arm, a single-stranded 3′ arm, and at least one vNRUMI of the set of vNRUMIs;
  (b) amplifying the DNA-adapter products to obtain a plurality of amplified polynucleotides;
  (c) sequencing the plurality of amplified polynucleotides, thereby obtaining a plurality of reads associated with the set of vNRUMIs;
  (d) identifying, among the plurality of reads, reads associated with a same variable-length, nonrandom unique molecular index (vNRUMI); and
  (e) determining a sequence of a DNA fragment in the sample using the reads associated with the same vNRUMI.

2. The method of claim 1, wherein the threshold value is 3.

3. The method of claim 1, wherein the set of vNRUMIs comprise vNRUMIs of 6 nucleotides and vNRUMIs of 7 nucleotides.

4. The method of claim 1, wherein (e) comprises collapsing reads associated with the same vNRUMI into a group to obtain a consensus nucleotide sequence for the sequence of the DNA fragment in the sample.

5. The method of claim 4, the consensus nucleotide sequence is obtained based partly on quality scores of the reads.

6. The method of claim 1, wherein (e) comprises: identifying, among the reads associated with the same vNRUMI, reads having a same read position or similar read positions in a reference sequence, and determining the sequence of the DNA fragment using reads that (i) are associated with the same vNRUMI and (ii) have the same read position or similar read positions in the reference sequence.

7. The method of claim 1, wherein the set of vNRUMIs includes no more than about 10,000 different vNRUMIs.

8. The method of claim 7, wherein the set of vNRUMIs includes no more than about 1,000 different vNRUMIs.

9. The method of claim 8, wherein the set of vNRUMIs includes no more than about 200 different vNRUMIs.

10. The method of claim 1, applying adapters to the DNA fragments in the sample comprises applying adapters to both ends of the DNA fragments in the sample.
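Step (ii) of claim 1 (choosing a subset whose pairwise edit distances meet a threshold) can be sketched as a simple greedy filter. This is a minimal illustration under stated assumptions, not the selection procedure of the patent: "meeting a threshold value" is read here as "edit distance of at least `min_dist`", with the default of 3 taken from claim 2, and the function names and example candidates are hypothetical.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def select_vnrumis(candidates: Iterable[str], min_dist: int = 3) -> List[str]:
    """Greedily keep each candidate that is at least min_dist edits away
    from every sequence already kept (assumed reading of the threshold)."""
    selected: List[str] = []
    for cand in candidates:
        if all(edit_distance(cand, kept) >= min_dist for kept in selected):
            selected.append(cand)
    return selected


# Hypothetical candidate pool mixing 6- and 7-nucleotide sequences (claim 3).
example_candidates = ["ACGTAC", "ACGTAG", "TTGCAAG", "ACGTACG", "GGATCC"]
example_set = select_vnrumis(example_candidates)
```

Because edit distance counts insertions and deletions as well as substitutions, the filter works across sequences of different lengths: in the example, `ACGTACG` is rejected for being one insertion away from `ACGTAC`. A greedy pass depends on candidate order and does not guarantee the largest possible set; it is used here only because it is the simplest way to show the constraint.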

Assignees

Illumina Inc

Classifications

  • specific length of the oligonucleotides · CPC title

  • incorporating an adaptor · CPC title

  • Massive parallel sequencing · CPC title

  • Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation · CPC title

  • G16B30/10 · Primary

    Sequence alignment; Homology search · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10844429B2 cover?
The disclosed embodiments concern methods, apparatus, systems and computer program products for determining sequences of interest using unique molecular index sequences that are uniquely associable with individual polynucleotide fragments, including sequences with low allele frequencies and long sequence length.
Who is the assignee on this patent?
Illumina Inc
What technology area does this patent fall under?
Primary CPC classification G16B30/10. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 24, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).