What technology area does this patent fall under?

Primary CPC classification C12Q1/6869. Mapped technology areas include Chemistry & Metallurgy.

When was this patent published?

Publication date: Jul 19, 2018 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Methods and systems for generation and error-correction of unique molecular index sets with heterogeneous molecular lengths

US2018201992A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2018201992-A1
Application number	US-201815863737-A
Country	US
Kind code	A1
Filing date	Jan 5, 2018
Priority date	Jan 18, 2017
Publication date	Jul 19, 2018
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed embodiments concern methods, apparatus, systems and computer program products for determining sequences of interest using unique molecular index sequences that are uniquely associable with individual polynucleotide fragments, including sequences with low allele frequencies and long sequence length. In some implementations, the unique molecular index sequences include variable-length nonrandom sequences. In some implementations, the unique molecular index sequences are associated with the individual polynucleotide fragments based on alignment scores indicating similarity between the unique molecular index sequences and subsequences of sequence reads obtained from the individual polynucleotide fragments. System, apparatus, and computer program products are also provided for determining a sequence of interest implementing the methods disclosed.
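The abstract's idea of associating reads with unique molecular indices by alignment score can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the patent's implementation: the function names (`global_score`, `umi_score`, `assign_umi`), the scoring values (+1 per match, -1 per edit), the example index sequences, and the choice to take the best score over prefixes so that bases trailing the index are not penalized.

```python
from __future__ import annotations


def global_score(a: str, b: str, match: int = 1, edit: int = -1) -> int:
    """Global alignment score: +match per matching base, +edit per
    substitution, insertion, or deletion (scoring values are assumed)."""
    prev = [j * edit for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * edit]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else edit)
            curr.append(max(diag, prev[j] + edit, curr[j - 1] + edit))
        prev = curr
    return prev[-1]


def umi_score(subseq: str, umi: str) -> int:
    """Best score over all prefixes of either sequence, so that extra
    bases at the end of one sequence do not lower the score."""
    over_read_prefixes = max(
        global_score(umi, subseq[:k]) for k in range(1, len(subseq) + 1)
    )
    over_umi_prefixes = max(
        global_score(subseq, umi[:k]) for k in range(1, len(umi) + 1)
    )
    return max(over_read_prefixes, over_umi_prefixes)


def assign_umi(read: str, umis: list[str]) -> tuple[str, int]:
    """Score the start of the read against every index and return the
    best-scoring index together with its score."""
    longest = max(len(u) for u in umis)
    subseq = read[:longest]  # region where index-derived bases are expected
    return max(((u, umi_score(subseq, u)) for u in umis), key=lambda p: p[1])


# Hypothetical index set with two different lengths (6 and 7 nucleotides).
umis = ["ACGTAC", "TTGCATG"]
print(assign_umi("ACGTACGGATTC", umis))  # exact match to the 6-mer
print(assign_umi("ACGAACGGATT", umis))   # one substitution in the index
```

In this sketch a read whose index carries a single sequencing error still maps to the intended index, because every other index in the set scores lower. A production pipeline would also need a rule for ties, which this sketch does not address.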

First claim

Opening claim text (preview).

1. A method for sequencing nucleic acid molecules from a sample, comprising (a) applying adapters to DNA fragments in the sample to obtain DNA-adapter products, wherein each adapter comprises a nonrandom unique molecular index, and wherein nonrandom unique molecular indices of the adapters have at least two different molecular lengths and form a set of variable-length, nonrandom unique molecular indices (vNRUMIs); (b) amplifying the DNA-adapter products to obtain a plurality of amplified polynucleotides; (c) sequencing the plurality of amplified polynucleotides, thereby obtaining a plurality of reads associated with the set of vNRUMIs; (d) identifying, among the plurality of reads, reads associated with a same variable-length, nonrandom unique molecular index (vNRUMI); and (e) determining a sequence of a DNA fragment in the sample using the reads associated with the same vNRUMI.
2. The method of claim 1, wherein identifying the reads associated with the same vNRUMI comprises obtaining, for each read of the plurality of reads, alignment scores with respect to the set of vNRUMIs, each alignment score indicating similarity between a subsequence of a read and a vNRUMI, wherein the subsequence is in a region of the read in which nucleotides derived from the vNRUMI are likely located.
3. The method of claim 2, wherein the alignment scores are based on matches of nucleotides and edits of nucleotides between the subsequence of the read and the vNRUMI.
4. The method of claim 3, wherein the edits of nucleotides comprise substitutions, additions, and deletions of nucleotides.
5. The method of claim 3, wherein each alignment score penalizes mismatches at the beginning of a sequence but does not penalize mismatches at the end of the sequence.
6. The method of claim 5, wherein obtaining an alignment score between a read and a vNRUMI comprises: (a) calculating an alignment score between the vNRUMI and each one of all possible prefix sequences of the subsequence of the read; (b) calculating an alignment score between the subsequence of the read and each one of all possible prefix sequences of the vNRUMI; and (c) obtaining a largest alignment score among the alignment scores calculated in (a) and (b) as the alignment score between the read and the vNRUMI.
7. The method of claim 2, wherein the subsequence has a length that equals to a length of the longest vNRUMI in the set of vNRUMIs.
8. The method of claim 2, wherein identifying the reads associated with the same vNRUMI in (d) further comprises: selecting, for each read of the plurality of reads, at least one vNRUMI from the set of vNRUMIs based on the alignment scores; and associating each read of the plurality of reads with the at least one vNRUMI selected for the read.
9. The method of claim 8, wherein selecting the at least one vNRUMI from the set of vNRUMIs comprises selecting a vNRUMI having a highest alignment score among the set of vNRUMIs.
10. The method of claim 8, wherein the at least one vNRUMI comprises two or more vNRUMIs.
11. The method of claim 10, further comprises selecting one of the two or more vNRUMI as the same vNRUMI of (d) and (e).
12. The method of claim 1, wherein the adapters applied in (a) are obtained by: (i) providing a set of oligonucleotide sequences having at least two different molecular lengths; (ii) selecting a subset of oligonucleotide sequences from the set of oligonucleotide sequences, all edit distances between oligonucleotide sequences of the subset of oligonucleotide sequences meeting a threshold value, the subset of oligonucleotide sequences forming the set of vNRUMIs; and (iii) synthesizing the adapters each comprising a double-stranded hybridized region, a single-stranded 5′ arm, a single-stranded 3′ arm, and at least one vNRUMI of the set of vNRUMIs.
13. The method of claim 12, wherein the threshold value is 3.
14. The method of claim 1, wherein the set of vNRUMIs comprise vNRUMIs of 6 nucleotides and vNRUMIs of 7 nucleotides.
15. The method of claim 1, wherein (e) comprises collapsing reads associated with the same vNRUMI into a group to obtain a consensus nucleotide sequence for the sequence of the DNA fragment in the sample.
16. The method of claim 15, the consensus nucleotide sequence is obtained based partly on quality scores of the reads.
17. The method of claim 1, wherein (e) comprises: identifying, among the reads associated with the same vNRUMI, reads having a same read position or similar read positions in a reference sequence, and determining the sequence of the DNA fragment using reads that (i) are associated with the same vNRUMI and (ii) have the same read position or similar read positions in the reference sequence.
18-21. (canceled)
22. A method for preparing sequencing adapters, comprising: (a) providing a set of oligonucleotide sequences having at least two different molecular lengths; (b) selecting a subset of oligonucleotide sequences from the set of oligonucleotide sequences, all edit distances between oligonucleotide sequences of the subset of oligonucleotide sequences meeting a threshold value, the subset of oligonucleotide sequences forming a set of variable-length, nonrandom unique molecular indexes (vNRUMIs); and (c) synthesizing a plurality of sequencing adapters, wherein each sequencing adapter comprises a double-stranded hybridized region, a single-stranded 5′ arm, a single-stranded 3′ arm, and at least one vNRUMI of the set of vNRUMIs.
23-37. (canceled)
38. A method for sequencing nucleic acid molecules from a sample, comprising (a) applying adapters to DNA fragments in the sample to obtain DNA-adapter products, wherein each adapter comprises a unique molecular index (UMI), and wherein unique molecular indices (UMIs) of the adapters have at least two different molecular lengths and form a set of variable-length unique molecular indices (vUMIs); (b) amplifying the DNA-adapter products to obtain a plurality of amplified polynucleotides; (c) sequencing the plurality of amplified polynucleotides, thereby obtaining a plurality of reads associated with the set of vUMIs; and (d) identifying, among the plurality of reads, reads associated with a same variable-length unique molecular index (vUMI).
39. The method of claim 38, further comprising determining a sequence of a DNA fragment in the sample using the reads associated with the same vUMI.
40-46. (canceled)
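The selection step recited in claims 12, 13, and 22 (choosing a subset of variable-length sequences whose pairwise edit distances meet a threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's procedure: "meeting a threshold" is read here as "edit distance of at least the threshold", the selection is a simple greedy pass in input order, and the names `edit_distance` and `select_umis` as well as the candidate sequences are hypothetical.

```python
from __future__ import annotations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of substitutions,
    insertions, and deletions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,               # delete ca
                    curr[j - 1] + 1,           # insert cb
                    prev[j - 1] + (ca != cb),  # match or substitute
                )
            )
        prev = curr
    return prev[-1]


def select_umis(candidates: list[str], min_distance: int = 3) -> list[str]:
    """Greedy pass (an assumed strategy): keep a candidate only if it is
    at least min_distance edits away from every sequence already kept."""
    selected: list[str] = []
    for cand in candidates:
        if all(edit_distance(cand, s) >= min_distance for s in selected):
            selected.append(cand)
    return selected


# Hypothetical candidates mixing 6- and 7-nucleotide sequences.
candidates = ["ACGTAC", "ACGTAG", "TTGCATG", "ACGTACA", "GGATCCA"]
print(select_umis(candidates, min_distance=3))
```

With a minimum distance of 3, any single substitution, insertion, or deletion in an index leaves it closer to its original sequence than to any other member of the set, which is what makes error correction of the index possible. A greedy pass depends on candidate order and does not guarantee the largest possible set.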

Assignees

Illumina Inc

Inventors

Classifications

C12Q1/6869 · Primary
Methods for sequencing · CPC title
G16B30/00
ICT specially adapted for sequence analysis involving nucleotides or amino acids · CPC title
C12Q1/6855 · Primary
Ligating adaptors · CPC title
C12Q2525/204
specific length of the oligonucleotides · CPC title
C12Q2525/191
incorporating an adaptor · CPC title

Patent family

Related publications grouped by family.

View patent family 61054549

External sources
