US2023326064A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2023326064-A1 |
| Application number | US-202218078797-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 9, 2022 |
| Priority date | Aug 31, 2020 |
| Publication date | Oct 12, 2023 |
| Grant date | — |
Official abstract text for this publication.
Image data analysis, particularly identifying cluster locations for performing base-calling in a digital flow cell image during DNA sequencing, is described. Each nucleic acid template molecule immobilized on a support may include an insert sequence and a sample index sequence. The sample index sequence may include a k-mer sequence. A sequencing system may conduct k cycles of sequencing reactions of the k-mer sequence before conducting one or more cycles of the insert sequence sequencing reactions and generate a first plurality of flow cell images. Pixel intensities may be determined for pixels of the first plurality of flow cell images. A base calling template may be determined and include base calling locations based on the pixel intensities and respective color purities of the pixel intensities. The base calling template may register a second plurality of flow cell images of the support in one or more cycles subsequent to the k cycles.
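The abstract's idea of choosing base calling locations from pixel intensities and their color purities can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the publication: purity is defined as the dominant channel's share of the summed channel intensities, candidate locations are 3x3 local maxima, and the names `color_purity`, `base_calling_template`, `min_purity`, and `min_intensity` are hypothetical.

```python
import numpy as np


def color_purity(stack):
    """Per-pixel purity for one cycle; stack has shape (channels, H, W).

    Assumed definition: dominant channel intensity divided by the sum of
    all channel intensities. Dark pixels (sum == 0) get purity 0.
    """
    stack = np.asarray(stack, dtype=float)
    total = stack.sum(axis=0)
    safe_total = np.where(total > 0, total, 1.0)  # avoid division by zero
    return np.where(total > 0, stack.max(axis=0) / safe_total, 0.0)


def base_calling_template(cycles, min_purity=0.8, min_intensity=1.0):
    """Candidate base calling locations from the first k cycles.

    cycles has shape (k, channels, H, W). Returns an (N, 2) array of
    (row, col) positions. Thresholds are hypothetical defaults.
    """
    cycles = np.asarray(cycles, dtype=float)
    # Average purity and dominant-channel brightness over the k cycles.
    purity = np.stack([color_purity(c) for c in cycles]).mean(axis=0)
    brightness = cycles.max(axis=1).mean(axis=0)

    # A pixel is a peak if it is >= every pixel in its 3x3 neighbourhood.
    height, width = brightness.shape
    padded = np.pad(brightness, 1, mode="constant", constant_values=-np.inf)
    neighbours = np.stack(
        [padded[r:r + height, c:c + width] for r in range(3) for c in range(3)]
    )
    is_peak = brightness >= neighbours.max(axis=0)

    mask = is_peak & (purity >= min_purity) & (brightness >= min_intensity)
    return np.argwhere(mask)


# Toy data: k = 2 cycles, 4 color channels, 5x5 pixels.
cycles = np.zeros((2, 4, 5, 5))
cycles[0, 0, 2, 2] = 10.0   # bright, single-color cluster, cycle 1
cycles[1, 3, 2, 2] = 10.0   # same location, different base in cycle 2
cycles[0, 0, 2, 3] = 3.0    # dimmer shoulder of the same cluster
cycles[1, 3, 2, 3] = 3.0
cycles[:, 0, 0, 0] = 5.0    # mixed-color pixel: two channels equally bright
cycles[:, 1, 0, 0] = 5.0

template = base_calling_template(cycles)
```

In this toy example only the pixel at row 2, column 2 survives: the shoulder pixel is not a local maximum and the mixed-color pixel fails the purity threshold. A plateau of equally bright pixels would yield several adjacent candidates, so a real pipeline would need a tie-breaking or centroid step that this sketch omits.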
Opening claim text (preview).
1-101. (canceled)

102. A method comprising: providing a first plurality of library molecules immobilized on a support, wherein each of the first plurality of library molecules comprise: a first insert sequence derived from a first sample source and a first sample index sequence, wherein the first sample index sequence comprises a first k-mer sequence and a first universal sample index sequence, the first universal sample index identifying the first sample source of the first insert sequence; providing a second plurality of library molecules immobilized on the support, wherein each of the second plurality of library molecules comprise: a second insert sequence derived from a second sample source and a second sample index sequence, wherein the second sample index sequence comprises a second k-mer sequence and a second universal sample index sequence, the second universal sample index identifying the second sample source of the second insert sequence; conducting, by a sequencing system, k cycles of sequencing reactions of the first and second k-mer sequences, thereby generating a first plurality of flow cell images; determining, by a processor, for pixels of the first plurality of flow cell images, pixel intensities and a respective color purity of each of the pixel intensities; and determining, by the processor and before conducting one or more cycles of the sequencing reactions of the first or second insert sequence, a base calling template comprising base calling locations based on the pixel intensities and the respective color purity of the pixel intensities, wherein the base calling template is configured for registering a second plurality of flow cell images of the support in one or more cycles subsequent to the k cycles.

103. The method of claim 102, further comprising: pooling the first and second plurality of library molecules; and distributing the pooled library molecules onto the support and conducting an amplification reaction to generate a plurality of nucleic acid template molecules immobilized to the support, wherein the plurality of nucleic acid template molecules are clonally amplified from the first library molecules and the second library molecules.

104. The method of claim 102, wherein one or more of: determining, by a processor, for pixels of the first plurality of flow cell image, pixel intensities and a respective color purity of each of the pixel intensities; and determining, by the processor, a base calling template comprising base calling locations based on the pixel intensities and the respective color purity of the pixel intensities is before conducting any cycles of sequencing reactions of: the first sample index sequence; the first universal sample index sequence; the second sample index sequence; and the second universal sample index sequence.

105. The method of claim 102, wherein conducting the k cycles of sequencing reactions of the k-mer sequence and of a base position of the first universal sample index sequence is based on an order of sequencing of a sequencing run.

106. The method of claim 105, wherein the order of sequencing comprises: sequencing the k-mer sequence; then sequencing the first and second universal sample index sequences; and then sequencing the first and second insert sequences.

107. The method of claim 102, wherein the first or second plurality of flow cell images are from 2, 3, or 4 different color channels.

108. The method of claim 102, wherein the first plurality of flow cell images from k cycles comprises a balanced diversity of nucleotide bases of A, G, C and T/U among the plurality of nucleic acid template molecules immobilized on the support in each of the k cycles.

109. The method of claim 102, wherein the k-mer sequence comprises a random sequence of at least 2 or 3 nucleotide bases of A, G, C and T/U.

110. The method of claim 102, wherein the support is comprised in a flow cell device.

111. The method of claim 103, wherein a density of the nucleic acid template molecules on the support is 10^4-10^12 per mm^2.

112. The method of claim 102, wherein conducting the k cycles of the sequencing reactions of the k-mer sequence comprises: contacting polonies of nucleotide acid template molecules with a plurality of sequencing primers, a plurality of polymerases, and a mixture of different types of avidites, wherein each of the plurality of nucleic acid template molecules immobilized on the support corresponds to a polony.

113. The method of claim 102, wherein conducting k cycles of the sequencing reactions of the k-mer sequence comprises: in each of the k cycles, acquiring, by an optical system, the first plurality of flow cell images comprising optical color signals emitted from the nucleotide reagents that are bound to the template molecules.

114. The method of claim 102, wherein k is an integer that is greater than 0 and less than 10.

115. The method of claim 103, wherein each of the base calling locations corresponds to a location of the plurality of immobilized template molecules.

116. The method of claim 102, wherein the second plurality of flow cell images comprises optical signals emitted from nucleotide reagents bound to a unbalanced diversity of nucleotide bases of A, G, C and T/U among the plurality of nucleic acid template molecules immobilized on the support in the one or more cycles subsequent to the k cycles.

117. The method of claim 116, wherein the unbalanced diversity of nucleotide bases of A, G, C and T/U among the plurality of nucleic acid template molecules comprises: a percentage of (1) a number of at least one type of nucleotide bases to (2) a total number of bases is less than 20%, 15%, 10%, or 5% in the one or more cycles.

118. The method of claim 102, further comprising: registering, by the processor, the second plurality of flow cell images from the one or more subsequent flow cycles to the base calling template; and perform, by the processor, base calling of the second plurality of flow cell images at the base calling locations in the base calling template using signals from the registered second plurality of flow cell images.

119. The method of claim 118, wherein registering the second plurality of flow cell images from the one or more subsequent flow cycles to the base calling template comprises: generating coordinates of polonies in the second plurality of flow cell images in a common coordinate system as the base calling template.

120. A system comprising: one or more hardware processors; one or more data storage devices storing instructions executable by the one or more hardware processors that, when executed, cause the one or more hardware processors to perform operations, the operations comprising: providing a first plurality of library molecules immobilized on a support, wherein each of the first plurality of library molecules comprise: a first insert sequence derived from a first sample source and a first sample index sequence, wherein the first sample index sequence comprises a first k-mer sequence and a first universal sample index sequence, the first universal sample index identifying the first sample source of the first insert sequence; providing a second plurality of library molecules immobilized on the support, wherein each of the second plurality of library molecules comprise: a second insert sequence derived from a second sample source and a second sample index sequence, wherein the second sample index sequence comprises a seco
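Claims 108, 116, and 117 contrast balanced and unbalanced base diversity within a cycle, with claim 117 describing the unbalanced case as at least one base type falling below 20%, 15%, 10%, or 5% of all bases. A minimal sketch of that check follows. The function names `base_fractions` and `is_unbalanced`, the string representation of base calls, and the choice to fold U into T and ignore other symbols are assumptions made for illustration, not details from the claims.

```python
from collections import Counter


def base_fractions(calls):
    """Fraction of A, C, G and T among the base calls of one cycle.

    calls is an iterable of single-letter base calls, one per cluster.
    U is counted as T; any other symbol is ignored (assumed behavior).
    """
    counts = Counter("T" if base == "U" else base for base in calls)
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("no A/C/G/T/U base calls in this cycle")
    return {base: counts[base] / total for base in "ACGT"}


def is_unbalanced(calls, threshold=0.20):
    """True if at least one base type is below threshold of all bases."""
    return min(base_fractions(calls).values()) < threshold


# A cycle with equal numbers of each base versus one dominated by A.
balanced_cycle = "ACGT" * 25
skewed_cycle = "A" * 90 + "C" * 4 + "G" * 3 + "T" * 3
```

With the default 20% threshold, `is_unbalanced(balanced_cycle)` is false because each base makes up a quarter of the calls, while `is_unbalanced(skewed_cycle)` is true because C, G, and T each fall below 5%.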
of image moments or centre of gravity · CPC title
Determination of transform parameters for the alignment of images, i.e. image registration · CPC title
Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching · CPC title
Acquisition · CPC title
Preprocessing, e.g. image segmentation · CPC title