What technology area does this patent fall under?

Primary CPC classification C12N15/1065. Mapped technology areas include Chemistry & Metallurgy.

When was this patent published?

Publication date Mar 3, 2020 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Error detection in sequence tag directed subassemblies of short sequencing reads

US10577601B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10577601-B2
Application number	US-201715594476-A
Country	US
Kind code	B2
Filing date	May 12, 2017
Priority date	Sep 12, 2008
Publication date	Mar 3, 2020
Grant date	Mar 3, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The invention provides methods for preparing DNA sequencing libraries by assembling short read sequencing data into longer contiguous sequences for genome assembly, full length cDNA sequencing, metagenomics, and the analysis of repetitive sequences of assembled genomes.

First claim

Opening claim text (preview).

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:
1. A method for detecting an error occurring in the preparation and/or sequencing of a DNA sequencing library, the method comprising: (a) incorporating at least one first nucleic acid adaptor molecule into at least one member of a target library comprising a plurality of nucleic acid molecules, wherein the first adaptor molecule comprises a first tag sequence; (b) amplifying the plurality of nucleic acid molecules to produce an input library comprising a first plurality of amplified DNA molecules, wherein the amplified DNA molecules comprise a sequence identical to or complementary to the first tag sequence and a sequence identical to or complementary to at least a portion of the at least one member of the target library; (c) sequencing at least a portion of the plurality of amplified DNA molecules to produce a plurality of sequencing reads corresponding to the at least one member of the target library and comprising a sequence identical to or complementary to the first tag sequence; (d) grouping the plurality of sequencing reads that correspond to the same at least one member of the target library based solely on the commonality of having the first tag sequence or a complement thereof to produce a plurality of grouped sequencing reads; and (e) detecting whether an error exists at a nucleotide position, wherein an error exists when a variation of nucleotide identity exists among the plurality of grouped sequencing reads at a position corresponding to a nucleotide in the at least one member of the target library.
2. The method of claim 1, wherein the method further comprises determining the correct identity of a nucleotide at the position where the variation of nucleotide identity is detected, wherein the correct identity is determined based on a consensus of individual base calls in the plurality of grouped sequencing reads.
3. The method of claim 2, wherein the consensus of individual base calls is the most common base call at the nucleotide position in the plurality of grouped sequencing reads.
4. The method of claim 1, wherein the method further comprises eliminating from further analysis the identity of the nucleotide at the position in a sequencing read where an error is detected.
5. The method of claim 1, wherein the method further comprises eliminating from further analysis a sequencing read determined to comprise a sequencing error.
6. The method of claim 5, wherein the sequencing read is determined to comprise a sequencing error when it comprises a nucleotide base call that differs from the consensus nucleotide base call provided by the plurality of grouped sequencing reads.
7. The method of claim 1, wherein the first tag sequence comprises a unique nucleotide sequence that distinguishes the at least one member of the target library from other members of the target library.
8. The method of claim 1, further comprising fragmenting at least a portion of the first plurality of amplified DNA molecules in the input library from step (b) to produce a plurality of linear DNA fragments having a first end and a second end.
9. The method of claim 8, further comprising attaching at least one second nucleic acid adaptor molecule to one or both ends of at least one of the plurality of linear DNA fragments, wherein the second adaptor molecule comprises a defined sequence.
10. The method of claim 9, further comprising amplifying the plurality of linear DNA fragments to produce a second plurality of amplified DNA molecules, wherein at least one of the second plurality of amplified DNA molecules comprises a sequence identical to or complementary to the first tag sequence, a sequence identical to or complementary to at least a portion of the second adaptor molecule, and a sequence identical to or complementary to at least a portion of a member of the target library.
11. The method of claim 10, wherein at least a portion of the second plurality of amplified DNA molecules is sequenced in step (c) of the method to produce a plurality of associated sequence reads for each sequenced DNA molecule corresponding to the at least one member of the target library.
12. The method of claim 11, wherein the associated sequence reads comprise a first sequence read and a second sequence read, wherein the first sequence read comprises the first tag sequence of the first adaptor that uniquely identifies a single nucleic acid member of the target library, and wherein the second sequence read comprises a sequence adjacent to the defined sequence of the second adaptor and represents the sequence adjacent to a fragment breakpoint from the fragmented input library.
13. The method of claim 12, wherein a plurality of second sequence reads that are each associated with a first sequence read are grouped in step (d) of the method, wherein the first sequence read contains the first tag sequence identifying a single nucleic acid member of the target library.
14. The method of claim 1, wherein the grouping step (d) comprises generating an alignment of the plurality of sequencing reads.
15. The method of claim 1, wherein the incorporating at least one first nucleic acid adaptor molecule into at least one member of a target library in step (a) results in at least one circular nucleic acid molecule comprising the first nucleic acid adaptor and the member of a target library.
16. A method for correcting an error occurring in the preparation and/or sequencing of a DNA sequencing library, the method comprising: (a) incorporating at least one first nucleic acid adaptor molecule into at least one member of a target library comprising a plurality of nucleic acid molecules, wherein the first adaptor molecule comprises a first tag sequence; (b) amplifying the plurality of nucleic acid molecules to produce an input library comprising a first plurality of amplified DNA molecules, wherein the amplified DNA molecules comprise a sequence identical to or complementary to the first tag sequence and a sequence identical to or complementary to at least a portion of the at least one member of the target library; (c) sequencing at least a portion of the plurality of amplified DNA molecules to produce a plurality of sequencing reads corresponding to the at least one member of the target library and comprising a sequence identical to or complementary to the first tag sequence; (d) grouping the plurality of sequencing reads that correspond to the at least one member of the target library based solely on the commonality of having the first tag sequence or a complement thereof to produce a plurality of grouped sequencing reads; (e) detecting whether an error exists at a nucleotide position, wherein an error exists when a variation of nucleotide identity exists among the plurality of grouped sequencing reads at a position corresponding to a nucleotide in the at least one member of the target library; and (f) determining a correct identity of the nucleotide at the position where the variation of nucleotide identity is detected, wherein the correct identity is determined based on a consensus of individual base calls in the plurality of grouped sequencing reads.
17. The method of claim 16, wherein the consensus of individual base calls is the most common base call at the nucleotide position in the plurality of grouped sequencing reads.
18. A method of detecting an error occurring in the preparation and/or sequencing of a DNA sequencing library, the method comprising: (a) grouping a plurality of nucleic acid
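
The computational part of claims 1 to 6 (grouping reads by tag, flagging positions where grouped reads disagree, and taking the most common base call as the consensus) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the patentee's implementation: the function names, the (tag, sequence) input format, and the toy reads are hypothetical, and the reads in each group are assumed to be already aligned and of equal length.

```python
from collections import Counter, defaultdict


def group_reads_by_tag(reads):
    """Group (tag, sequence) pairs using the tag alone, as in step (d)."""
    groups = defaultdict(list)
    for tag, seq in reads:
        groups[tag].append(seq)
    return dict(groups)


def detect_and_correct(seqs):
    """Return (error_positions, consensus) for one group of reads.

    Assumes the reads are pre-aligned and of equal length (a simplification).
    A position is flagged when the grouped reads disagree (step (e)); the
    consensus is the most common base call (claims 2 and 3). Ties fall to
    the base seen first, which is an arbitrary choice made for this sketch.
    """
    error_positions = []
    consensus = []
    for pos, column in enumerate(zip(*seqs)):
        counts = Counter(column)
        if len(counts) > 1:
            error_positions.append(pos)
        consensus.append(counts.most_common(1)[0][0])
    return error_positions, "".join(consensus)


def reads_matching_consensus(seqs, consensus):
    """Keep only reads identical to the consensus (in the spirit of claims 5 and 6)."""
    return [seq for seq in seqs if seq == consensus]


# Hypothetical toy data: three reads share tag AAGT, two share tag CCTA.
reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACCTAC"),  # disagrees with the other AAGT reads at position 2
    ("CCTA", "TTGACA"),
    ("CCTA", "TTGACA"),
]

groups = group_reads_by_tag(reads)
results = {tag: detect_and_correct(seqs) for tag, seqs in groups.items()}
kept = reads_matching_consensus(groups["AAGT"], results["AAGT"][1])
```

With this toy input, the AAGT group is flagged at position 2 and resolves to the consensus ACGTAC, while the CCTA group shows no variation. Real data would need an alignment step first (claim 14 mentions generating an alignment as part of grouping) and a policy for ties and low-coverage groups, neither of which is modeled here.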

Assignees

Univ Washington

Classifications

C40B50/06
Biochemical methods, e.g. using enzymes or whole viable microorganisms · CPC title
C12N15/1093
General methods of preparing gene libraries, not provided for in other subgroups · CPC title
C12Q1/6869
Methods for sequencing · CPC title
C12Q1/6874
involving nucleic acid arrays, e.g. sequencing by hybridisation · CPC title
C12N15/66
General methods for inserting a gene into a vector to form a recombinant vector using cleavage and ligation; Use of non-functional linkers or adaptors, e.g. linkers containing the sequence for a restriction endonuclease · CPC title

Patent family

Related publications grouped by family.

View patent family 42007749

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10577601B2 cover? The invention provides methods for preparing DNA sequencing libraries by assembling short read sequencing data into longer contiguous sequences for genome assembly, full length cDNA sequencing, metagenomics, and the analysis of repetitive sequences of assembled genomes.
Who is the assignee on this patent? Univ Washington