Methods and systems for obtaining a single molecule consensus sequence

US9600626B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9600626-B2
Application number  US-201615078895-A
Country             US
Kind code           B2
Filing date         Mar 23, 2016
Priority date       Mar 28, 2008
Publication date    Mar 21, 2017
Grant date          Mar 21, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Computer systems, computer readable media, and computer methods for obtaining, calling, and assembling nucleic acid sequences are presented. In some aspects the invention includes the sequencing of template constructs that comprise double stranded portions in partially contiguous constructs, to provide for single molecule consensus sequence determination through one or both of sequencing sense and antisense strands in the same molecule.
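
The template construct described in the abstract can be pictured with a short Python sketch. It builds one strand in which a sense region is joined through a linker to its antisense region, so that a single read of the strand covers the same insert twice. Read 5' to 3', the second region is the reverse complement of the first. The function names, the linker sequence, and the example insert are illustrative assumptions, not taken from the patent.

```python
# Illustrative sketch only: names and sequences are assumptions, not from the patent.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def build_template_strand(sense: str, linker: str) -> str:
    """Join a sense region to its antisense region through a linker segment."""
    return sense + linker + reverse_complement(sense)


sense = "ATGGCGTACC"   # hypothetical insert sequence
linker = "TTTT"        # hypothetical linker (e.g. hairpin loop) sequence
strand = build_template_strand(sense, linker)
print(strand)
```

Because both regions derive from the same molecule, a read of the whole strand carries two observations of every base in the insert, which is what makes a single molecule consensus possible.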

First claim

Opening claim text (preview).

What is claimed is:

1. A method for determining a single molecule consensus sequence of a target nucleic acid, the method comprising: performing a single molecule nucleic acid sequencing of a strand of a single target nucleic acid molecule, thereby obtaining data, wherein the strand of the single target nucleic acid molecule comprises a first region comprising a first nucleic acid sequence connected through a linking nucleic acid segment to a second region, the second region comprising a second nucleic acid sequence, the first nucleic acid sequence being the complement of the second nucleic acid sequence, and wherein the data comprises sequence data from the first region, the linking nucleic acid segment, and the second region; and assembling the single molecule consensus sequence of the target nucleic acid using the data.

2. The method of claim 1 wherein the assembling is carried out using a hidden Markov model.

3. The method of claim 1 wherein the assembling makes use of local or global measures of the individual sequence qualities, sequence context, or characteristic features of the data.

4. The method of claim 1, wherein said performing comprises using an electrochemical system to perform the single molecule nucleic acid sequencing.

5. The method of claim 1, wherein said performing comprises using a nanopore sensor to perform the single molecule nucleic acid sequencing.

6. The method of claim 1, wherein said target nucleic acid comprises genomic DNA.

7. The method of claim 1, wherein the first region comprises at least 500 base pairs.

8. The method of claim 1, wherein said linking nucleic acid segment comprises a registration sequence.

9. The method of claim 8, wherein said registration sequence is used for alignment of the complement of the called first nucleic acid sequence and the called second nucleic acid sequence.

10. The method of claim 1, wherein said linking nucleic acid segment comprises a barcode sequence.

11. The method of claim 1, wherein the linking nucleic acid segment comprises a hairpin loop and wherein the method further comprising, prior to the performing, ligating the hairpin loop to a double-stranded nucleic acid molecule to produce the single target nucleic acid molecule.

12. The method of claim 1, the method further comprising: subjecting a first portion of the data to a pulse recognition process to obtain pulse data; and invoking a base calling process that uses the pulse data to call the first nucleic acid sequence, wherein the first nucleic acid sequence is used in the assembling to assemble the consensus sequence of the target nucleic acid.

13. The method of claim 12 wherein the first portion of data and the second portion of data are obtained from the same single physical location of the target nucleic acid.

14. The method of claim 1, wherein the data from the single target nucleic acid molecule comprises an amplitude pattern that provides sequence information of the target nucleic acid.

15. The method of claim 1, wherein the data from the single target nucleic acid molecule comprises a time-dependent amplitude pattern that provides sequence information of the single target nucleic acid.

16. The method of claim 1, wherein the data from the single target nucleic acid molecule comprises time-dependent amplitude data originating from a single physical location of the single target nucleic acid molecule, and the method further comprises: calling the first nucleic acid sequence from a first portion of the data by analyzing time-dependent amplitude data in the first portion of the data, and calling the second nucleic acid sequence from a second portion of the data by analyzing time-dependent amplitude data in the second portion of the data, wherein the first nucleic acid sequence and the second nucleic acid sequence is used in the assembling to assemble the consensus sequence of the target nucleic acid.

17. The method of claim 1, wherein the data from the single target nucleic acid molecule comprises time-dependent amplitude data originating from a single physical location of the target nucleic acid molecule, and the method further comprises: calling the first nucleic acid sequence from a first portion of the data by analyzing amplitude data in the first portion of the data, and calling the second nucleic acid sequence from a second portion of the data by analyzing amplitude data in the second portion of the data, wherein the first nucleic acid sequence and the second nucleic acid sequence is used in the assembling to assemble the consensus sequence of the target nucleic acid.

18. The method of claim 1, the method further comprising: calling the first nucleic acid sequence of the first region of the target nucleic acid molecule from a first portion of the data; and calling the second nucleic acid sequence of the second region of the target nucleic acid molecule from a second portion of the data; and wherein the assembling comprises assembling the consensus sequence of the target nucleic acid using the data from either (i) an alignment of the called first nucleic acid sequence with the complement of the called second nucleic acid sequence, or (ii) an alignment of the called second nucleic acid sequence with a complement of the called first nucleic acid sequence.

19. The method of claim 18, wherein the first portion of the data and the second portion of the date originate from the same single physical location of the target nucleic acid.

20. A sequencing system for determining a single molecule consensus sequence of a target nucleic acid, wherein the sequencing system comprises: a reagent system that comprises a target nucleic acid molecule, wherein the target nucleic acid molecule comprises a strand having a first region comprising a first nucleic acid sequence connected through a linking nucleic acid segment to a second region, the second region comprising a second nucleic acid sequence, the first nucleic acid sequence being the complement of the second nucleic acid sequence; and an analytical system that comprises nontransitory instructions for performing single molecule nucleic acid sequencing of the strand of the single target nucleic acid molecule thereby obtaining data, wherein the data comprises sequence data from the first region, the linking nucleic acid segment, and the second region, and wherein the data is assembled to obtain a single molecule consensus sequence of the target nucleic acid.

21. The sequencing system of claim 20 wherein the single molecule consensus sequence is determined by the analytical system using a hidden Markov model.

22. The sequencing system of claim 20 wherein the single molecule consensus sequence is determined by the analytical system using local or global measures of the individual sequence qualities, sequence context, or characteristic features of the data.

23. The sequencing system of claim 20, wherein said reagent system comprises an electrochemical system.

24. The sequencing system of claim 20, wherein said reagent system comprises a nanopore sensor.

25. The sequencing system of claim 20, wherein said target nucleic acid comprises genomic DNA.

26. The sequencing system of claim 20, wherein the first region comprises at least 500 base pairs.

27. The sequencing system of claim 20, wherein said linking nucleic acid segment comprises a registration sequence.

Assignees

Pacific Biosciences California Inc

Inventors

Classifications

  • Circular oligonucleotides · CPC title

  • Methods for sequencing · CPC title

  • G06F19/22 · Primary

    Physics · mapped topic

  • Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay (C12Q1/6804 takes precedence) · CPC title

  • Hairpin oligonucleotides · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9600626B2 cover?
Computer systems, computer readable media, and computer methods for obtaining, calling, and assembling nucleic acid sequences are presented. In some aspects the invention includes the sequencing of template constructs that comprise double stranded portions in partially contiguous constructs, to provide for single molecule consensus sequence determination through one or both of sequencing sense …
Who is the assignee on this patent?
Pacific Biosciences California Inc
What technology area does this patent fall under?
Primary CPC classification G06F19/22. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 21, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).