Nucleic acid constructs and methods of use
US-2016024576-A1 · Jan 28, 2016 · US
US9920366B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9920366-B2 |
| Application number | US-201514861989-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 22, 2015 |
| Priority date | Dec 28, 2013 |
| Publication date | Mar 20, 2018 |
| Grant date | Mar 20, 2018 |
Official abstract text for this publication.
Disclosed herein are methods and systems for determining genetic variants (e.g., copy number variation) in a polynucleotide sample. A method for determining copy number variations includes tagging double-stranded polynucleotides with duplex tags, sequencing polynucleotides from the sample, and estimating the total number of polynucleotides mapping to selected genetic loci. The estimate of the total number of polynucleotides can involve estimating the number of double-stranded polynucleotides in the original sample for which no sequence reads are generated. This number can be generated using the number of polynucleotides for which reads for both complementary strands are detected and the number for which reads for only one of the two complementary strands are detected.
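The estimate described in the abstract can be illustrated with a minimal sketch. It assumes a simple model that is not stated in the text shown here: each strand of a duplex is detected independently with the same probability `p`, so that for `N` original molecules the expected counts are `N*p**2` paired, `2*N*p*(1-p)` unpaired, and `N*(1-p)**2` unseen. The function name and return keys are hypothetical.

```python
def estimate_total_molecules(paired: int, unpaired: int) -> dict:
    """Estimate unseen and total duplex molecules at one locus.

    Assumption (not from the source text): both strands of a molecule are
    detected independently with equal probability p. Then
        paired   = N * p^2
        unpaired = 2 * N * p * (1 - p)
        unseen   = N * (1 - p)^2
    Solving the first two equations for p and N gives the formulas below.
    """
    if paired <= 0:
        raise ValueError("need at least one paired molecule to estimate p")
    if unpaired < 0:
        raise ValueError("unpaired count cannot be negative")

    # From unpaired / paired = 2 * (1 - p) / p
    p = 2 * paired / (2 * paired + unpaired)
    # N * (1 - p)^2 simplifies to unpaired^2 / (4 * paired)
    unseen = unpaired ** 2 / (4 * paired)
    total = paired + unpaired + unseen
    return {"strand_detection_prob": p, "unseen": unseen, "total": total}
```

Under this model, 400 paired and 400 unpaired molecules at a locus imply a per-strand detection probability of 2/3, about 100 unseen molecules, and about 900 molecules in total. A real pipeline would likely need to account for locus-specific capture efficiency and read depth, which this sketch ignores.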
Opening claim text (preview).
What is claimed is:

1. A method for detecting double-stranded deoxyribonucleic acid (DNA) molecules in a biological sample from a subject, comprising:
   (a) tagging said double-stranded DNA molecules in said biological sample from said subject with a set of duplex tags, wherein said set of duplex tags comprises a plurality of different molecular barcodes, wherein each duplex tag of said set of duplex tags differently tags complementary strands of a double-stranded DNA molecule of said double-stranded DNA molecules in said biological sample to provide tagged strands, and wherein said tagging is performed with at least a 10X excess of duplex tags as compared to said double-stranded DNA molecules, which excess of duplex tags is sufficient to tag at least 20% of said double-stranded DNA molecules in said biological sample from said subject;
   (b) for each genetic locus in a set of one or more genetic loci in a reference genome, selectively enriching said tagged strands for a subset of said tagged strands that maps to said genetic locus, to provide enriched tagged strands;
   (c) sequencing at least a portion of said enriched tagged strands to generate a plurality of raw sequence reads from said biological sample from said subject;
   (d) grouping said plurality of raw sequence reads into a plurality of families, each family comprising raw sequence reads generated from a same parent polynucleotide, which grouping is based on (i) molecular barcodes associated with said parent polynucleotides and (ii) information from beginning and/or end portions of said raw sequences of said parent polynucleotides;
   (e) collapsing said plurality of raw sequence reads grouped into said plurality of families into a plurality of consensus sequence reads, each consensus sequence read of said plurality of consensus sequence reads (i) comprising a plurality of consensus bases for each genetic locus in said set of one or more genetic loci and (ii) being representative of single strands of said double-stranded DNA molecules;
   (f) for each genetic locus in said set of one or more genetic loci, calculating a first quantitative measure of said enriched tagged strands that map to said genetic locus for which complementary strands are detected in said plurality of consensus sequence reads;
   (g) for each genetic locus in said set of one or more genetic loci, calculating a second quantitative measure of said enriched tagged strands that map to said genetic locus for which only one strand among complementary strands is detected in said plurality of consensus sequence reads; and
   (h) for each genetic locus in said set of one or more genetic loci, calculating a third quantitative measure of said enriched tagged strands that map to said genetic locus for which neither complementary strand is detected in said plurality of consensus sequence reads, wherein said third quantitative measure is calculated based at least in part on said first and second quantitative measures, thereby detecting said double-stranded DNA molecules in said biological sample from said subject.
2. The method of claim 1, wherein said biological sample comprises double-stranded DNA molecules sourced substantially from cell-free nucleic acids.
3. The method of claim 1, further comprising sorting consensus sequence reads into paired reads and unpaired reads, wherein (i) each paired read corresponds to consensus sequence reads generated from a first tagged strand and a second differently tagged complementary strand derived from a double-stranded DNA molecule in said biological sample, and (ii) each unpaired read represents a first tagged strand having no second differently tagged complementary strand derived from a double-stranded DNA molecule represented among said consensus sequence reads in said plurality of consensus sequence reads.
4. The method of claim 3, further comprising calculating quantitative measures of (i) said paired reads and (ii) said unpaired reads that map to each of said set of one or more genetic loci to determine a quantitative measure of total double-stranded DNA molecules in said biological sample that map to each of said set of one or more genetic loci based on said quantitative measures of said paired reads and said unpaired reads that map to each genetic locus of said one or more genetic loci.
5. The method of claim 3, further comprising calculating quantitative measures of at least two of (i) said paired reads, (ii) said unpaired reads that map to each of said set of one or more genetic loci, (iii) read depth of said paired reads, and (iv) read depth of said unpaired reads.
6. The method of claim 5, further comprising calculating with a programmed computer processor a quantitative measure of total double-stranded DNA molecules in said biological sample that map to each of said set of one or more genetic loci based on said quantitative measures of said at least two of (i) said paired reads, (ii) said unpaired reads that map to each locus, (iii) said read depth of said paired reads, and (iv) said read depth of said unpaired reads.
7. The method of claim 5, further comprising calculating quantitative measures of said paired reads and said unpaired reads, and calculating a quantitative measure of total double-stranded DNA molecules that map to each of said set of one or more genetic loci based on said quantitative measures of said paired reads and said unpaired reads.
8. The method of claim 1, wherein said duplex tags are not sequencing adaptors.
9. The method of claim 1, wherein collapsing said plurality of raw sequence reads comprises collapsing raw sequence reads produced from amplified products of an original polynucleotide molecule in said biological sample back to said original polynucleotide molecule.
10. The method of claim 1, further comprising identifying polynucleotide molecules at one or more genetic loci comprising a sequence variant.
11. The method of claim 1, further comprising calculating a quantitative measure of paired reads that map to a genetic locus, wherein both strands of said paired reads comprise a sequence variant.
12. The method of claim 1, further comprising calculating a quantitative measure of paired molecules in which only one member of said paired molecules bears a sequence variant and/or determining a quantitative measure of unpaired molecules bearing a sequence variant.
13. The method of claim 1, further comprising attaching adaptors to ends of each of said double-stranded DNA molecules in said biological sample, wherein said adaptors tag a 5′ end of a strand of an individual double-stranded DNA molecule among said double-stranded DNA molecules with a first tag and a 3′ end of a complementary strand of said individual double-stranded DNA molecule with a second tag, thereby providing said tagged strands.
14. The method of claim 13, further comprising (i) sequencing at least a portion of said tagged strands to produce a set of raw sequence reads, and (ii) mapping said set of raw sequence reads to a genetic locus in a reference genome, wherein said first tag and said second tag are indicative of which strand of said tagged strands each of said set of raw sequence reads is derived.
15. The method of claim 14, further comprising mapping said set of raw sequence reads to at least one additional genetic locus in said reference genome.
16. The method of claim 13, wherein said adaptors are from a set of library adaptors comprising a plurality of polynucleotide molecules with said molecular barcodes, wherein said plurality of polynucleotide molecules are less than or equal to 80 nucleotide bases in length, and wherein said molecular b
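
Steps (d) and (e) of claim 1 and the paired/unpaired sorting of claim 3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the `Read` record, the `"A"`/`"B"` strand labels, the use of exact start/end coordinates as the "beginning and/or end" information, and the per-position majority vote are all hypothetical simplifications.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """One raw sequence read (hypothetical record layout)."""
    barcode: str  # molecular barcode identifying the parent duplex
    strand: str   # "A" or "B": which differently tagged strand it came from
    start: int    # mapped start coordinate
    end: int      # mapped end coordinate
    seq: str      # base calls


def group_into_families(reads):
    """Group reads from the same parent strand: barcode + start/end + strand."""
    families = defaultdict(list)
    for r in reads:
        families[(r.barcode, r.start, r.end, r.strand)].append(r.seq)
    return families


def consensus(seqs):
    """Collapse a family by per-position majority vote.

    Assumes all reads in the family have equal length; zip() would
    otherwise silently truncate to the shortest read.
    """
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def sort_paired_unpaired(reads):
    """Return (paired, unpaired) dicts keyed by (barcode, start, end).

    A molecule is 'paired' when consensus reads exist for both strands
    and 'unpaired' when only one strand is represented.
    """
    by_molecule = defaultdict(dict)
    for (barcode, start, end, strand), seqs in group_into_families(reads).items():
        by_molecule[(barcode, start, end)][strand] = consensus(seqs)
    paired = {k: v for k, v in by_molecule.items() if len(v) == 2}
    unpaired = {k: v for k, v in by_molecule.items() if len(v) == 1}
    return paired, unpaired
```

The counts `len(paired)` and `len(unpaired)` for reads mapping to one locus correspond to the first and second quantitative measures of steps (f) and (g) in this simplified picture.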
Massive parallel sequencing · CPC title
for cancer (immunoassay for cancer G01N33/575) · CPC title
Methods for sequencing · CPC title
ICT specially adapted for analysing two-dimensional [2D] or three-dimensional [3D] molecular structures, e.g. structural or functional relations or structure alignment · CPC title
Expression markers · CPC title