Methods and systems for genomic analysis

US10032000B1 · US · B1

Patent metadata
Field               Value
Publication number  US-10032000-B1
Application number  US-201715639610-A
Country             US
Kind code           B1
Filing date         Jun 30, 2017
Priority date       Aug 30, 2013
Publication date    Jul 24, 2018
Grant date          Jul 24, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method for processing and/or analyzing nucleic acid sequencing data comprises receiving a first data input and a second data input. The first data input comprises untargeted sequencing data generated from a first nucleic acid sample obtained from a subject. The second data input comprises target-specific sequencing data generated from a second nucleic acid sample obtained from the subject. Next, with the aid of a computer processor, the first data input and the second data input are combined to produce a combined data set. Next, an output derived from the combined data set is generated. The output is indicative of the presence or absence of one or more polymorphisms of the first nucleic acid sample and/or the second nucleic acid sample.
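The combining step described in the abstract can be illustrated with a minimal sketch. It assumes both data inputs have already been reduced to variant calls; the abstract does not say at what stage the inputs are combined, and the names `Variant`, `combine_calls`, and `polymorphism_report` are hypothetical, not taken from the patent.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """A hypothetical minimal variant record (not a format from the patent)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def combine_calls(untargeted: Iterable[Variant],
                  targeted: Iterable[Variant]) -> Dict[Variant, Set[str]]:
    """Union the two call sets, recording which input supports each variant."""
    combined: Dict[Variant, Set[str]] = {}
    for source, calls in (("untargeted", untargeted), ("targeted", targeted)):
        for variant in calls:
            combined.setdefault(variant, set()).add(source)
    return combined


def polymorphism_report(combined: Dict[Variant, Set[str]],
                        sites: Iterable[Variant]) -> Dict[Variant, bool]:
    """Report presence (True) or absence (False) of each queried variant."""
    return {site: site in combined for site in sites}
```

A variant seen in both inputs is kept once, with both sources recorded, so the combined set stays free of duplicates while still showing where each call came from.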

First claim

Opening claim text (preview).

What is claimed is:

1. A method for genetic analysis of a subject, comprising:
    a. subjecting a first nucleic acid sample of said subject to untargeted sequencing to generate untargeted sequencing data;
    b. subjecting a second nucleic acid sample of said subject to target-specific sequencing to generate target-specific sequencing data, wherein said target-specific sequencing data comprises less than about 60 megabases, and wherein said target-specific sequencing data is at greater coverage than said untargeted sequencing data; and
    c. using a computer to generate a combined output from said untargeted sequencing data and said target-specific sequencing data, which combined output is indicative of a presence or absence of one or more polymorphisms in at least a portion of a genome of said subject.

2. The method of claim 1, wherein (c) comprises (1) mapping said untargeted sequencing data onto a first reference sequence to generate a first alignment, and (2) mapping said target-specific sequencing data onto a second reference sequence to generate a second alignment.

3. The method of claim 2, wherein said first reference sequence and said second reference sequence are the same sequence.

4. The method of claim 2, further comprising combining said first alignment and said second alignment to yield a combined sequence.

5. The method of claim 4, further comprising mapping said combined sequence onto a reference sequence.

6. The method of claim 1, wherein said combined output comprises an alignment that is mappable onto a reference sequence.

7. The method of claim 1, wherein said target-specific sequencing data is based on targeted sequencing of an exome, specific genes, genomic regions, or a combination thereof.

8. The method of claim 1, wherein said untargeted sequencing data is generated using one or more random primers or hybridization probes, and wherein said target-specific sequencing data is generated using one or more non-random primers or hybridization probes.

9. The method of claim 8, wherein said non-random primers comprise primers targeting one or more genes, exons, untranslated regions, or a combination thereof.

10. The method of claim 1, wherein said untargeted sequencing data is whole genome sequencing data, off-target data arising from a targeted assay, or a combination thereof.

11. The method of claim 10, wherein said whole genome sequencing data comprises single reads, paired-end reads, or mate-pair reads.

12. The method of claim 10, wherein said whole genome sequencing data comprises paired-end reads.

13. The method of claim 1, wherein said first nucleic acid sample and said second nucleic acid sample are derived from a sample of said subject.

14. The method of claim 1, wherein said target-specific sequencing data comprises a specific portion and a non-specific portion, and wherein at least a portion of said untargeted sequencing data is included in said non-specific portion of said target-specific sequencing data.

15. The method of claim 14, wherein said untargeted sequencing data is included in said non-specific portion of said target-specific sequencing data.

16. The method of claim 1, wherein each of said untargeted sequencing data and said target-specific sequencing data comprises variant data, and wherein (c) comprises combining variant data from said untargeted sequencing data and variant data from said target-specific sequencing data into said combined output.

17. The method of claim 16, wherein said untargeted sequencing data comprises copy number or structural variant data, and wherein said target-specific sequencing data comprises single nucleotide variations (SNV) or insertion deletion polymorphism (indel) data.

18. The method of claim 1, wherein said target-specific sequencing data comprises up to about 30 megabases.

19. The method of claim 1, wherein said first nucleic acid sample and said second nucleic acid sample are separately subjected to untargeted sequencing and target-specific sequencing, respectively.

20. The method of claim 1, wherein (c) comprises removing any redundant sequences.

21. The method of claim 1, wherein (c) comprises using a computer processor to identify said one or more polymorphisms at a sensitivity greater than or equal to 90%, wherein said one or more polymorphisms include structural variants that are less than 100,000 bases in length.

22. The method of claim 1, wherein said target-specific sequencing data comprises sequencing data corresponding to less than or equal to about 180,000 exons in an exome of said subject.

23. The method of claim 1, wherein said first nucleic acid sample is derived from a biological sample of said subject.

24. The method of claim 23, wherein said biological sample is derived from a blood sample or a tissue biopsy of said subject.

25. The method of claim 1, wherein said first nucleic acid sample and said second nucleic acid sample are derived from a biological sample of said subject.

26. The method of claim 1, wherein said untargeted sequencing data of (a) and said target-specific sequencing data of (b) comprise sequencing reads, and wherein generating said combined output of (c) comprises combining said sequencing reads.

27. The method of claim 1, wherein said target-specific sequencing data comprises from about 30 megabases to less than about 60 megabases.

28. The method of claim 1, wherein said target-specific sequencing data is at least 30 megabases.

29. The method of claim 1, wherein said untargeted sequencing of (a) is massively parallel sequencing.

30. The method of claim 1, wherein said target-specific sequencing of (b) is massively parallel sequencing.
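The numeric limits in claims 1 and 21 can be made concrete with a small worked sketch. This is an illustration under stated assumptions, not a claim construction: it reads "less than about 60 megabases" as a strict bound on the size of the targeted region, treats coverage as mean depth (sequenced bases divided by region size), and uses the standard definition of sensitivity. All function names are hypothetical.

```python
MEGABASE = 1_000_000


def mean_coverage(total_sequenced_bases: int, region_size_bases: int) -> float:
    """Mean depth: sequenced bases divided by the size of the region covered."""
    if region_size_bases <= 0:
        raise ValueError("region size must be positive")
    return total_sequenced_bases / region_size_bases


def within_claim1_limits(target_region_bases: int,
                         target_coverage: float,
                         untargeted_coverage: float,
                         limit_bases: int = 60 * MEGABASE) -> bool:
    """Assumed reading of claim 1(b): target region under the size limit and
    sequenced at greater coverage than the untargeted data. "About" is
    treated as a strict bound here, which is a simplification."""
    return (target_region_bases < limit_bases
            and target_coverage > untargeted_coverage)


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Standard sensitivity: TP / (TP + FN), as referenced in claim 21."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no true variants to evaluate")
    return true_positives / total
```

For example, 3 gigabases of reads over a 30-megabase target gives a mean coverage of 100x. Paired with a 30x untargeted data set, that would pass the check above.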

Assignees

Personalis Inc

Inventors

Classifications

  • G16B30/00 (Primary)

    ICT specially adapted for sequence analysis involving nucleotides or amino acids · CPC title

  • Evolutionary algorithms, e.g. genetic algorithms or genetic programming · CPC title

  • involving nucleic acids · CPC title

  • G06F19/22 (Primary)

    Physics · mapped topic

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10032000B1 cover?
A computer-implemented method for processing and/or analyzing nucleic acid sequencing data comprises receiving a first data input and a second data input. The first data input comprises untargeted sequencing data generated from a first nucleic acid sample obtained from a subject. The second data input comprises target-specific sequencing data generated from a second nucleic acid sample obtained…
Who is the assignee on this patent?
Personalis Inc
What technology area does this patent fall under?
Primary CPC classification G16B30/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 24, 2018 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).