Systems and methods to identify mutations in mitochondrial genomes

US2022215901A1 · US · A1

Patent metadata
Publication number: US-2022215901-A1
Application number: US-202017613805-A
Country: US
Kind code: A1
Filing date: May 26, 2020
Priority date: May 28, 2019
Publication date: Jul 7, 2022
Grant date:

Abstract

Official abstract text for this publication.

The present disclosure describes a sequencing system configured to identify structural variants in mitochondrial DNA. Variant callers configured to identify variants in linear genomes (e.g., those found in chromosomes) can fail to properly identify structural variants in mitochondrial DNA. The system and methods can identify structural variants in next generation sequencing data collected from circular, mitochondrial DNA.
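One way to see why a linear-genome assumption can mislead on circular DNA is to compare distances under both coordinate systems. The sketch below is illustrative only and is not taken from the patent: the function names are hypothetical, and the reference length of 16,569 bp (the commonly used human mitochondrial reference) is an assumption for the example. A read that spans the arbitrary origin of the circle appears to join two loci almost a full genome apart when coordinates are treated as linear, although the loci are adjacent on the circle.

```python
MT_LENGTH = 16569  # assumed reference length (human mitochondrial reference), 1-based coordinates


def linear_gap(pos_a: int, pos_b: int) -> int:
    """Distance between two positions when the reference is treated as linear."""
    return abs(pos_b - pos_a)


def circular_gap(pos_a: int, pos_b: int, length: int = MT_LENGTH) -> int:
    """Shortest distance between two positions on a circular reference."""
    d = abs(pos_b - pos_a) % length
    return min(d, length - d)


# A read whose first half ends at the last base and whose second half
# starts at the first base is contiguous on the circle.
print(linear_gap(16569, 1))    # 16568: looks like a genome-sized rearrangement
print(circular_gap(16569, 1))  # 1: adjacent bases on the circular molecule
```

Under this assumption, a caller that only uses the linear distance could report a spurious structural variant at the origin, which is the kind of failure the abstract alludes to.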

First claim

Opening claim text (preview).

What is claimed:

1. A method to identify variants in mitochondrial sequencing data, comprising: receiving a plurality of sequence reads comprising an indication of sequenced DNA samples; identifying a subset of the plurality of sequence reads, wherein each of the subsets of the plurality of sequence reads partially mapped to a target mitochondrial DNA sample; generating a plurality of query sequences based on each of the subsets of the plurality of sequence reads that are partially mapped to the target mitochondrial DNA sample; calculating a score for each of the subsets of the plurality of sequence reads based on an alignment of the plurality of query sequences with the target mitochondrial DNA sample; selecting a plurality of test reads, wherein the plurality of test reads comprises the subset of the plurality of sequence reads having a score below a predetermined threshold; identifying a breakpoint for each of the plurality of test reads; determining a count of the plurality of test reads having the breakpoint at a predetermined location; and identifying a sequence variant based on the count of the plurality of test reads having the breakpoint at the predetermined location.

2. The method of claim 1, wherein the DNA samples are pair-end sequenced.

3. The method of claim 1, wherein the sequence variant is one of a deletion, insertion, duplication, or inversion.

4. The method of claim 1, further comprising generate a plurality of sequence words from the plurality of query sequences.

5. The method of claim 4, further comprising aligning each of the plurality of sequence words from the plurality of query sequences to the target mitochondrial DNA sample.

6. The method of claim 1, wherein the score is an e-value indicating a probability of the alignment of each plurality of query sequences occurring by chance.

7. The method of claim 1, wherein identifying the breakpoint for one of the plurality of test reads comprises: determining a distance between a first sequence in the one of the plurality of test reads and a second sequence in one of the plurality of test reads is one nucleotide; and determining that a length of a deletion between a first location of the first sequence in the target mitochondrial DNA and a second location of the second sequence in the target mitochondrial DNA is greater than one nucleotide.

8. The method of claim 1, wherein identifying the breakpoints for one of the plurality of test reads comprises: determining a distance between a first location of a first sequence in the target mitochondrial DNA and a second location of a second sequence in the target mitochondrial DNA is one nucleotide; and determining a length of an insertion between the first sequence in the one of the plurality of test reads and the second sequence in the one of the plurality of test reads.

9. The method of claim 1, wherein identifying the breakpoints for one of the plurality of test reads comprises: determining a distance between a first sequence in the one of the plurality of test reads and a second sequence in the one of the plurality of test reads is one nucleotide; and determining that a length of a duplication between an end location of the first sequence in the target mitochondrial DNA and a start location of the second sequence in the target mitochondrial DNA is greater than one nucleotide.

10. The method of claim 1, wherein identifying the breakpoints for one of the plurality of test reads comprises: determining that a location of a sequence in the one of the plurality of test reads overlaps with a location of an inverted sequence in the one of the plurality of test reads; and determining that a location of the sequence in the target mitochondrial DNA does not overlap with a location of the inverted sequence in the target mitochondrial DNA.

11. The method of claim 1, further comprising validating the sequence variant against a database comprising known mitochondrial DNA variants.

12. A system to identify variants in mitochondrial sequencing data, comprising one or more processors to: receive a plurality of sequence reads comprising an indication of sequenced DNA samples; identify a subset of the plurality of sequence reads, wherein each of the subsets of the plurality of sequence reads partially mapped to a target mitochondrial DNA sample; generate a plurality of query sequences based on each of the subsets of the plurality of sequence reads that are partially mapped to the target mitochondrial DNA sample; calculate a score for each of the subsets of the plurality of sequence reads based on an alignment of the plurality of query sequences with the target mitochondrial DNA sample; select a plurality of test reads, wherein the plurality of test reads comprises the subset of the plurality of sequence reads having a score below a predetermined threshold; identify a breakpoint for each of the plurality of test reads; determine a count of the plurality of test reads having the breakpoint at a predetermined location; and identify a sequence variant based on the count of the plurality of test reads having the breakpoint at the predetermined location.

13. The system of claim 12, wherein the DNA samples are pair-end sequenced.

14. The system of claim 12, wherein the sequence variant is one of a deletion, insertion, duplication, or inversion.

15. The system of claim 12, further comprising the one or more processors to generate a plurality of sequence words from the plurality of query sequences.

16. The system of claim 15, further comprising the one or more processors to align each of the plurality of sequence words from the plurality of query sequences to the target mitochondrial DNA sample.

17. The system of claim 12, wherein the score is an e-value indicating a probability of the alignment of each plurality of query sequences occurring by chance.

18. The system of claim 15, further comprising the one or more processors to: determine a distance between a first sequence in the one of the plurality of test reads and a second sequence in one of the plurality of test reads is one nucleotide; determine that a length of a deletion between a first location of the first sequence in the target mitochondrial DNA and a second location of the second sequence in the target mitochondrial DNA is greater than one nucleotide; and calculate the breakpoint for the one of the plurality of test reads based on the distance and the length of the deletion.

19. The system of claim 15, further comprising the one or more processors to: determine a distance between a first location of a first sequence in the target mitochondrial DNA and a second location of a second sequence in the target mitochondrial DNA is one nucleotide; determine a length of an insertion between the first sequence in the one of the plurality of test reads and the second sequence in the one of the plurality of test reads; and calculate the breakpoint for the one of the plurality of test reads based on the distance and the length of the insertion.

20. The system of claim 15, further comprising the one or more processors to: determine a distance between a first sequence in the one of the plurality of test reads and a second sequence in the one of the plurality of test reads is one nucleotide; and determine that a length of a duplication between an end location of the first sequence in the target mitochondrial DNA and a start location of the second sequence in the tar
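The selection, breakpoint, and counting steps recited in claim 1, together with the deletion, insertion, and duplication tests of claims 7 to 9, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the `SplitAlignment` record, the `classify` and `call_variants` helpers, the e-value cutoff, the minimum read count, and all coordinates are hypothetical. The reading of a duplication as a second segment that starts at or before the end of the first on the reference is likewise an interpretation. The inversion test of claim 10 and circular coordinate handling are omitted.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

Breakpoint = Tuple[str, int, int]  # (variant type, left reference pos, right reference pos)


@dataclass(frozen=True)
class SplitAlignment:
    """Two aligned segments of one partially mapped read (hypothetical record)."""
    read_id: str
    e_value: float  # alignment score; lower means less likely to occur by chance
    q1_end: int     # last read position of the first segment
    q2_start: int   # first read position of the second segment
    r1_end: int     # reference position aligned to q1_end
    r2_start: int   # reference position aligned to q2_start


def classify(aln: SplitAlignment) -> Optional[Breakpoint]:
    """Derive a breakpoint from how the two segments sit in the read and the reference."""
    read_step = aln.q2_start - aln.q1_end
    ref_step = aln.r2_start - aln.r1_end
    if read_step == 1 and ref_step > 1:
        # Adjacent in the read, separated on the reference (cf. claim 7).
        return ("deletion", aln.r1_end, aln.r2_start)
    if read_step == 1 and ref_step < 1:
        # Adjacent in the read, second segment starts earlier on the reference (cf. claim 9).
        return ("duplication", aln.r2_start, aln.r1_end)
    if ref_step == 1 and read_step > 1:
        # Adjacent on the reference, extra bases in the read (cf. claim 8).
        return ("insertion", aln.r1_end, aln.r2_start)
    return None  # contiguous or unsupported pattern


def call_variants(alignments: Iterable[SplitAlignment],
                  max_e_value: float = 1e-5,
                  min_reads: int = 2) -> Dict[Breakpoint, int]:
    """Keep reads scoring below the threshold, count breakpoints, report supported ones."""
    counts: Counter = Counter()
    for aln in alignments:
        if aln.e_value >= max_e_value:
            continue  # not selected as a test read
        bp = classify(aln)
        if bp is not None:
            counts[bp] += 1
    return {bp: n for bp, n in counts.items() if n >= min_reads}


reads = [
    SplitAlignment("r1", 1e-20, 60, 61, 5000, 9000),
    SplitAlignment("r2", 1e-18, 75, 76, 5000, 9000),
    SplitAlignment("r3", 1e-2, 50, 51, 100, 900),     # fails the score filter
    SplitAlignment("r4", 1e-15, 40, 41, 3000, 3001),  # contiguous, no breakpoint
]
print(call_variants(reads))  # {('deletion', 5000, 9000): 2}
```

Keying the counter on the variant type plus both reference positions means only reads that agree on the exact breakpoint support each other. A real pipeline would likely need some tolerance around the breakpoint position, which this sketch does not attempt.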

Assignees

Quest Diagnostics Invest Llc

Inventors

Classifications

  • Sequence alignment; Homology search · CPC title

  • Methods for sequencing · CPC title

  • G16B20/20 · Primary

    Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022215901A1 cover?
The present disclosure describes a sequencing system configured to identify structural variants in mitochondrial DNA. Variant callers configured to identify variants in linear genomes (e.g., those found in chromosomes) can fail to properly identify structural variants in mitochondrial DNA. The system and methods can identify structural variants in next generation sequencing data collected from …
Who is the assignee on this patent?
Quest Diagnostics Invest Llc
What technology area does this patent fall under?
Primary CPC classification G16B20/20. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 7, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).