Allelotyping methods for massively parallel sequencing

US11468970B2 · US · B2

Patent metadata
Publication number: US-11468970-B2
Application number: US-201313952761-A
Country: US
Kind code: B2
Filing date: Jul 29, 2013
Priority date: Jul 29, 2013
Publication date: Oct 11, 2022
Grant date: Oct 11, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one illustrative embodiment, an allelotyping method may include selecting a plurality of text strings that each represent a nucleotide sequence that was read by a massively parallel sequencing (MPS) instrument, where the nucleotide sequences represented by the selected plurality of text strings each correspond to a particular locus, comparing the selected plurality of text strings to one another to determine an abundance count for each unique text string included in the selected plurality of text strings, and determining one or more alleles for the particular locus by comparing the abundance count for each unique text string included in the selected plurality of text strings to an abundance threshold.
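The counting and thresholding steps in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `call_alleles`, the toy reads, and the choice to treat the abundance threshold as an absolute read count are all assumptions made for illustration (the abstract does not say whether the threshold is absolute or relative).

```python
from collections import Counter


def call_alleles(reads, abundance_threshold):
    """Count identical read strings and keep those meeting the threshold.

    `reads` is assumed to already be restricted to a single locus.
    `abundance_threshold` is treated here as an absolute read count,
    which is an assumption of this sketch, not a detail from the patent.
    """
    counts = Counter(reads)  # abundance count per unique text string
    # most_common() orders the unique strings from most to least abundant
    return [(seq, n) for seq, n in counts.most_common() if n >= abundance_threshold]


# Hypothetical reads for one locus: two well-supported strings plus
# two low-abundance strings standing in for erroneous sequences.
reads = (
    ["GATAGATAGATA"] * 40
    + ["GATAGATA"] * 35
    + ["GATAGATAGAT"] * 3
    + ["GATAGACA"] * 2
)

alleles = call_alleles(reads, abundance_threshold=10)
print(alleles)  # [('GATAGATAGATA', 40), ('GATAGATA', 35)]
```

With these toy inputs, the two low-abundance strings fall below the threshold and only the two well-supported strings are reported.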

First claim

Opening claim text (preview).

The invention claimed is:

1. An allelotyping method comprising: amplifying one or more nucleotide sequences corresponding to a particular locus using a PCR amplification process to produce an amplified sample, wherein the amplified sample produced comprises the one or more nucleotide sequences that correspond to the particular locus and one or more erroneous nucleotide sequences that do not correspond to the particular locus, wherein the one or more erroneous nucleotide sequences are introduced during the PCR amplification process; using a massively parallel sequencing (MPS) instrument to read the one or more nucleotide sequences and the one or more erroneous nucleotide sequences of the amplified sample and to generate a plurality of text strings quantifying the reads of the one of the nucleotide sequences and the one or more erroneous nucleotide sequences of the amplified sample; comparing the plurality of text strings to one another using a text string matching routine to determine an abundance count for each text string included in the plurality of text strings; distinguishing each text string quantifying one of the nucleotide sequences from each text string quantifying one of the erroneous nucleotide sequences based upon their relative abundance counts; and determining one true allele for the particular locus by identifying the text string having a highest abundance count from among the plurality of text strings.

2. The allelotyping method of claim 1, wherein determining the one true allele for the particular locus comprises identifying a text string from the plurality of text strings that quantifies a nucleotide sequence containing a short tandem repeat (STR).

3. The allelotyping method of claim 1, wherein determining the one true allele for the particular locus comprises identifying a text string from the plurality of text strings that quantifies a nucleotide sequence containing a single nucleotide polymorphism (SNP).

4. The allelotyping method of claim 1, further comprising determining another true allele for the particular locus by identifying another text string from among the plurality of text strings for which a ratio of the abundance count for that text string compared to the highest abundance count exceeds an abundance threshold.

5. The allelotyping method of claim 4, wherein the abundance threshold is a percentage value in the range of 15% to 60%.

6. The allelotyping method of claim 4, wherein the abundance threshold is a percentage value configurable by a user.

7. The allelotyping method of claim 1, further comprising: receiving a first text-based computer file comprising the plurality of text strings generated by the MPS instrument; and generating a second text-based computer file comprising the text string identified as representing one true allele for the particular locus, wherein a file size of the second text-based computer file is smaller than a file size of the first text-based computer file.

8. The allelotyping method of claim 7, wherein the second text-based computer file further comprises another text string identified as representing another true allele for the particular locus, a ratio of the abundance count for the another text string as compared to the highest abundance count exceeding an abundance threshold in the range of 15% to 60%.

9. The allelotyping method of claim 7, wherein the file size of the second text-based computer file is at least ten-thousand times smaller than the file size of the first text-based computer file.

10. An allelotyping method comprising: amplifying one or more nucleotide sequences corresponding to each of a plurality of loci using a PCR amplification process to produce an amplified sample comprising the one or more nucleotide sequences that correspond to each of the plurality of loci, wherein the amplified sample also comprises erroneous nucleotide sequences introduced during the PCR amplification process; using a massively parallel sequencing (MPS) instrument to read the one or more nucleotide sequences and the erroneous nucleotide sequences of the amplified sample and to generate a plurality of text strings corresponding to each of the plurality of loci, wherein each plurality of text strings quantifies the reads of the one or more nucleotide sequences that correspond to one of the plurality of loci and of associated erroneous nucleotide sequences of the amplified sample; and for each of the plurality of loci, (i) selecting the plurality of text strings corresponding to a selected locus, (ii) comparing the selected plurality of text strings to one another using a text string matching routine to determine an abundance count for each text string included in the selected plurality of text strings, (iii) distinguishing each text string quantifying one of the nucleotide sequences from each text string quantifying one of the erroneous nucleotide sequences based upon their relative abundance counts, and (iv) determining one true allele for the selected locus by identifying the text string having a highest abundance count from among the plurality of text strings corresponding to the selected locus.

11. The allelotyping method of claim 10, wherein selecting the plurality of text strings corresponding to a selected locus comprises determining whether each text string generated by the MPS instrument when reading the amplified sample includes text characters that represent a flanking nucleotide sequence associated with the selected locus.

12. The allelotyping method of claim 11, further comprising removing the text characters that represent the flanking nucleotide sequence from each of the selected plurality of text strings prior to comparing the selected plurality of text strings to one another to determine the abundance count for each text string included in the selected plurality of text strings.

13. The allelotyping method of claim 1, further comprising removing all text characters that do not represent a short tandem repeat (STR) from each of the plurality of text strings prior to comparing the plurality of text strings to one another to determine the abundance count for each text string included in the plurality of text strings.

14. The allelotyping method of claim 1, wherein comparing the plurality of text strings to one another using the text string matching routine comprises determining whether each text character in one text string from the plurality of text strings exactly matches the similarly positioned text character in another text string from the plurality of text strings.

15. The allelotyping method of claim 1, wherein comparing the plurality of text strings to one another using the text string matching routine comprises grouping all occurrences of identical text strings together in a list and counting a number of occurrences of each text string to determine the abundance count for that text string.
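The data-processing steps recited in claims 1, 4, 11, and 12 can be sketched in Python as follows. This is an illustrative reading of the claim language, not the patentee's implementation: the function names, the flanking sequences `ACGT` and `TTCC`, the toy reads, and the 0.3 ratio threshold (chosen arbitrarily from within the 15% to 60% range of claim 5) are all hypothetical. The sketch also assumes each locus has one left and one right flank, and it reports every string whose ratio exceeds the threshold, whereas claim 4 speaks only of "another" true allele.

```python
from collections import Counter


def select_and_trim(reads, left_flank, right_flank):
    """Keep reads that contain both flanks and strip the flanks.

    Mirrors the idea of claims 11 and 12: flank text identifies the
    locus and is removed before the strings are compared.
    """
    trimmed = []
    for read in reads:
        start = read.find(left_flank)
        if start == -1:
            continue  # read does not belong to this locus
        start += len(left_flank)
        end = read.find(right_flank, start)
        if end == -1:
            continue
        trimmed.append(read[start:end])
    return trimmed


def call_locus(reads, left_flank, right_flank, ratio_threshold=0.3):
    """Return the most abundant string, plus any string whose count
    relative to the highest count exceeds `ratio_threshold`."""
    counts = Counter(select_and_trim(reads, left_flank, right_flank))
    if not counts:
        return []
    ranked = counts.most_common()  # most abundant first
    top_seq, top_count = ranked[0]
    alleles = [top_seq]
    for seq, n in ranked[1:]:
        if n / top_count > ratio_threshold:
            alleles.append(seq)
    return alleles


# Hypothetical mixed reads: two alleles, one low-abundance error,
# and reads from some other locus that lack the flanks.
reads = (
    ["ACGT" + "GATA" * 3 + "TTCC"] * 50
    + ["ACGT" + "GATA" * 2 + "TTCC"] * 30
    + ["ACGT" + "GATAGATAGAT" + "TTCC"] * 4
    + ["CCCCGGGG"] * 5
)

print(call_locus(reads, "ACGT", "TTCC"))  # ['GATAGATAGATA', 'GATAGATA']
```

In this toy data the second string reaches 30/50 = 60% of the top count and is kept, while the 4-read string (8%) is treated as erroneous and dropped.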

Assignees

Battelle Memorial Institute

Inventors

Classifications

  • G16B30/00 (Primary)

    ICT specially adapted for sequence analysis involving nucleotides or amino acids · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11468970B2 cover?
In one illustrative embodiment, an allelotyping method may include selecting a plurality of text strings that each represent a nucleotide sequence that was read by a massively parallel sequencing (MPS) instrument, where the nucleotide sequences represented by the selected plurality of text strings each correspond to a particular locus, comparing the selected plurality of text strings to one ano…
Who is the assignee on this patent?
Battelle Memorial Institute
What technology area does this patent fall under?
Primary CPC classification G16B30/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 11, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).