Estimation of circulating tumor fraction using off-target reads of targeted-panel sequencing

US11211147B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11211147-B2
Application number: US-202117179267-A
Country: US
Kind code: B2
Filing date: Feb 18, 2021
Priority date: Feb 18, 2020
Publication date: Dec 28, 2021
Grant date: Dec 28, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and software are provided for estimating a circulating tumor fraction for a test subject. Sequence reads are obtained from a panel-enriched sequencing reaction, including sequences for a first plurality of cfDNA fragments corresponding to probe sequences and a second plurality of cfDNA fragments not corresponding to probe sequences. Bin-level coverage ratios are determined from the sequences. Segments are formed by grouping adjacent bins based on similar coverage ratios and segment-level coverage ratios are determined based on bin-level coverage ratios for bins in the segment. For each simulated circulating tumor fraction in a plurality of circulating tumor fractions, segments are fitted to an integer copy state by identifying the integer copy state that best matches the segment-level coverage ratio. The circulating tumor fraction for the test subject is determined using error optimization between segment-level coverage ratios and integer copy states across the simulated circulated tumor fractions.
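
The steps in the abstract can be illustrated with a small sketch. This is not the patented implementation. The mixture formula in `expected_ratio` (a diploid baseline diluted by the tumor fraction), the greedy tolerance-based segmentation, the bin-count weighting of the error, and every function name are assumptions made for illustration. The toy input also has far fewer bins and reads than the claims require.

```python
import numpy as np

def bin_coverage_ratios(sample_counts, reference_counts):
    """Per-bin ratio of sample reads to reference reads, each scaled by its total depth (assumed normalization)."""
    s = np.asarray(sample_counts, dtype=float)
    r = np.asarray(reference_counts, dtype=float)
    return (s / s.sum()) / (r / r.sum())

def segment_bins(bin_ratios, tolerance=0.05):
    """Greedy stand-in for a real segmentation algorithm: start a new segment
    when a bin differs from the running segment mean by more than `tolerance`.
    Returns half-open (start, end) index ranges."""
    segments, start = [], 0
    for i in range(1, len(bin_ratios)):
        if abs(bin_ratios[i] - np.mean(bin_ratios[start:i])) > tolerance:
            segments.append((start, i))
            start = i
    segments.append((start, len(bin_ratios)))
    return segments

def expected_ratio(copy_states, tumor_fraction, normal_copies=2):
    """Assumed mixture model: tumor cells carry `copy_states` copies, all other cells carry 2."""
    return (tumor_fraction * copy_states + (1 - tumor_fraction) * normal_copies) / normal_copies

def estimate_tumor_fraction(segment_ratios, weights, tumor_fractions, copy_states=(1, 2, 3, 4)):
    """For each simulated tumor fraction, fit every segment to its closest integer
    copy state, then return the fraction with the smallest weighted absolute error."""
    ratios = np.asarray(segment_ratios, dtype=float)
    weights = np.asarray(weights, dtype=float)
    states = np.asarray(copy_states, dtype=float)
    errors, fits = [], []
    for tf in tumor_fractions:
        # Distance from each segment ratio to each copy state's expected ratio.
        dist = np.abs(ratios[:, None] - expected_ratio(states, tf)[None, :])
        fits.append(states[dist.argmin(axis=1)].astype(int))
        errors.append(float(np.sum(weights * dist.min(axis=1))))
    best = int(np.argmin(errors))
    return float(tumor_fractions[best]), fits[best], np.array(errors)

# Toy input: nine bins whose read counts imply ratios of 1.0, 1.1, and 0.9.
sample = [100, 100, 100, 110, 110, 110, 90, 90, 90]
reference = [100] * 9
ratios = bin_coverage_ratios(sample, reference)
segments = segment_bins(ratios)
seg_ratios = [float(np.median(ratios[a:b])) for a, b in segments]
seg_weights = [b - a for a, b in segments]
grid = np.linspace(0.05, 0.5, 10)  # 10 simulated tumor fractions
tf, states, errors = estimate_tumor_fraction(seg_ratios, seg_weights, grid)
print(tf, states)
```

On this toy input the three segments fit copy states 2, 3, and 1 with near-zero error at a simulated fraction of 0.2, and every other grid point leaves a residual. Real data can produce several comparable minima, because a low tumor fraction with high copy states can mimic a higher tumor fraction with lower copy states.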

First claim

Opening claim text (preview).

What is claimed is:

1. A method of estimating a circulating tumor fraction for a test subject from panel-enriched sequencing data for a plurality of sequences, the method comprising: at a computer system having one or more processors, and memory storing one or more programs for execution by the one or more processors:

    A) obtaining, from a first panel-enriched sequencing reaction, a first plurality of at least 100,000 sequence reads, wherein the first plurality of at least 100,000 sequence reads comprises: (i) a corresponding sequence for each cell-free DNA fragment in a first plurality of cell-free DNA fragments obtained from a liquid biopsy sample from the test subject, wherein each respective cell-free DNA fragment in the first plurality of cell-free DNA fragments corresponds to a respective probe sequence in a plurality of probe sequences used to enrich cell-free DNA fragments in the liquid biopsy sample in the first panel-enriched sequencing reaction; and (ii) a corresponding sequence for each cell-free DNA fragment in a second plurality of cell-free DNA fragments obtained from the liquid biopsy sample, wherein each respective cell-free DNA fragment in the second plurality of cell-free DNA fragments does not correspond to any probe sequence in the plurality of probe sequences;

    B) determining a plurality of at least 1000 bin-level coverage ratios using the plurality of at least 100,000 sequence reads, each respective bin-level coverage ratio in the plurality of bin-level coverage ratios corresponding to a respective bin in a plurality of at least 1000 bins, wherein: each respective bin in the plurality of bins represents a corresponding region of the genome for the species of the test subject, the plurality of bins collectively covers at least 50 Mb of the genome for the species of the test subject, and each respective bin-level coverage ratio in the plurality of bin-level coverage ratios is determined from a comparison of (i) a number of sequence reads in the plurality of sequence reads that map to the corresponding bin and (ii) a number of sequence reads from one or more reference samples that map to the corresponding bin;

    C) determining a plurality of segment-level coverage ratios by: forming, using the at least 1000 bin-level coverage ratios, a plurality of segments by grouping respective subsets of adjacent bins in the plurality of bins based on a similarity between the respective coverage ratios of the subset of adjacent bins, and determining, for each respective segment in the plurality of segments, a segment-level coverage ratio based on the corresponding bin-level coverage ratios for each bin in the respective segment;

    D) fitting, for each respective simulated circulating tumor fraction in a plurality of at least 10 simulated circulating tumor fractions, each respective segment in the plurality of segments to a respective integer copy state in a plurality of at least 4 integer copy states, by identifying the respective integer copy state in the plurality of integer copy states that best matches the segment-level coverage ratio, thereby generating, for each respective simulated circulating tumor fraction in the plurality of simulated tumor fractions, a respective set of integer copy states for the plurality of segments; and

    E) estimating the circulating tumor fraction for the test subject based on a measure of fit between corresponding segment-level coverage ratios and integer copy states across the plurality of simulated circulated tumor fractions.

2. The method of claim 1, wherein estimating the circulating tumor fraction comprises minimization of an error between corresponding segment-level coverage ratios and integer copy states across the plurality of simulated circulated tumor fractions.

3. The method of claim 1, wherein estimating the circulating tumor fraction comprises: identifying a plurality of local minima for the error between corresponding segment-level coverage ratios and integer copy states across the plurality of simulated circulated tumor fractions, and selecting the local minima that is closest to a second estimate of circulating tumor fraction determined by a different methodology.

4. The method of claim 3, wherein the second estimate of circulating tumor fraction is generated by: (i) detecting a plurality of germline variants in the liquid biopsy sample based on the first plurality of sequence reads; (ii) determining, for each respective germline variant in the plurality of germline variants, a corresponding germline variant allele frequency for the liquid biopsy sample, thereby determining a plurality of germline variant allele frequencies for the liquid biopsy sample; (iii) determining, for each respective germline variant in the plurality of germline variants, an absolute value of the difference between the corresponding germline variant allele frequency for the liquid biopsy sample and a germline variant allele frequency for the respective germline variant allele in a non-cancerous tissue of the subject, thereby determining a plurality of germline variant allele deltas for the liquid biopsy sample; and (iv) estimating the circulating tumor fraction for the liquid biopsy sample as twice the value of the maximum germline variant allele delta in the plurality of germline variant allele deltas.

5. The method of claim 4, wherein, for each respective germline variant in the plurality of germline variants, the corresponding germline variant allele frequency for the respective germline variant allele in a non-cancerous tissue of the subject is defined as 0.5.

6. The method of claim 4, wherein, for each respective germline variant in the plurality of germline variants, the corresponding germline variant allele frequency for the respective germline variant allele in a non-cancerous tissue of the subject is determined based on a second sequencing reaction of nucleic acids from a non-cancerous sample of the subject.

7. The method of claim 3, wherein the second estimate of circulating tumor fraction is generated by: (i) detecting a plurality of somatic variants in the liquid biopsy sample based on the first plurality of sequence reads; (ii) determining, for each respective somatic variant in the plurality of somatic variants, a corresponding somatic variant allele frequency for the liquid biopsy sample, thereby determining a plurality of somatic variant allele frequencies for the liquid biopsy sample; and (iii) estimating the circulating tumor fraction for the liquid biopsy sample as twice the value of the largest somatic variant allele frequency in the plurality of somatic variant allele frequencies.

8. The method of claim 3, wherein the second estimate of circulating tumor fraction is generated by: (i) detecting a plurality of somatic variants in the liquid biopsy sample based on the first plurality of sequence reads; (ii) determining, for each respective somatic variant in the plurality of somatic variants, a corresponding somatic variant allele frequency for the liquid biopsy sample, thereby determining a plurality of somatic variant allele frequencies for the liquid biopsy sample; and (iii) estimating the circulating tumor fraction for the liquid biopsy sample as the value of the largest somatic variant allele frequency in the plurality of somatic variant allele frequencies.

9. The method of claim 1, wherein the plurality of probe sequences used to enrich cell-free DNA fragments in the liquid biopsy sample in the first panel-enriched sequencing reaction collectively map to at least 25 different genes in human reference genome.

10. The method of claim 1, wherein plurality of integer copy states comprises a 1-copy state, a 2-copy state, a 3-copy state, and a
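
Claims 3 through 8 describe choosing among several local minima of the error curve by comparing them with a second, variant-based estimate. The sketch below shows that arithmetic under stated assumptions. The function names, the strict-inequality definition of a local minimum, the fallback to the global minimum, and the example numbers are all hypothetical and are not taken from the patent.

```python
def tf_from_germline(plasma_vafs, normal_vafs=None):
    """Claims 4 to 6: twice the largest absolute germline allele-frequency delta.
    With no matched normal sample, the normal frequency is taken as 0.5 (claim 5)."""
    if normal_vafs is None:
        normal_vafs = [0.5] * len(plasma_vafs)
    deltas = [abs(p - n) for p, n in zip(plasma_vafs, normal_vafs)]
    return 2 * max(deltas)

def tf_from_somatic(somatic_vafs, double=True):
    """Claim 7 (double=True): twice the largest somatic allele frequency.
    Claim 8 (double=False): the largest somatic allele frequency itself."""
    largest = max(somatic_vafs)
    return 2 * largest if double else largest

def local_minima(errors):
    """Indices whose error is strictly below both neighbors (assumed definition);
    a missing neighbor at either end counts as infinitely large."""
    found = []
    for i, e in enumerate(errors):
        left = errors[i - 1] if i > 0 else float("inf")
        right = errors[i + 1] if i < len(errors) - 1 else float("inf")
        if e < left and e < right:
            found.append(i)
    return found

def select_local_minimum(tumor_fractions, errors, second_estimate):
    """Claim 3: pick the local minimum whose tumor fraction is closest to the second estimate."""
    candidates = local_minima(errors)
    if not candidates:
        # Flat error curve: fall back to the global minimum (an assumption).
        candidates = [min(range(len(errors)), key=lambda i: errors[i])]
    best = min(candidates, key=lambda i: abs(tumor_fractions[i] - second_estimate))
    return tumor_fractions[best]

# Hypothetical error curve with two local minima, at 0.10 and 0.25.
fractions = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
errors = [0.4, 0.1, 0.3, 0.5, 0.05, 0.2]
second = tf_from_germline([0.50, 0.44, 0.53])  # largest delta is about 0.06
print(second, select_local_minimum(fractions, errors, second))
```

In this example the global minimum of the error sits at 0.25, but the germline-based estimate of about 0.12 is closer to the local minimum at 0.10, so that value is selected.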

Assignees

Tempus Labs Inc

Classifications

  • G16B30/20 · Primary

    Sequence assembly · CPC title

  • Ploidy or copy number detection · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11211147B2 cover?
Methods, systems, and software are provided for estimating a circulating tumor fraction for a test subject. Sequence reads are obtained from a panel-enriched sequencing reaction, including sequences for a first plurality of cfDNA fragments corresponding to probe sequences and a second plurality of cfDNA fragments not corresponding to probe sequences. Bin-level coverage ratios are determined fro…
Who is the assignee on this patent?
Tempus Labs Inc
What technology area does this patent fall under?
Primary CPC classification G16B30/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 28, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).