Determination of base modifications of nucleic acids

US11091794B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11091794-B2
Application number	US-202016995607-A
Country	US
Kind code	B2
Filing date	Aug 17, 2020
Priority date	Aug 16, 2019
Publication date	Aug 17, 2021
Grant date	Aug 17, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for using determination of base modification in analyzing nucleic acid molecules and acquiring data for analysis of nucleic acid molecules are described herein. Base modifications may include methylations. Methods to determine base modifications may include using features derived from sequencing. These features may include the pulse width of an optical signal from sequencing bases, the interpulse duration of bases, and the identity of the bases. Machine learning models can be trained to detect the base modifications using these features. The relative modification or methylation levels between haplotypes may indicate a disorder. Modification or methylation statuses may also be used to detect chimeric molecules.
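To make the features named in the abstract concrete, the following is a minimal sketch of how per-nucleotide sequencing features might be gathered into a window around a target position. The class names, the `build_window` helper, the `flank` parameter, and the numeric values are all hypothetical illustrations and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SequencedBase:
    """One sequenced nucleotide with its kinetic features (hypothetical type)."""
    identity: str               # 'A', 'C', 'G', or 'T'
    pulse_width: float          # width of the optical pulse for this base
    interpulse_duration: float  # time between this pulse and a neighboring pulse


@dataclass
class WindowEntry:
    """One row of the window: the same features plus a position relative to the target."""
    identity: str
    relative_position: int
    pulse_width: float
    interpulse_duration: float


def build_window(read: List[SequencedBase], target: int, flank: int) -> Optional[List[WindowEntry]]:
    """Collect `flank` bases on each side of `target`; return None if the window runs off the read."""
    start, end = target - flank, target + flank
    if start < 0 or end >= len(read):
        return None
    return [
        WindowEntry(base.identity, index - target, base.pulse_width, base.interpulse_duration)
        for index, base in enumerate(read[start:end + 1], start=start)
    ]


# Toy values chosen only for illustration; they are not measurements.
read = [
    SequencedBase("A", 0.20, 0.5),
    SequencedBase("C", 0.30, 1.1),
    SequencedBase("G", 0.25, 0.4),
    SequencedBase("T", 0.20, 0.6),
]
window = build_window(read, target=1, flank=1)
```

Storing the position relative to the target, rather than the absolute position in the molecule, lets windows taken from different molecules share one layout.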

First claim

Opening claim text (preview).

What is claimed is:

1. A method for detecting a modification of a nucleotide in a nucleic acid molecule, the method comprising:
(a) receiving data acquired by measuring pulses in an optical signal corresponding to nucleotides sequenced in a sample nucleic acid molecule and obtaining, from the data, values for the following properties: for each nucleotide: an identity of the nucleotide, a position of the nucleotide within the sample nucleic acid molecule, a width of the pulse corresponding to the nucleotide, and an interpulse duration representing a time between the pulse corresponding to the nucleotide and a pulse corresponding to a neighboring nucleotide;
(b) creating an input data structure, the input data structure comprising a window of the nucleotides sequenced in the sample nucleic acid molecule, wherein the input data structure includes, for each nucleotide within the window, the properties: the identity of the nucleotide, a position of the nucleotide with respect to a target position within the window, the width of the pulse corresponding to the nucleotide, and the interpulse duration;
(c) inputting the input data structure into a model, the model trained by: receiving a first plurality of first data structures, each first data structure of the first plurality of data structures corresponding to a respective window of nucleotides sequenced in a respective nucleic acid molecule of a plurality of first nucleic acid molecules, wherein each of the first nucleic acid molecules is sequenced by measuring pulses in the optical signal corresponding to the nucleotides, wherein the modification has a known first state in a nucleotide at a target position in each window of each first nucleic acid molecule, each first data structure comprising values for the same properties as the input data structure, storing a plurality of first training samples, each including one of the first plurality of first data structures and a first label indicating the first state of the nucleotide at the target position, and optimizing, using the plurality of first training samples, parameters of the model based on outputs of the model matching or not matching corresponding labels of the first labels when the first plurality of first data structures is input to the model, wherein an output of the model specifies whether the nucleotide at the target position in the respective window has the modification,
(d) determining, using the model, whether the modification is present in a nucleotide at the target position within the window in the input data structure.

2. The method of claim 1, wherein: the input data structure is one input data structure of a plurality of input data structures, the sample nucleic acid molecule is one sample nucleic acid molecule of a plurality of sample nucleic acid molecules, the plurality of sample nucleic acid molecules are obtained from a biological sample of a subject, and each input data structure corresponds to a respective window of nucleotides sequenced in a respective sample nucleic acid molecule of the plurality of sample nucleic acid molecules, and the method further comprising: receiving the plurality of input data structures, inputting the plurality of input data structures into the model, and determining, using the model, whether a modification is present in a nucleotide at a target location in the respective window of each input data structure.

3. The method of claim 2, further comprising: determining the modification is present at one or more nucleotides, and determining a classification of a disorder using the presence of the modification at one or more nucleotides.

4. The method of claim 3, wherein the disorder comprises cancer.

5. The method of claim 4, further comprising: determining that the classification of the disorder is that the subject has the disorder, and treating the subject for the disorder by chemotherapy, radiation, or surgery.

6. The method of claim 3, wherein determining the classification of the disorder uses the number of modifications or the sites of the modifications.

7. The method of claim 2, wherein the modification is a methylation, the method further comprising: determining the modification is present at one or more nucleotides, and determining a clinically-relevant DNA fraction, a fetal methylation profile, a maternal methylation profile, a presence of an imprinting gene region, or a tissue of origin using the presence of the modification at one or more nucleotides.

8. The method of claim 2, wherein each sample nucleic acid molecule of the plurality of sample nucleic acid molecules has a size greater than a cutoff size.

9. The method of claim 2, wherein: the plurality of sample nucleic acid molecules align to a plurality of genomic regions, for each genomic region of the plurality of genomic regions: a number of sample nucleic acid molecules is aligned to the genomic region, the number of sample nucleic acid molecules is greater than a cutoff number.

10. The method of claim 1, further comprising sequencing the sample nucleic acid molecule.

11. The method of claim 1, wherein the model includes a machine learning model, a principal component analysis, a convolutional neural network, or a logistic regression.

12. The method of claim 1, wherein: the window of nucleotides corresponding to the input data structure comprises nucleotides on a first strand of the sample nucleic acid molecule and nucleotides on a second strand of the sample nucleic acid molecule, and the input data structure further comprises for each nucleotide within the window a value of a strand property, the strand property indicating the nucleotide being present on either the first strand or the second strand.

13. The method of claim 12, wherein the sample nucleic acid molecule is a circular DNA molecule formed by: cutting a double-stranded DNA molecule using a Cas9 complex to form a cut double-stranded DNA molecule, and ligating a hairpin adaptor onto an end of the cut double-stranded DNA molecule.

14. The method of claim 1, wherein the nucleotides within the window are determined using a circular consensus sequence and without alignment of the sequenced nucleotides to a reference genome.

15. The method of claim 1, wherein each nucleotide within the window is enriched or filtered.

16. The method of claim 15, wherein each nucleotide within the window is enriched by: cutting a double-stranded DNA molecule using a Cas9 complex to form a cut double-stranded DNA molecule, and ligating a hairpin adaptor onto an end of the cut double-stranded DNA molecule, or filtered by: selecting double-stranded DNA molecules having a size with a size range.

17. The method of claim 1, wherein nucleotides within the window are determined without using a circular consensus sequence and without alignment of the sequenced nucleotides to a reference genome.

18. The method of claim 1, wherein the optical signal is a fluorescence signal from a dye-labeled nucleotide.

19. The method of claim 1, wherein each window associated with the first plurality of data structures comprises 4 consecutive nucleotides on a first strand of each first nucleic acid molecule.
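As a rough illustration of the training described in claim 1(c), the sketch below fits a logistic regression (one of the model types listed in claim 11) to labeled windows using plain stochastic gradient descent. It is a toy under stated assumptions, not the patented implementation: the feature encoding, the function names, the learning rate, and the synthetic training values are all hypothetical, and the synthetic data simply assumes a longer interpulse duration at a modified target base.

```python
import math
import random
from typing import List, Sequence, Tuple

BASES = "ACGT"
# Hypothetical per-nucleotide record: (identity, pulse width, interpulse duration).
Nucleotide = Tuple[str, float, float]


def encode_window(window: Sequence[Nucleotide]) -> List[float]:
    """Flatten a window into one-hot identity + pulse width + interpulse duration per position."""
    features: List[float] = []
    for identity, pulse_width, ipd in window:
        features.extend(1.0 if identity == base else 0.0 for base in BASES)
        features.extend([pulse_width, ipd])
    return features


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Return the model's probability that the target nucleotide carries the modification."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs: int = 200, lr: float = 0.1):
    """Adjust parameters by how far each output is from its label (log-loss gradient)."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            error = predict(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Synthetic training samples (NOT real data): label 1 = modified target, label 0 = unmodified.
rng = random.Random(0)
samples = []
for left in BASES:
    for right in BASES:
        for label, target_ipd in ((0, 0.5), (1, 2.0)):
            window = [
                (left, 0.3, 0.5 + rng.uniform(-0.1, 0.1)),
                ("C", 0.3, target_ipd + rng.uniform(-0.1, 0.1)),
                (right, 0.3, 0.5 + rng.uniform(-0.1, 0.1)),
            ]
            samples.append((encode_window(window), label))

weights, bias = train(samples)

modified_like = encode_window([("A", 0.3, 0.5), ("C", 0.3, 2.0), ("G", 0.3, 0.5)])
unmodified_like = encode_window([("A", 0.3, 0.5), ("C", 0.3, 0.5), ("G", 0.3, 0.5)])
```

Calling `predict(weights, bias, modified_like)` gives the probability for the target position of a new window, which corresponds loosely to step (d) of claim 1.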

Assignees

Univ Hong Kong Chinese

Inventors

Classifications

C12Q2565/601
being a microscope, e.g. atomic force microscopy [AFM] · CPC title
C12Q2537/164
Methylation detection other than bisulfite or methylation sensitive restriction endonucleases · CPC title
G16B40/20
Supervised data analysis · CPC title
G16B40/10
Signal processing, e.g. from mass spectrometry [MS] or from PCR · CPC title
G16B30/00 (Primary)
ICT specially adapted for sequence analysis involving nucleotides or amino acids · CPC title

Patent family

Related publications grouped by family.

View patent family 74567577

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11091794B2 cover? Systems and methods for using determination of base modification in analyzing nucleic acid molecules and acquiring data for analysis of nucleic acid molecules are described herein. Base modifications may include methylations. Methods to determine base modifications may include using features derived from sequencing. These features may include the pulse width of an optical signal from sequencing…
Who is the assignee on this patent? Univ Hong Kong Chinese
What technology area does this patent fall under? Primary CPC classification G16B30/00. Mapped technology areas include Physics.
When was this patent published? Publication date Aug 17, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb? We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).