Linear genome assembly from three dimensional genome structure

US12027236B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12027236-B2
Application number  US-201916247502-A
Country             US
Kind code           B2
Filing date         Jan 14, 2019
Priority date       Jan 14, 2018
Publication date    Jul 2, 2024
Grant date          Jul 2, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided herein includes a method for generating an error-corrected genome assembly for an organism comprising: generating a genomic contact map derived from a DNA proximity ligation assay conducted on one or more samples from the organism or a closely related species; superimposing a reference assembled genome derived from whole genome sequencing of one or more samples from the organism on top of the genomic contact map using computer software; correcting errors in the reference assembled genome through a computer user interface to obtain a corrected assembly file, wherein errors in the reference assembled genome are visualized by observing aberrant contacts in the genomic contact map; and applying the corrected assembly file to the reference assembled genome.
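
As a rough illustration of the first two steps in the abstract, the sketch below bins proximity-ligation contacts into a contact map and computes where each contig starts so that the assembly can be overlaid on the map. The input format (pairs of coordinates on the concatenated assembly), the function names, and the bin size are assumptions made for illustration; they are not taken from the patent.

```python
from collections import Counter


def build_contact_map(read_pairs, bin_size):
    """Count contacts per pair of bins.

    read_pairs: iterable of (pos_a, pos_b) coordinates on the
    concatenated assembly (a hypothetical input format).
    Returns a Counter keyed by (i, j) with i <= j.
    """
    contact_map = Counter()
    for pos_a, pos_b in read_pairs:
        i, j = sorted((pos_a // bin_size, pos_b // bin_size))
        contact_map[(i, j)] += 1
    return contact_map


def contig_boundaries(contig_lengths, bin_size):
    """Return (name, first_bin) for each contig, in assembly order."""
    boundaries = []
    offset = 0
    for name, length in contig_lengths:
        boundaries.append((name, offset // bin_size))
        offset += length
    return boundaries


# Toy data: five contacts on a 400 bp assembly made of two contigs.
pairs = [(10, 40), (120, 130), (90, 310), (305, 390), (20, 80)]
cmap = build_contact_map(pairs, bin_size=100)
bounds = contig_boundaries([("contig_1", 200), ("contig_2", 200)], bin_size=100)

print(cmap[(0, 0)], cmap[(0, 3)])  # contacts within bin 0, and between bins 0 and 3
print(bounds)                      # bin index at which each contig begins
```

Drawing the contig boundaries as grid lines over a heat map of these counts would give the kind of superimposed view the abstract describes; the plotting step is omitted here.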

First claim

Opening claim text (preview).

What is claimed is:

1. A method for generating an error-corrected genome assembly for an organism, comprising:

    a) performing whole genome sequencing on one or more samples from the organism or closely related species, wherein a reference assembled genome is generated from the sequencing contains gaps in adjacent contigs or a scaffold;

    b) performing a DNA proximity ligation assay on one or more samples from the organism or closely related species and generating a genomic contact map derived from the DNA proximity ligation assay conducted on the one or more samples from the organism or a related species;

    c) superimposing the positions of adjacent contigs or scaffolds from the reference assembled genome derived from the whole genome sequencing of one or more samples from the organism on top of the genomic contact map using computer software, wherein computer software comprises using a density graph;

    d) correcting errors in the reference assembled genome through a computer user interface, wherein correcting errors comprises incorporating sequences from the genomic contact map derived from the DNA proximity ligation assay thereby filling the gaps in the reference assembled genome, to obtain a corrected assembly file, wherein errors in the reference assembled genome are visualized by observing aberrant contacts in the genomic contact map; and

    e) generating an error corrected genome assembly data file, wherein the error corrected genome assembly is the final permutated reference assembled genome.

2. The method of claim 1, wherein the DNA proximity ligation assay is Hi-C.

3. The method of claim 1, wherein the reference assembled genome is generated using short-read sequencing technology, long-read sequencing technology, insert clones, linkage mapping data, physical mapping data, optical mapping date, or a combination thereof.

4. The method of claim 1, wherein observing aberrant contacts in the genomic contact map is based, at least in part, on the frequency of contacts between one part of a contig or scaffold and other parts of the same contig or scaffold, or based on the frequency of contact between one part of a contig or scaffold and other contigs and scaffolds, or a combination thereof.

5. The method of claim 4, wherein the aberrant contacts are misjoins, rearrangements, translocations, inversions, insertion, deletions, repeats, alignment errors, due to features of how the genome folds in three dimensions, cyclic permutations of the chromosomes, or a combination thereof.

6. The method of claim 5, wherein the translocations are balanced translocations, unbalanced translocations, or a combination thereof.

7. The method of claim 5, wherein the repeats are tandem repeats.

8. The method of claim 5, wherein a misjoin comprises a point along the diagonal of the contact map, a translocation comprises an extremely bright arrowhead motif pointing towards the diagonal of the contact map, and an inversion comprises two arrowhead motifs pointing at one another.

9. The method of claim 1, wherein the organism is an animal or a plant.
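
Claim 4 ties the detection of aberrant contacts to contact frequency, and step e) of claim 1 ends with a permuted assembly. The sketch below illustrates both ideas under stated assumptions. It is not the claimed method, which relies on a person correcting errors through a user interface. Here a simple automated heuristic stands in for the visual inspection: a position on the diagonal is flagged when few contacts cross it. The function names, the window and threshold values, and the layout format (contig name plus strand) are hypothetical.

```python
def crossing_contacts(matrix, k, window):
    """Sum contacts between the bins just left of position k and the bins from k onward."""
    n = len(matrix)
    left = range(max(0, k - window), k)
    right = range(k, min(n, k + window))
    return sum(matrix[i][j] for i in left for j in right)


def candidate_misjoins(matrix, window, threshold):
    """Bin positions where few contacts span the diagonal (possible misjoins)."""
    return [
        k for k in range(1, len(matrix))
        if crossing_contacts(matrix, k, window) < threshold
    ]


def reverse_complement(seq):
    """Reverse-complement a DNA string over the alphabet A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def apply_layout(contigs, layout):
    """Concatenate contigs in the given order; '-' means reverse-complement.

    layout: list of (contig_name, strand) tuples, a hypothetical stand-in
    for the corrected assembly file.
    """
    parts = []
    for name, strand in layout:
        seq = contigs[name]
        parts.append(seq if strand == "+" else reverse_complement(seq))
    return "".join(parts)


# Toy symmetric contact matrix: bins 0-1 and bins 2-3 form two dense blocks
# with almost no contacts between them.
matrix = [
    [9, 7, 0, 0],
    [7, 9, 1, 0],
    [0, 1, 9, 6],
    [0, 0, 6, 9],
]
flagged = candidate_misjoins(matrix, window=2, threshold=3)

# Reorder two toy contigs and flip the first one placed.
corrected = apply_layout(
    {"a": "AACG", "b": "TTGC"},
    [("b", "-"), ("a", "+")],
)
print(flagged, corrected)
```

In this toy matrix only the boundary between the two blocks is flagged. A real contact map would need normalization and a threshold chosen from the data before a heuristic like this could be trusted.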

Assignees

  • Broad Inst Inc

  • Baylor College Medicine

Inventors

Classifications

  • ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

  • ICT programming tools or database systems specially adapted for bioinformatics

  • ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks

  • Sequence assembly

  • G16B30/10 (Primary): Sequence alignment; Homology search

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12027236B2 cover?
Provided herein includes a method for generating an error-corrected genome assembly for an organism comprising: generating a genomic contact map derived from a DNA proximity ligation assay conducted on one or more samples from the organism or a closely related species; superimposing a reference assembled genome derived from whole genome sequencing of one or more samples from the organism on top…
Who is the assignee on this patent?
Broad Inst Inc, Baylor College Medicine
What technology area does this patent fall under?
Primary CPC classification G16B30/10. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 2, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).