What technology area does this patent fall under?

Primary CPC classification G11C13/0019. Mapped technology areas include Physics.

When was this patent published?

Publication date Aug 6, 2019 (kind code B1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Portable and low-error DNA-based data storage

US10370246B1 · US · B1

Patent metadata
Field	Value
Publication number	US-10370246-B1
Application number	US-201715789519-A
Country	US
Kind code	B1
Filing date	Oct 20, 2017
Priority date	Oct 20, 2016
Publication date	Aug 6, 2019
Grant date	Aug 6, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure provides DNA-based storage system demonstrated through experimental and theoretical verification that such a platform can easily be implemented in practice using portable, nanopore-based sequencers. The gist of the approach is to design an integrated pipeline that encodes data to avoid synthesis and sequencing errors, enables random access through addressing, and leverages efficient portable nanopore sequencing via new anchored iterative alignment and insertion/deletion error-correcting codes. The embodiments herein represent the only known random access DNA-based data storage system that uses error-prone portable, nanopore-based sequencers and produces low-error readouts with the highest reported information rate and density.
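Random access through addressing only works if addresses remain distinguishable after a noisy read. Claims 10 to 12 on this page constrain the addresses: each is 50% guanine and cytosine, any two are a Hamming distance of at least p/2 apart, and no address reappears as a non-address substring of a block. The following is a minimal sketch of how such constraints could be checked, not the patentees' implementation. The function names, the return format, and the assumption that each block is a string beginning with its address are illustrative only.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_gc_balanced(seq: str) -> bool:
    """True when exactly half of the nucleotides are G or C."""
    return 2 * sum(base in "GC" for base in seq) == len(seq)


def check_addresses(addresses: list[str], blocks: list[str]) -> list[str]:
    """Return a list of constraint violations (empty when all checks pass).

    Assumption: every block is address + data, with the address as its prefix.
    """
    problems = []
    p = len(addresses[0])
    for addr in addresses:
        if len(addr) != p:
            problems.append(f"{addr}: length differs from p={p}")
        elif not is_gc_balanced(addr):
            problems.append(f"{addr}: not 50% G/C")
    for a, b in combinations(addresses, 2):
        # Claim 11: pairwise Hamming distance of at least p/2.
        if len(a) == len(b) == p and 2 * hamming(a, b) < p:
            problems.append(f"{a}/{b}: Hamming distance below p/2")
    for addr in addresses:
        for block in blocks:
            # Claim 10: skip index 0, where a block's own address sits.
            if block.find(addr, 1) != -1:
                problems.append(f"{addr}: appears inside block {block[:p]}...")
    return problems


if __name__ == "__main__":
    addrs = ["ACGTACGT", "TGCATGCA"]
    stored = [a + "AACCGGTT" for a in addrs]
    print(check_addresses(addrs, stored))
```

A real address set would be generated by a constrained code rather than checked after the fact; the checker only makes the three conditions concrete.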

First claim

Opening claim text (preview).

The invention claimed is:
1. A method for enabling portable readout of synthetic nucleotide sequences, the method comprising: reading, from a nanopore-based storage device, a plurality of nucleotide sequence blocks, wherein each of the nucleotide sequence blocks as stored in the nanopore-based storage device contains respective address sequences followed by data sequences, wherein each of the data sequences as stored is identical to that of a target nucleotide data sequence, wherein each of the data sequences contains a series of fixed-length substrings and each of the fixed-length substrings is 50% guanine and cytosine and 50% adenine and thymine, and wherein the reading introduces deletion, insertion, or substitution errors into the nucleotide sequence blocks; selecting, by a computing device, a first group of the nucleotide sequence blocks read from the nanopore-based storage device, each having address sequences without any deletion, insertion, or substitution errors; aligning, by the computing device, the data sequences from the first group of nucleotide sequence blocks with one another; and performing, by the computing device, a first consensus procedure over respective aligned nucleotides of the data sequences from the first group of nucleotide sequence blocks, wherein the first consensus procedure produces a first output nucleotide data sequence.
2. The method of claim 1, wherein the first output nucleotide data sequence matches the target nucleotide data sequence.
3. The method of claim 1, further comprising: determining that at least one fixed-length substring of the first output nucleotide data sequence is not 50% guanine and cytosine and 50% adenine and thymine; in response to determining that at least one fixed-length substring of the first output nucleotide data sequence is not 50% guanine and cytosine and 50% adenine and thymine, selecting a second group of the nucleotide sequence blocks read from the nanopore-based storage device, each having address sequences with exactly one deletion, insertion, or substitution error; aligning the data sequences from the first group of nucleotide sequence blocks and the data sequences from the second group of nucleotide sequence blocks with one another; and performing a second consensus procedure over respective aligned nucleotides of the data sequences from the first group of nucleotide sequence blocks and the data sequences from the second group of nucleotide sequence blocks, wherein the second consensus procedure produces a second output nucleotide data sequence.
4. The method of claim 3, wherein the second output nucleotide data sequence matches the target nucleotide data sequence.
5. The method of claim 1, wherein each of the fixed-length substrings consists of 8 nucleotides.
6. The method of claim 1, wherein each of the fixed-length substrings contains run length values for runs of one or more consecutive nucleotides therein.
7. The method of claim 6, wherein the consensus procedure determines deletion, insertion, or substitution errors in the fixed-length substrings of the data sequences based on inconsistences between a number of consecutive nucleotides and an associated run length value.
8. The method of claim 1, wherein the consensus procedure determines the first output nucleotide data sequence from the data sequences based on a per-nucleotide majority-rule protocol that operates such that the fixed-length substrings of the first output nucleotide data sequence have 50% guanine and cytosine and 50% adenine and thymine.
9. The method of claim 1, wherein each of the address sequences is 8-32 nucleotides in length, and wherein each of the data sequences is 512-2048 nucleotides in length.
10. The method of claim 1, wherein each of the address sequences is p nucleotides in length, and wherein a particular address sequence of the address sequences does not appear as a non-address substring in any of the nucleotide sequence blocks.
11. The method of claim 1, wherein each of the address sequences is p nucleotides in length, and wherein each of the address sequences is a Hamming distance of at least p/2 from one another.
12. The method of claim 1, wherein each of the address sequences is 50% guanine and cytosine and 50% adenine and thymine.
13. A system comprising: a nanopore-based storage device storing a plurality of nucleotide sequence blocks, wherein each of the nucleotide sequence blocks contains respective address sequences followed by data sequences, wherein each of the data sequences as stored is identical to that of a target nucleotide data sequence, wherein each of the data sequences contains a series of fixed-length substrings and each of the fixed-length substrings is 50% guanine and cytosine and 50% adenine and thymine, and wherein reading from the nanopore-based storage device introduces deletion, insertion, or substitution errors into the nucleotide sequence blocks; and a computing device including a memory storing program instructions that, upon execution by a processor, cause the computing device to perform operations comprising: obtaining the plurality of nucleotide sequence blocks read from the nanopore-based storage device; selecting a first group of the nucleotide sequence blocks, each having address sequences without any deletion, insertion, or substitution errors; aligning the data sequences from the first group of nucleotide sequence blocks with one another; and performing a first consensus procedure over respective aligned nucleotides of the data sequences from the first group of nucleotide sequence blocks, wherein the first consensus procedure produces a first output nucleotide data sequence.
14. The system of claim 13, wherein the first output nucleotide data sequence matches the target nucleotide data sequence.
15. The system of claim 13, the operations further comprising: determining that at least one fixed-length substring of the first output nucleotide data sequence is not 50% guanine and cytosine and 50% adenine and thymine; in response to determining that at least one fixed-length substring of the first output nucleotide data sequence is not 50% guanine and cytosine and 50% adenine and thymine, selecting a second group of the nucleotide sequence blocks, each having address sequences with exactly one deletion, insertion, or substitution error; aligning the data sequences from the first group of nucleotide sequence blocks and the data sequences from the second group of nucleotide sequence blocks with one another; and performing a second consensus procedure over respective aligned nucleotides of the data sequences from the first group of nucleotide sequence blocks and the data sequences from the second group of nucleotide sequence blocks, wherein the second consensus procedure produces a second output nucleotide data sequence.
16. The system of claim 15, wherein the second output nucleotide data sequence matches the target nucleotide data sequence.
17. The system of claim 13, wherein each of the fixed-length substrings contains run length values for runs of one or more consecutive nucleotides therein.
18. The system of claim 17, wherein the consensus procedure determines deletion, insertion, or substitution errors in the fixed-length substrings of the data sequences based on inconsistences between a number of consecutive nucleotides and an associated run length value.
19. The system of claim 13, wherein the consensus procedure determines deletion, insertion, or substitution errors in the data sequences based on a per-nucleoti
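The readout flow in claims 1, 3, and 5 can be sketched roughly as follows: take a consensus over reads whose address came through error-free, test every 8-nucleotide substring of the result for 50% G/C content, and only if that test fails add reads whose address carries exactly one edit. This is a simplified illustration, not the patented procedure. It assumes each read has already been split into an (address, data) pair and that the data strings are already aligned to equal length, so the alignment step itself is not shown. It also uses a plain per-position majority vote rather than the balance-aware rule of claim 8. All function names are hypothetical.

```python
from collections import Counter

SUBSTRING_LEN = 8  # claim 5: fixed-length substrings of 8 nucleotides


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def gc_balanced_blocks(seq: str, k: int = SUBSTRING_LEN) -> bool:
    """True when every consecutive k-nt substring is exactly 50% G/C."""
    if len(seq) % k:
        return False
    return all(
        2 * sum(base in "GC" for base in seq[i:i + k]) == k
        for i in range(0, len(seq), k)
    )


def majority_consensus(aligned: list[str]) -> str:
    """Per-position majority vote over equal-length, pre-aligned sequences.

    Ties go to whichever nucleotide was seen first at that position.
    """
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def read_block(reads: list[tuple[str, str]], address: str):
    """reads holds (address_as_read, aligned_data) pairs; returns data or None."""
    first = [data for addr, data in reads if addr == address]
    if not first:
        return None
    output = majority_consensus(first)
    if gc_balanced_blocks(output):
        return output
    # Balance check failed: widen the pool to addresses with exactly one error.
    second = [data for addr, data in reads if edit_distance(addr, address) == 1]
    return majority_consensus(first + second)


if __name__ == "__main__":
    reads = [
        ("ACGTACGT", "AACCAGTT"),  # exact address, one substitution in data
        ("ACGTACG", "AACCGGTT"),   # address with one deletion
        ("ACGTACGA", "AACCGGTT"),  # address with one substitution
    ]
    print(read_block(reads, "ACGTACGT"))
```

The G/C balance acts here as a cheap error detector: a consensus that violates it cannot equal the stored sequence, which is what triggers the second round.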

Assignees

Univ Illinois

Inventors

Classifications

B82Y5/00
Nanobiotechnology or nanomedicine, e.g. protein engineering or drug delivery · CPC title
G06N3/123
DNA computing · CPC title
G16B30/10
Sequence alignment; Homology search · CPC title
G16B50/50
Compression of genetic data · CPC title
G16B50/00
ICT programming tools or database systems specially adapted for bioinformatics · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 67477765

External sources

Related patents

Methods of storing information using nucleic acids

High-throughput sequencing of polynucleotides

Error correction for nucleotide data stores