Efficient assembly of oligonucleotides for nucleic acid based data storage

US10956806B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10956806-B2
Application number: US-201916436043-A
Country: US
Kind code: B2
Filing date: Jun 10, 2019
Priority date: Jun 10, 2019
Publication date: Mar 23, 2021
Grant date: Mar 23, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method for efficient assembly of oligonucleotides for nucleic acid based data storage includes receiving encoded data including binary data encoded into nucleic acid sequence data, and assembling a target nucleic acid data strand based on the encoded data by concatenating one or more selected codeword oligonucleotides obtained from a codeword stack strand.
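The two steps named in the abstract (receive binary data encoded as nucleobases, then assemble a strand by concatenating selected codewords) can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the bit-pair assignment (00→A, 01→C, 10→G, 11→T), the fixed codeword length of 4 bases, the modelling of the "codeword stack" as a plain set of strings, and the function names. Error correction is left out entirely.

```python
from itertools import product

# Hypothetical bit-pair-to-base table; the assignment is an assumption.
BIT_PAIR_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BIT_PAIR = {base: bits for bits, base in BIT_PAIR_TO_BASE.items()}


def encode_bytes(data: bytes) -> str:
    """Map each pair of bits to one nucleobase (4 bases per byte)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BIT_PAIR_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_bases(sequence: str) -> bytes:
    """Inverse of encode_bytes: 4 bases back to one byte."""
    bits = "".join(BASE_TO_BIT_PAIR[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def build_codeword_stack(codeword_len: int) -> set[str]:
    """Toy stand-in for a codeword stack: every possible codeword of this length."""
    return {"".join(p) for p in product("ACGT", repeat=codeword_len)}


def assemble_target(encoded: str, stack: set[str], codeword_len: int = 4) -> str:
    """Split the encoded data into codewords, select each from the stack, concatenate."""
    if len(encoded) % codeword_len:
        raise ValueError("encoded length must be a multiple of the codeword length")
    selected = []
    for i in range(0, len(encoded), codeword_len):
        codeword = encoded[i:i + codeword_len]
        if codeword not in stack:
            raise KeyError(f"codeword {codeword!r} is not in the stack")
        selected.append(codeword)
    return "".join(selected)


encoded = encode_bytes(b"Hi")
print(encoded)                                            # CAGACGGC
print(assemble_target(encoded, build_codeword_stack(4)))  # CAGACGGC
print(decode_bases(encoded))                              # b'Hi'
```

In this model the assembled strand is identical to the encoded input; the point is only that a long strand is built from a small, reusable set of pre-made codewords rather than synthesized base by base.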

First claim

Opening claim text (preview).

What is claimed is:

1. A system for efficient assembly of oligonucleotides for nucleic acid based data storage, comprising: a memory device for storing program code; and at least one processor device operatively coupled to the memory device and configured to execute program code stored on the memory device to: receive encoded data including binary data encoded into nucleic acid sequence data; and assemble a target nucleic acid data strand based on the encoded data by concatenating one or more selected codeword oligonucleotides obtained from a codeword stack strand.

2. The system of claim 1, wherein the binary data is encoded into nucleic acid sequence data using error correction mapping bit pairs to respective nucleobases.

3. The system of claim 1, wherein the at least one processor device is further configured to execute program code stored on the memory device to obtain the codeword stack strand.

4. The system of claim 1, wherein each codeword oligonucleotide of the codeword stack strand includes a payload sequence corresponding to a codeword sandwiched between a pair of primers including a first primer and a second primer and a pair of joint sites.

5. The system of claim 4, wherein the pair of primers includes orthogonal primers.

6. The system of claim 4, wherein the at least one processor device is further configured to assemble the target nucleic acid data strand by: selecting a first codeword oligonucleotide for amplification from the codeword stack strand; amplifying the first codeword oligonucleotide to generate a set of first codeword oligonucleotides; cleaving the first primer from each first codeword oligonucleotide of the set of first codeword oligonucleotides; forming a first target sequence based on the set of first codeword oligonucleotides by concatenating a header oligonucleotide with each first codeword oligonucleotide of the set of first codeword oligonucleotides; amplifying the first target sequence to generate a set of first target sequences; and cleaving the second primer from each first target sequence of the set of target sequences to generate a first subsequence.

7. The system of claim 6, wherein the at least one processor device is further configured to determine that the target nucleic acid data strand has yet to be assembled based on the first subsequence, and select a second codeword oligonucleotide from the codeword stack strand for amplification to assemble the target nucleic acid data strand by concatenating the second codeword oligonucleotide with the first subsequence.

8. A method for efficient assembly of oligonucleotides for nucleic acid based data storage, comprising: receiving encoded data including binary data encoded into nucleic acid sequence data; and assembling a target nucleic acid data strand based on the encoded data by concatenating one or more selected codeword oligonucleotides obtained from a codeword stack strand.

9. The method of claim 8, wherein the binary data is encoded into nucleic acid sequence data using error correction mapping bit pairs to respective nucleobases.

10. The method of claim 8, further comprising obtaining the codeword stack strand.

11. The method of claim 8, wherein each codeword oligonucleotide of the codeword stack strand includes a payload sequence corresponding to a codeword sandwiched between a pair of primers including a first primer and a second primer and a pair of joint sites.

12. The method of claim 11, wherein the pair of primers includes orthogonal primers.

13. The method of claim 11, wherein assembling the target nucleic acid data strand further includes: selecting a first codeword oligonucleotide for amplification from the codeword stack strand; amplifying the first codeword oligonucleotide to generate a set of first codeword oligonucleotides; cleaving the first primer from each first codeword oligonucleotide of the set of first codeword oligonucleotides; forming a first target sequence based on the set of first codeword oligonucleotides by concatenating a header oligonucleotide with each first codeword oligonucleotide of the set of first codeword oligonucleotides; amplifying the first target sequence to generate a set of first target sequences; and cleaving the second primer from each first target sequence of the set of target sequences to generate a first subsequence.

14. The method of claim 13, further comprising determining that the target nucleic acid data strand has yet to be assembled based on the first subsequence, and selecting a second codeword oligonucleotide from the codeword stack strand for amplification to assemble the target nucleic acid data strand by concatenating the second codeword oligonucleotide with the first subsequence.

15. A computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform a method for efficient assembly of oligonucleotides for nucleic acid based data storage, the method performed by the computer comprising: receiving encoded data including binary data encoded into nucleic acid sequence data; and assembling a target nucleic acid data strand based on the encoded data by concatenating one or more selected codeword oligonucleotides obtained from a codeword stack strand.

16. The computer program product of claim 15, wherein the binary data is encoded into nucleic acid sequence data using error correction mapping bit pairs to respective nucleobases.

17. The computer program product of claim 15, wherein the method further includes obtaining the codeword stack strand.

18. The computer program product of claim 15, wherein each codeword oligonucleotide of the codeword stack strand includes a payload sequence corresponding to a codeword sandwiched between a pair of primers including a first primer and a second primer and a pair of joint sites.

19. The computer program product of claim 18, wherein the pair of primers includes orthogonal primers.

20. The computer program product of claim 18, wherein assembling the target nucleic acid data strand further includes: selecting a first codeword oligonucleotide for amplification from the codeword stack strand; amplifying the first codeword oligonucleotide to generate a set of first codeword oligonucleotides; cleaving the first primer from each first codeword oligonucleotide of the set of first codeword oligonucleotides; forming a first target sequence based on the set of first codeword oligonucleotides by concatenating a header oligonucleotide with each first codeword oligonucleotide of the set of first codeword oligonucleotides; amplifying the first target sequence to generate a set of first target sequences; and cleaving the second primer from each first target sequence of the set of target sequences to generate a first subsequence; determining that the target nucleic acid data strand has yet to be assembled based on the first subsequence; and selecting a second codeword oligonucleotide from the codeword stack strand for amplification to assemble the target nucleic acid data strand by concatenating the second codeword oligonucleotide with the first subsequence.
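
The iterative assembly recited in the dependent claims (select, amplify, cleave the first primer, concatenate, amplify, cleave the second primer, repeat until the strand is complete) can be walked through as a string-level simulation. This is a rough sketch under stated assumptions, not the patented procedure: the `CodewordOligo` layout (first primer, payload, second primer), the omission of joint sites, the modelling of amplification as making identical copies, and all names and sequences are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodewordOligo:
    """Hypothetical codeword oligo: payload between two primers (joint sites omitted)."""
    first_primer: str
    payload: str
    second_primer: str

    def strand(self) -> str:
        return self.first_primer + self.payload + self.second_primer


def amplify(strand: str, copies: int = 3) -> list[str]:
    """Toy amplification: return identical copies of the strand."""
    return [strand] * copies


def cleave_prefix(strand: str, primer: str) -> str:
    """Remove a primer from the start of a strand."""
    if not strand.startswith(primer):
        raise ValueError("strand does not start with the expected primer")
    return strand[len(primer):]


def cleave_suffix(strand: str, primer: str) -> str:
    """Remove a primer from the end of a strand."""
    if not strand.endswith(primer):
        raise ValueError("strand does not end with the expected primer")
    return strand[:len(strand) - len(primer)]


def assemble(stack: dict[str, CodewordOligo], codewords: list[str], header: str) -> str:
    """Grow the target strand one codeword at a time, starting from the header."""
    growing = header
    for codeword in codewords:  # loop continues while the target is incomplete
        oligo = stack[codeword]                                # select from the stack
        copies = amplify(oligo.strand())                       # amplify the selection
        trimmed = [cleave_prefix(c, oligo.first_primer) for c in copies]
        targets = [growing + t for t in trimmed]               # concatenate
        amplified = amplify(targets[0])                        # amplify the target
        subsequences = [cleave_suffix(t, oligo.second_primer) for t in amplified]
        growing = subsequences[0]                              # next round extends this
    return growing


stack = {
    "AC": CodewordOligo("GGG", "AC", "TTT"),
    "GT": CodewordOligo("GGG", "GT", "TTT"),
}
print(assemble(stack, ["AC", "GT"], header="CCAA"))  # CCAAACGT
```

In the first round the header plays the role of the existing strand; in later rounds the previous subsequence does, which mirrors how claims 7, 14, and 20 extend the first subsequence with a second codeword.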

Assignees

IBM

Inventors

Classifications

  • Sequence assembly · CPC title

  • C07H21/04 · Primary

    with deoxyribosyl as saccharide radical · CPC title

  • using modified primers or templates · CPC title

  • G06N3/002 · Primary

    Biomolecular computers, i.e. using biomolecules, proteins, cells (using DNA G06N3/123; using neurons G06N3/061) · CPC title

  • involving nucleic acids · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10956806B2 cover?
A computer-implemented method for efficient assembly of oligonucleotides for nucleic acid based data storage includes receiving encoded data including binary data encoded into nucleic acid sequence data, and assembling a target nucleic acid data strand based on the encoded data by concatenating one or more selected codeword oligonucleotides obtained from a codeword stack strand.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification C07H21/04. Mapped technology areas include Chemistry & Metallurgy.
When was this patent published?
Publication date: Mar 23, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).