Error correction systems and methods for DNA storage

US12373283B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12373283-B2
Application number  US-202318356104-A
Country             US
Kind code           B2
Filing date         Jul 20, 2023
Priority date       Dec 7, 2022
Publication date    Jul 29, 2025
Grant date          Jul 29, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A DNA-based storage system includes an error correction system operable to: (a) identify a DNA codeword from a DNA sequencing operation; (b) calculate an initial syndrome weight; (c) determine that the initial syndrome weight is greater than a predetermined threshold; (d) perform an alignment alteration in the information segment by: (i) selecting a skew point within the information segment; (ii) performing an indel operation on the information segment at the skew point; (iii) calculating a modified syndrome weight; (iv) comparing the initial syndrome weight with the modified syndrome weight; and (v) incorporating the indel operation into the information segment when the comparing indicates an improvement in the modified syndrome weight; (e) decode the modified codeword; and (f) transmit the contents of the output file to a computing device, the output file representing user data stored within the DNA molecule.
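The loop in steps (b) through (d) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the error correction data here is a toy scheme (the sum modulo 4 of each 3-base window), the syndrome weight is simply the number of failed checks, and the search tries every skew point exhaustively. The names `encode_checks`, `syndrome_weight`, `candidate_indels`, `align`, `WINDOW`, and `max_rounds` are all hypothetical.

```python
BASES = "ACGT"
VALUE = {base: i for i, base in enumerate(BASES)}
WINDOW = 3  # assumption: each toy check covers 3 consecutive bases


def encode_checks(info):
    """Toy error correction data: sum (mod 4) of each 3-base window."""
    return [sum(VALUE[b] for b in info[j:j + WINDOW]) % 4
            for j in range(len(info) - WINDOW + 1)]


def syndrome_weight(info, checks):
    """Count the checks that the information segment fails to satisfy."""
    weight = 0
    for j, expected in enumerate(checks):
        window = info[j:j + WINDOW]
        if len(window) < WINDOW or sum(VALUE[b] for b in window) % 4 != expected:
            weight += 1
    return weight


def candidate_indels(info):
    """Yield every single-base insertion and deletion at every skew point."""
    for skew in range(len(info) + 1):
        for base in BASES:
            yield info[:skew] + base + info[skew:]  # insertion shifts the tail right
    for skew in range(len(info)):
        yield info[:skew] + info[skew + 1:]         # deletion shifts the tail left


def align(info, checks, threshold, max_rounds=8):
    """Keep an indel only when it lowers the syndrome weight."""
    weight = syndrome_weight(info, checks)
    rounds = 0
    while weight > threshold and rounds < max_rounds:
        best = min(candidate_indels(info), key=lambda c: syndrome_weight(c, checks))
        best_weight = syndrome_weight(best, checks)
        if best_weight >= weight:
            break  # no improvement, so the indel is not incorporated
        info, weight = best, best_weight
        rounds += 1
    return info, weight


original = "ACGTTGCAAGCT"
checks = encode_checks(original)
received = original[:5] + original[6:]  # simulate one deleted base
initial_weight = syndrome_weight(received, checks)
repaired, final_weight = align(received, checks, threshold=2)
```

In this toy run a single deletion misaligns every window after the skew point, so the initial weight is well above the threshold, and re-inserting one base restores all checks. A real decoder would use the code's actual parity checks and a far cheaper search than trying every position.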

First claim

Opening claim text (preview).

What is claimed is:

1. A deoxyribonucleic acid (DNA)-based storage system, comprising: an error correction system operable to: identify a DNA codeword, the DNA codeword including an information segment and an error correction segment; calculate an initial syndrome weight for the information segment based, at least in part, on error correction data included in the error correction segment; determine whether the initial syndrome weight is greater than a predetermined threshold; and perform an alignment alteration in the information segment based, at least in part, on determining that the initial syndrome weight is greater than the predetermined threshold, wherein performing the alignment alteration corrects one or more insertion/deletion (indel) errors in the information segment.

2. The DNA-based storage system of claim 1, wherein performing the alignment alteration comprises: selecting a skew point within the information segment; performing an indel operation on the information segment at the skew point to generate a modified information segment; calculating a modified syndrome weight of the modified information segment based, at least in part, on the error correction data; comparing the initial syndrome weight with the modified syndrome weight; and generating a modified codeword by incorporating the indel operation into the information segment based, at least in part, on the comparison between the initial syndrome weight and the modified syndrome weight.

3. The DNA-based storage system of claim 1, wherein the indel operation is a deletion operation and wherein performing the indel operation further comprises: deleting, from the information segment, an existing DNA base adjacent to the skew point; and shifting all DNA bases occurring after the skew point within the information segment by at least one base position.

4. The DNA-based storage system of claim 1, wherein the error correction system is further operable to: determine whether there is known shaping data associated with the DNA codeword; and based, at least in part, on determining that there is known shaping data associated with the DNA codeword: select a first tuple of DNA bases from the information segment, the first tuple of DNA bases occupying a first region of the information segment; compare the first tuple of DNA bases to a set of valid tuples, thereby identifying that the first tuple is invalid; and perform one or more indel operations at one or more skew points within the first region.

5. The DNA-based storage system of claim 4, wherein the performed one or more indel operations is a first indel operation and wherein the error correction system is further operable to incorporate a second indel operation of the one or more indel operations into the information segment based, at least in part, on determining that the second indel operation results in an improved syndrome weight for the information segment.

6. The DNA-based storage system of claim 1, wherein the error correction system is further operable to: scan the information segment for an occurrence of one or more error patterns, each error pattern including an ordered sequence of DNA bases; identify, within the information segment, a first occurrence of a first error pattern based, at least in part, on the scanning, the first occurrence occupying a first region of the information segment; and perform one or more indel operations at one or more skew points within the first region.

7. The DNA-based storage system of claim 1, wherein the error correction system is further operable to: identify an expected length of the information segment; determine a current length of the information segment from the codeword; compute a difference between the expected length and the current length; and perform a number of insertion operations and a number of deletion operations based, at least in part, on the difference, where the performance of the number of insertion operations and the number of deletion operations results in a modified information segment that has a length equal to the expected length.

8. The DNA-based storage system of claim 1, wherein the indel operation is an insertion operation and wherein performing the indel operation further comprises: shifting all DNA bases occurring after the skew point within the information segment by the at least one base position; and inserting a new DNA base at the skew point.

9. The DNA-based storage system of claim 2, wherein the error correction system is further operable to: decode the modified codeword to generate at least a portion of an output file, the portion of the output file including data decoded from the modified information segment; and transmit contents of the output file to a computing device, the output file representing user data associated with the DNA codeword.

10. A method for performing error correction on a deoxyribonucleic acid (DNA) codeword generated through sequencing a DNA molecule, the method comprising: identifying a DNA codeword, the DNA codeword including at least an information segment and an associated error correction segment; calculating an initial syndrome weight for the information segment based, at least in part, on error correction data associated with the error correction segment; determining whether the initial syndrome weight is greater than a predetermined threshold; and performing an alignment alteration in the information segment based, at least in part, on determining that the initial syndrome weight is greater than the predetermined threshold, wherein performing the alignment alteration corrects one or more insertion/deletion (indel) errors in the information segment.

11. The method of claim 8, further comprising: determining whether there is known shaping data associated with the DNA codeword; and based, at least in part, on determining that there is known shaping data associated with the DNA codeword: selecting a first tuple of DNA bases from the information segment, the first tuple of DNA bases occupying a first region of the information segment; comparing the first tuple of DNA bases to a set of valid tuples, thereby identifying that the first tuple is invalid; and performing one or more indel operations at one or more skew points within the first region.

12. The method of claim 11, wherein the performed one or more indel operation is a first indel operation and wherein the method further comprises incorporating a second indel operation of the one or more indel operations into the information segment responsive to the second indel operation resulting in an improved syndrome weight for the information segment.

13. The method of claim 8, further comprising: scanning the information segment for an occurrence of one or more error patterns, each error pattern including an ordered sequence of DNA bases; identifying, within the information segment, a first occurrence of a first error pattern based, at least in part, on the scanning, the first occurrence occupying a first region of the information segment; and performing one or more indel operations at one or more skew points within the first region.

14. The method of claim 8, further comprising: identifying an expected length of the information segment; determining a current length of the information segment from the codeword; computing a difference between the expected length and the current length; and performing a number of insertion operations and a number of deletion operations based, at least in part, on the difference, where the performance results in a modified information segment that has a length equa
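The dependent claims describe three ways to narrow down where and how many indel operations to try: shaping data (valid tuples), known error patterns, and the gap between expected and current length. The Python sketch below illustrates each idea in isolation. It is a hedged example, not the claimed system: the function names, the `extra_pairs` parameter, the shaping rule in `VALID_PAIRS`, and the sample sequences are all assumptions made for illustration.

```python
from itertools import product

BASES = "ACGT"


def planned_indel_counts(expected_length, current_length, extra_pairs=0):
    """Return (insertions, deletions) that bring the segment to expected_length.

    extra_pairs is a hypothetical knob: each extra insertion/deletion pair
    leaves the length unchanged but lets a search realign a shifted region.
    """
    difference = expected_length - current_length
    insertions = max(difference, 0) + extra_pairs
    deletions = max(-difference, 0) + extra_pairs
    return insertions, deletions


def pattern_regions(info, patterns):
    """Return (start, end) regions where a listed error pattern occurs."""
    regions = []
    for pattern in patterns:
        start = info.find(pattern)
        while start != -1:
            regions.append((start, start + len(pattern)))
            start = info.find(pattern, start + 1)
    return sorted(regions)


def invalid_tuple_regions(info, valid_tuples, size):
    """Return (start, end) regions whose tuple is absent from the valid set."""
    regions = []
    for start in range(0, len(info) - size + 1, size):
        if info[start:start + size] not in valid_tuples:
            regions.append((start, start + size))
    return regions


# Hypothetical shaping rule: a 2-base tuple never repeats the same base.
VALID_PAIRS = {a + b for a, b in product(BASES, repeat=2) if a != b}

counts = planned_indel_counts(expected_length=12, current_length=11)
regions_from_patterns = pattern_regions("ACAAAAGT", ["AAAA"])
regions_from_shaping = invalid_tuple_regions("ACGGTA", VALID_PAIRS, size=2)
```

Each helper returns candidate regions or counts only. Under these assumptions, a decoder would then try skew points inside those regions and keep an indel only if the syndrome weight improves.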

Assignees

Western Digital Tech Inc

Inventors

Classifications

  • Data warehousing; Computing architectures

  • G16B30/00 (Primary): ICT specially adapted for sequence analysis involving nucleotides or amino acids

  • Sequence alignment; Homology search

  • to protect a block of data words, e.g. CRC or checksum (G06F11/1076 takes precedence; security arrangements for protecting computers or computer systems against unauthorized activity G06F21/00)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12373283B2 cover?
A DNA-based storage system whose error correction system identifies a DNA codeword from a DNA sequencing operation, calculates an initial syndrome weight and, when that weight exceeds a predetermined threshold, performs an alignment alteration in the information segment (indel operations at selected skew points, kept when the modified syndrome weight improves). The system then decodes the modified codeword and transmits the output file to a computing device.
Who is the assignee on this patent?
Western Digital Tech Inc
What technology area does this patent fall under?
Primary CPC classification G16B30/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 29, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).