Data deduplication with adaptive erasure code redundancy
US-2017257119-A1 · Sep 7, 2017 · US
US11599412B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11599412-B1 |
| Application number | US-202117475522-A |
| Country | US |
| Kind code | B1 |
| Filing date | Sep 15, 2021 |
| Priority date | Sep 15, 2021 |
| Publication date | Mar 7, 2023 |
| Grant date | Mar 7, 2023 |
Official abstract text for this publication.
Systems, methods, and computer-readable media are provided for utilizing distributed erasure encoding in a redundant array of independent disks (RAID) system. An example method can include generating a plurality of virtual redundant array of independent disk (vRAID) stripes, each of the plurality of vRAID stripes including a segment having a plurality of data, each of the plurality of data including metadata, the metadata including a checksum of a corresponding data of the plurality of data, distributing the segment of each of the plurality of vRAID stripes over a plurality of virtual nodes, mapping at least one of logical files, volumes, or objects to the plurality of data chunks and the at least one parity chunk of the plurality of vRAID stripes to avoid write-hole issues, and verifying data integrity of the corresponding data of the plurality of data using the checksum of the corresponding data.
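The abstract's steps (build a stripe of data chunks that carry their own checksums, add parity, spread the chunks over virtual nodes, verify on read) can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything specific here is an assumption made for illustration: CRC32 as the checksum, a single XOR parity chunk standing in for the erasure code, zero-padding to equal chunk length, round-robin placement, and all names (`Chunk`, `build_stripe`, `distribute`, `verify`, `rebuild`).

```python
import zlib
from dataclasses import dataclass
from functools import reduce


@dataclass
class Chunk:
    payload: bytes
    checksum: int           # assumed: CRC32 of payload, kept as per-chunk metadata
    is_parity: bool = False


def make_chunk(payload: bytes, is_parity: bool = False) -> Chunk:
    return Chunk(payload, zlib.crc32(payload), is_parity)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def build_stripe(blocks: list[bytes]) -> list[Chunk]:
    """Pad blocks to equal length, then append one XOR parity chunk."""
    width = max(len(b) for b in blocks)
    padded = [b.ljust(width, b"\0") for b in blocks]
    parity = reduce(xor_bytes, padded)
    return [make_chunk(p) for p in padded] + [make_chunk(parity, True)]


def distribute(stripe: list[Chunk], num_vnodes: int) -> dict[int, list[Chunk]]:
    """Assumed placement policy: round-robin over virtual node ids."""
    placement: dict[int, list[Chunk]] = {n: [] for n in range(num_vnodes)}
    for i, chunk in enumerate(stripe):
        placement[i % num_vnodes].append(chunk)
    return placement


def verify(chunk: Chunk) -> bool:
    """Recompute the checksum and compare it with the stored metadata."""
    return zlib.crc32(chunk.payload) == chunk.checksum


def rebuild(stripe: list[Chunk], lost: int) -> bytes:
    """Recover one lost chunk by XOR-ing all surviving chunks."""
    survivors = [c.payload for i, c in enumerate(stripe) if i != lost]
    return reduce(xor_bytes, survivors)
```

With one parity chunk this sketch tolerates the loss of a single chunk per stripe; a real erasure code with more parity chunks would tolerate more. Placing each chunk of a stripe on a different virtual node is what lets a single node failure be repaired from the rest.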
Opening claim text (preview).
What is claimed is:
1. A method comprising: generating a plurality of virtual redundant array of independent disk (vRAID) stripes, each of the plurality of vRAID stripes including a segment having a plurality of data, each of the plurality of data including metadata, the metadata including a checksum of a corresponding data of the plurality of data; distributing the segment of each of the plurality of vRAID stripes over a plurality of virtual nodes, the plurality of virtual nodes including a plurality of data chunks and at least one parity chunk; mapping at least one of logical files, volumes, or objects to the plurality of data chunks and the at least one parity chunk of the plurality of vRAID stripes to avoid write-hole issues; and verifying data integrity of the corresponding data of the plurality of data using the checksum of the corresponding data.
2. The method of claim 1, wherein the plurality of data includes uncompressed and compressed variable-sized data blocks and is generated by destaging and deduplicating compressed or uncompressed data in write log.
3. The method of claim 1, wherein each of the plurality of virtual nodes are allocated to different storage devices in a cluster.
4. The method of claim 1, wherein each of the plurality of data chunks and the at least one parity chunk is partitioned into a plurality of virtual storage containers.
5. The method of claim 1, wherein the mapping of the at least one of logical files, volumes, or objects includes maintaining a tree that maps the at least one of logical files, volumes, or objects to the plurality of data chunks and the at least one parity chunk of the plurality of vRAID stripes.
6. The method of claim 5, wherein each of the logical files mapped to the tree includes a virtual block address having a virtual node number, a chunk number, a segment number, and an offset number for a corresponding logical file of the logical files.
7. The method of claim 6, further comprising: utilizing the virtual block address of the corresponding logical file of the logical files to locate the corresponding logical file in the plurality of virtual nodes; and matching the corresponding logical file with a corresponding checksum in a header of the corresponding logical file to ensure data integrity.
8. A system comprising: one or more processors; and at least one computer-readable storage medium having stored therein instructions which, when executed by the one or more processors, cause the system to: generate a plurality of virtual redundant array of independent disk (vRAID) stripes, each of the plurality of vRAID stripes including a segment having a plurality of data, each of the plurality of data including metadata, the metadata including a checksum of a corresponding data of the plurality of data; distribute the segment of each of the plurality of vRAID stripes over a plurality of virtual nodes, the plurality of virtual nodes including a plurality of data chunks and at least one parity chunk; map at least one of logical files, volumes, or objects to the plurality of data chunks and the at least one parity chunk of the plurality of vRAID stripes to avoid write-hole issues; and verify data integrity of the corresponding data of the plurality of data using the checksum of the corresponding data.
9. The system of claim 8, wherein the plurality of data includes uncompressed and compressed variable-sized data blocks and is generated by destaging and deduplicating compressed or uncompressed data in write log.
10. The system of claim 8, wherein each of the plurality of virtual nodes are allocated to different storage devices in a cluster.
11. The system of claim 8, wherein each of the plurality of data chunks and the at least one parity chunk is partitioned into a plurality of virtual storage containers.
12. The system of claim 8, wherein the map of the at least one of logical files, volumes, or objects includes maintaining a tree that maps the at least one of logical files, volumes, or objects to the plurality of data chunks and the at least one parity chunk of the plurality of vRAID stripes.
13. The system of claim 12, wherein each of the logical files mapped to the tree includes a virtual block address having a virtual node number, a chunk number, a segment number, and an offset number for a corresponding logical file of the logical files.
14. The system of claim 13, wherein the instructions which, when executed by the one or more processors, cause the system to: utilize the virtual block address of the corresponding logical file of the logical files to locate the corresponding logical file in the plurality of virtual nodes; and match the corresponding logical file with a corresponding checksum in a header of the corresponding logical file to ensure data integrity.
15. A non-transitory computer-readable storage medium comprising: instructions stored on the non-transitory computer-readable storage medium, the instructions, when executed by one or more processors, cause the one or more processors to: generate a plurality of virtual redundant array of independent disk (vRAID) stripes, each of the plurality of vRAID stripes including a segment having a plurality of data, each of the plurality of data including metadata, the metadata including a checksum of a corresponding data of the plurality of data; distribute the segment of each of the plurality of vRAID stripes over a plurality of virtual nodes, the plurality of virtual nodes including a plurality of data chunks and at least one parity chunk; map at least one of logical files, volumes, or objects to the plurality of data chunks and the at least one parity chunk of the plurality of vRAID stripes to avoid write-hole issues; and verify data integrity of the corresponding data of the plurality of data using the checksum of the corresponding data.
16. The non-transitory computer-readable storage medium of claim 15, wherein each of the plurality of virtual nodes are allocated to different storage devices in a cluster.
17. The non-transitory computer-readable storage medium of claim 15, wherein each of the plurality of data chunks and the at least one parity chunk is partitioned into a plurality of virtual storage containers.
18. The non-transitory computer-readable storage medium of claim 15, wherein the map of the at least one of logical files, volumes, or objects includes maintaining a tree that maps the at least one of logical files, volumes, or objects to the plurality of data chunks and the at least one parity chunk of the plurality of vRAID stripes.
19. The non-transitory computer-readable storage medium of claim 18, wherein each of the logical files mapped to the tree includes a virtual block address having a virtual node number, a chunk number, a segment number, and an offset number for a corresponding logical file of the logical files.
20. The non-transitory computer-readable storage medium of claim 19, wherein the instructions, when executed by the one or more processors, cause the one or more processors to: utilize the virtual block address of the corresponding logical file of the logical files to locate the corresponding logical file in the plurality of virtual nodes; and match the corresponding logical file with a corresponding checksum in a header of the corresponding logical file to ensure data integrity.
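Claims 5 to 7 (and their system and medium counterparts) describe a virtual block address made of a virtual node number, a chunk number, a segment number, and an offset, which is used to locate stored data and match it against a checksum held in a header. The Python sketch below shows one way such an address and lookup could work. It is an illustration only: the 16-bit field widths, the 8-byte header layout (CRC32 plus payload length), the in-memory `store` dictionary, and every function name are hypothetical, and a plain dictionary stands in for the tree that the claims recite.

```python
import struct
import zlib
from typing import NamedTuple


class VirtualBlockAddress(NamedTuple):
    vnode: int     # virtual node number
    chunk: int     # chunk number within the virtual node
    segment: int   # segment number within the chunk
    offset: int    # byte offset within the segment


# Hypothetical field widths in bits; the claims do not specify any.
_WIDTHS = (16, 16, 16, 16)
# Hypothetical record header: CRC32 of the payload, then payload length.
_HEADER = struct.Struct(">II")


def pack(vba: VirtualBlockAddress) -> int:
    """Pack the four address fields into one integer."""
    value = 0
    for field, width in zip(vba, _WIDTHS):
        if not 0 <= field < (1 << width):
            raise ValueError("address field out of range")
        value = (value << width) | field
    return value


def unpack(value: int) -> VirtualBlockAddress:
    """Inverse of pack()."""
    fields = []
    for width in reversed(_WIDTHS):
        fields.append(value & ((1 << width) - 1))
        value >>= width
    return VirtualBlockAddress(*reversed(fields))


def append_record(store, vnode, chunk, segment, payload) -> VirtualBlockAddress:
    """Append header + payload to a segment and return where it landed."""
    buf = store.setdefault((vnode, chunk, segment), bytearray())
    vba = VirtualBlockAddress(vnode, chunk, segment, len(buf))
    buf += _HEADER.pack(zlib.crc32(payload), len(payload)) + payload
    return vba


def read_record(store, vba: VirtualBlockAddress) -> bytes:
    """Locate a record by its address and check it against its header."""
    buf = store[(vba.vnode, vba.chunk, vba.segment)]
    checksum, length = _HEADER.unpack_from(buf, vba.offset)
    start = vba.offset + _HEADER.size
    payload = bytes(buf[start:start + length])
    if zlib.crc32(payload) != checksum:
        raise IOError("checksum mismatch at %r" % (vba,))
    return payload


# Usage: a dict from logical file name to address stands in for the tree.
store, index = {}, {}
index["a.txt"] = append_record(store, 2, 0, 7, b"hello")
index["b.txt"] = append_record(store, 2, 0, 7, b"world!")
```

Because records are only ever appended and the index is updated after the write completes, an interrupted write in this sketch leaves existing records and their addresses untouched. That is one common way to sidestep write-hole problems; whether the patent relies on the same mechanism is not stated in the claims.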
Disk arrays, e.g. RAID, JBOD · CPC title
at area level, e.g. provisioning of virtual or logical volumes · CPC title
Replication mechanisms · CPC title
in relation to data integrity, e.g. data losses, bit errors · CPC title
to protect a block of data words, e.g. CRC or checksum (G06F11/1076 takes precedence; security arrangements for protecting computers or computer systems against unauthorized activity G06F21/00) · CPC title