Archival data organization and management
US-9092441-B1 · Jul 28, 2015 · US
US9928141B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9928141-B1 |
| Application number | US-201514860706-A |
| Country | US |
| Kind code | B1 |
| Filing date | Sep 21, 2015 |
| Priority date | Sep 21, 2015 |
| Publication date | Mar 27, 2018 |
| Grant date | Mar 27, 2018 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Techniques for exploiting variable media sizes to create new redundancy encoded data storage systems are described herein. A set of storage devices is selected based at least in part on each storage device having an available capacity and, using the set of storage devices, a set of shards for a redundancy encoded data storage system is generated such that each shard of the set of shards has a storage capacity corresponding to the minimum available capacity of the set of storage devices.
Opening claim text (preview).
What is claimed is: 1. A computer-implemented method, comprising: under the control of one or more computer systems configured with executable instructions, selecting a first set of shards defined by a first redundancy encoding scheme, each shard of the first set of shards having a corresponding storage device of a set of storage devices configured to store the shard, each corresponding storage device having a consumed capacity and an available capacity, the available capacities of the set of storage devices including a plurality of available capacities; selecting a subset of the set of storage devices, the subset including a number of storage devices sufficient to implement a second redundancy encoding scheme, each storage device of the subset of the set of storage devices selected based at least in part on the available capacity of the storage device being greater than a threshold value; determining a minimum available capacity of the subset of the set of storage devices, the minimum available capacity determined such that each storage device of the subset of the set of storage devices has a corresponding available capacity greater than the minimum available capacity; generating a storage partition on each storage device of the set of storage devices, the storage partition having a storage capacity not greater than the minimum available capacity; and generating a second set of shards defined by a second redundancy encoding scheme, each shard of the second set of shards having a corresponding storage device of the subset of the set of storage devices configured to store the shard in the storage partition of the shard. 2. The computer-implemented method of claim 1 , wherein the first redundancy encoding scheme is a bundle encoding scheme. 3. The computer-implemented method of claim 1 , wherein the second redundancy encoding scheme is a bundle encoding scheme. 4. The computer-implemented method of claim 1 , wherein the first redundancy encoding scheme is a grid encoding scheme. 5. The computer-implemented method of claim 1 , wherein the second redundancy encoding scheme is a grid encoding scheme. 6. The computer-implemented method of claim 1 , wherein the minimum available capacity is determined based at least in part on an available capacity of a storage device of the subset of the set of storage devices that is determined to have less available capacity than any other storage device of the subset. 7. A system, comprising at least one computing device configured to implement one or more services, wherein the one or more services are configured to: select a set of storage devices; select a subset of the set of storage devices such that the subset of the set of storage devices individually and collectively have capacity to implement a grid encoding scheme, the subset of the set of storage devices having a minimum available capacity based at least in part on the minimum available capacity not being greater than a corresponding available capacity of each storage device of the subset of the set of storage devices; and generate a set of shards defined by the grid encoding scheme, each shard of the set of shards: reproducible from a plurality of other shards of the set of shards using the grid encoding scheme; and having a corresponding storage device of the subset of the set of storage devices configured to store the shard in a storage partition of the storage device, the storage partition having a storage capacity not greater than the minimum available capacity. 8. The system of claim 7 , wherein the subset of the set of storage devices includes a number of storage devices sufficient to implement the grid encoding scheme. 9. The system of claim 7 , wherein the grid encoding scheme is a grid of shards, the grid of shards indexed by row and column, the grid of shards comprising a set of data shards and a set of derived shards, wherein each shard of the grid of shards has a corresponding row, a corresponding column, and a corresponding storage device of the subset of the set of storage devices that stores the shard, each shard of the grid of shards configured such that the shard is reproducible from other shards associated with the row and the shard is reproducible from other shards associated with the column. 10. The system of claim 9 , wherein the grid of shards includes at least: a first linear redundancy code, the first linear redundancy code selected such that each shard of the grid of shards is reproducible from other shards associated with the row using the first linear redundancy code; and a second linear redundancy code, the second linear redundancy code selected such that each shard of the grid of shards is reproducible from other shards associated with the column using the second linear redundancy code. 11. The system of claim 10 , wherein: the first linear redundancy code is at least one of: a parity code, a Reed-Solomon code, a repetition code, a cyclic code, a Hamming code, a Reed-Muller code, a Goppa code, a BCH code, a Golay code, or an expander code; and the second linear redundancy code is at least one of: a parity code, a Reed-Solomon code, a repetition code, a cyclic code, a Hamming code, a Reed-Muller code, a Goppa code, a BCH code, a Golay code, or an expander code. 12. The system of claim 10 , the one or more services are further configured to at least: select a shard to repair; select a set of other shards associated with the row; and reproduce the shard from a subset of the set of other shards associated with the row using the first linear redundancy code. 13. The system of claim 10 , the one or more services are further configured to at least: select a shard to repair; select a set of other shards associated with the column; and reproduce the shard from a subset of the set of other shards associated with the column using the second linear redundancy code. 14. The system of claim 7 , wherein each storage device of the set of storage devices is at least one of: magnetic tape, magnetic disk, optical disk, memory resistor, flash memory, or computer memory. 15. The system of claim 7 , wherein each storage device of the set of storage devices is configured to store a corresponding shard of a second set of shards, the second set of shards defined by a second redundancy encoding scheme. 16. The system of claim 15 , wherein the second redundancy encoding scheme is a bundle encoding scheme. 17. The system of claim 15 , wherein the second redundancy encoding scheme is a grid encoding scheme. 18. A non-transitory computer-readable storage medium having stored thereon executable instructions that, upon execution by one or more processors of a computer system, cause the computer system to at least: obtain capacity information for a set of storage devices; select, based at least in part on the capacity information, a subset of the set of storage devices such that each storage device in the subset of the set of storage devices has an available capacity sufficient to implement a grid encoding scheme; and distribute a set of shards among the storage devices of the subset of the set of storage devices based at least in part on the grid encoding scheme. 19. The non-transitory computer-readable storage medium of claim 18 , wherein each shard of the set of shards has a corresponding datacenter location, the corresponding datacenter location based at least in part on a geographical location of a corresponding datacenter. 20. The non-transitory computer-readable storage medium of claim 18 , wherei
Linear codes · CPC title
Parity data used in redundant arrays of independent storages, e.g. in RAID systems · CPC title
using codes or arrangements adapted for a specific type of error (G06F11/1048 takes precedence) · CPC title
with erasure correction and erasure determination, e.g. for packet loss recovery or setting of erasures for the decoding of Reed-Solomon codes · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.