Dense tree volume metadata organization
US-2015081966-A1 · Mar 19, 2015 · US
US9405783B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9405783-B2 |
| Application number | US-201314044624-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 2, 2013 |
| Priority date | Oct 2, 2013 |
| Publication date | Aug 2, 2016 |
| Grant date | Aug 2, 2016 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
In one embodiment, a technique is provided for distributing data and associated metadata within a distributed storage architecture. A set of hash tables that embody mappings of cluster-wide identifiers associated with storage locations are stored for write data of write requests organized into extents. A hash value is generated from a hash function applied to each extent. The hash value is overloaded and used for multiple purposes within the distributed storage architecture, including (i) a remainder computation on the hash value to select a bucket of a plurality of buckets representative of the extents, (ii) a hash table selector of the hash value to select a hash table from the set of hash tables, and (iii) a hash table index computed from the hash value to select an entry from a plurality of entries of the selected hash table having a cluster-wide identifier identifying a storage location for the extent.
Opening claim text (preview).
What is claimed is: 1. A system comprising: a central processing unit (CPU) of a node of a cluster having additional nodes, each node coupled to one or more solid state drives (SSDs); and a memory coupled to the CPU and configured to store a set of hash tables embodying mappings of cluster-wide identifiers associated with storage locations on the SSDs for write data of write requests organized into extents, the memory further configured to store a storage input/output (I/O) stack having a plurality of layers that cooperate with components of the additional nodes to provide a distributed storage architecture of the cluster, the layers of the storage I/O stack implemented as one or more processes executable by the CPU to: generate a hash value from a hash function applied to each extent; and overload the hash value for multiple purposes within the distributed storage architecture, including (i) a remainder computation on the hash value to select a bucket of a plurality of buckets representative of the extents, (ii) a hash table selector of the hash value to select a hash table from the set of hash tables, and (iii) a hash table index computed from the hash value to select an entry from a plurality of entries of the selected hash table having a cluster-wide identifier identifying a SSD storage location for an extent. 2. The system of claim 1 wherein overload of the hash value is achieved by applying prime divisors to the hash value. 3. The system of claim 1 wherein the hash function includes a property of block data randomization. 4. The system of claim 1 wherein the hash function includes a property that changing of bits of the extent results in each bit independently having a same chance of changing in the hash value. 5. The system of claim 1 wherein the hash function includes a distribution property that ensures the entries of the selected hash table are substantially balanced and substantially evenly accessed. 6. The system of claim 1 wherein the remainder computation is based on modulus arithmetic. 7. The system of claim 6 wherein the modulus arithmetic comprises a remainder of the hash value modulo a number of buckets. 8. A method comprising: storing a set of hash tables that embody mappings of cluster-wide identifiers associated with storage locations on one or more solid state drives (SSDs) of a distributed storage architecture for write data of write requests organized into extents; generating a hash value from a hash function applied to each extent; and overloading the hash value for multiple purposes within the distributed storage architecture, including (i) a remainder computation on the hash value to select a bucket of a plurality of buckets representative of the extents, (ii) a hash table selector of the hash value to select a hash table from the set of hash tables, and (iii) a hash table index computed from the hash value to select an entry from a plurality of entries of the selected hash table having a cluster-wide identifier identifying a SSD storage location for an extent. 9. The method of claim 8 wherein overloading the hash value further comprises: applying prime divisors to the hash value. 10. The method of claim 8 wherein the hash function includes a property of data randomization. 11. The method of claim 8 wherein the hash function includes a property that changing of bits of the extent results in each bit independently having a same chance of changing in the hash value. 12. The method of claim 8 wherein the hash function includes a distribution property that ensures the entries of the selected hash table are substantially balanced and substantially evenly accessed. 13. The method of claim 8 wherein the remainder computation is based on modulus arithmetic. 14. The method of claim 13 wherein the modulus arithmetic comprises a remainder of the hash value modulo a number of buckets. 15. A non-transitory computer readable medium including program instructions for execution on one or more processors, the program instructions when executed operable to: store a set of hash tables that embody mappings of cluster-wide identifiers associated with storage locations on one or more solid state drives (SSDs) of a distributed storage architecture for write data of write requests organized into extents; generate a hash value from a hash function applied to each extent; and overload the hash value for multiple purposes within the distributed storage architecture, including (i) a remainder computation on the hash value to select a bucket of a plurality of buckets representative of the extents, (ii) a hash table selector of the hash value to select a hash table from the set of hash tables, and (iii) a hash table index computed from the hash value to select an entry from a plurality of entries of the selected hash table having a cluster-wide identifier identifying a SSD storage location for an extent. 16. The non-transitory computer readable medium of claim 15 wherein overload of the hash value is achieved by applying prime divisors to the hash value. 17. The non-transitory computer readable medium of claim 15 wherein the hash function includes a property of block data randomization. 18. The non-transitory computer readable medium of claim 15 wherein the hash function includes a property that changing of bits of the extent results in each bit independently having a same chance of changing in the hash value. 19. The non-transitory computer readable medium of claim 15 wherein the hash function includes a distribution property that ensures the entries of the selected hash table are substantially balanced and substantially evenly accessed. 20. The non-transitory computer readable medium of claim 15 wherein the remainder computation is based on modulus arithmetic.
Saving storage space on storage systems · CPC title
Physics · mapped topic
De-duplication techniques · CPC title
Physics · mapped topic
Physics · mapped topic
Related publications grouped by family.
Answers are generated from the same data shown on this page.