US11429517B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11429517-B2 |
| Application number | US-201816012990-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 20, 2018 |
| Priority date | Jun 20, 2018 |
| Publication date | Aug 30, 2022 |
| Grant date | Aug 30, 2022 |
Official abstract text for this publication.
A storage system in one embodiment comprises multiple storage nodes each comprising at least one storage device. Each of the storage nodes further comprises a set of processing modules configured to communicate over one or more networks with corresponding sets of processing modules on other ones of the storage nodes. The sets of processing modules of the storage nodes each comprise at least one control module. The storage system is configured to assign portions of a logical address space of the storage system to respective ones of the control modules, to receive a plurality of tracks of data records in a count-key-data format, and to store the tracks in respective ones of the portions of the logical address space assigned to respective ones of the control modules. Each of the tracks is stored in its entirety in the portion of the logical address space assigned to a corresponding one of the control modules.
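The placement rule in the abstract can be illustrated with a minimal sketch. It assumes a 16 KiB native page, four-page portions, and round-robin assignment of portions to control modules; none of these values or names (`PAGE_SIZE`, `control_module_for`, `pages_for_track`) come from the patent text shown here, which only requires that each track sit entirely inside one control module's portion.

```python
# Hypothetical parameters, chosen only for illustration.
PAGE_SIZE = 16 * 1024          # assumed native page size in bytes
PAGES_PER_PORTION = 4          # assumed portion size in pages
PORTION_SIZE = PAGE_SIZE * PAGES_PER_PORTION


def control_module_for(offset, num_control_modules):
    """Map a logical byte offset to the control module owning its portion.

    Round-robin assignment is an assumption; the source only says that
    portions are assigned to respective control modules.
    """
    portion_index = offset // PORTION_SIZE
    return portion_index % num_control_modules


def pages_for_track(track_index, track_size):
    """Return the logical page offsets used to store one whole track.

    Each track starts at a portion boundary, so it never straddles two
    portions and therefore never spans two control modules.
    """
    if track_size > PORTION_SIZE:
        raise ValueError("track does not fit in a single portion")
    start = track_index * PORTION_SIZE
    num_pages = -(-track_size // PAGE_SIZE)  # ceiling division
    return [start + i * PAGE_SIZE for i in range(num_pages)]


if __name__ == "__main__":
    offsets = pages_for_track(track_index=5, track_size=56 * 1024)
    owners = {control_module_for(o, num_control_modules=3) for o in offsets}
    print(len(offsets), owners)
```

Aligning every track to a portion boundary wastes the unused tail of the portion, but it keeps all pages of a track under a single control module, which is the property the abstract describes.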
Opening claim text (preview).
What is claimed is:

1. An apparatus comprising:
a content addressable storage system comprising a plurality of storage nodes each comprising at least one storage device, the content addressable storage system comprising separate logical and physical storage layers;
each of the storage nodes further comprising:
a processor coupled to a memory; and
a set of processing modules configured to communicate over one or more networks with corresponding sets of processing modules on other ones of the storage nodes;
the sets of processing modules of the storage nodes each comprising at least one control module and at least one data module;
the control modules of the sets of processing modules of the storage nodes collectively implementing the logical storage layer of the content addressable storage system by mapping logical blocks targeted by input-output operations to respective content-based signatures for delivery to the data modules;
the data modules of the sets of processing modules of the storage nodes collectively implementing the physical storage layer of the content addressable storage system by mapping content-based signatures received from the control modules to respective physical blocks in the storage devices of the storage nodes;
the content addressable storage system being configured:
to assign different portions of a logical address space of the logical storage layer of the content addressable storage system to respective ones of the control modules;
to assign different portions of a content-based signature space of the physical storage layer of the content addressable storage system to respective ones of the data modules;
to receive a plurality of tracks of data records in a count-key-data format; and
to store the tracks in respective ones of the portions of the logical address space assigned to respective ones of the control modules;
wherein each of the tracks is stored in its entirety in the portion of the logical address space assigned to a corresponding one of the control modules; and
wherein different data pages of a given one of the tracks are stored in different portions of the content-based signature space assigned to different ones of the data modules in accordance with variations in content between the different data pages.

2. The apparatus of claim 1 wherein the sets of processing modules collectively comprise at least a portion of a distributed storage controller of the storage system.

3. The apparatus of claim 2 wherein the assignment of portions of the logical address space of the storage system to respective ones of the control modules is implemented at least in part by at least one system-wide management module of the distributed storage controller.

4. The apparatus of claim 1 wherein the sets of processing modules each comprise at least one routing module in addition to the control module and the data module.

5. The apparatus of claim 4 wherein the routing module is configured: to receive a given one of the tracks of data records in one or more communications of a first count-key-data protocol; and to route the given track of data records to the control module in one or more communications of a second count-key-data protocol different than the first count-key-data protocol for storage in at least one of the storage devices.

6. The apparatus of claim 4 wherein communications between the routing module and the control module are stateless so as to thereby permit first and second distinct routing modules to communicate with the control module without requiring transfer of state information from the first routing module to the second routing module.

7. The apparatus of claim 1 wherein a native page size of the storage system is less than a size of a given one of the tracks of data records such that the given track of data records is stored utilizing multiple pages of the portion of the logical address space assigned to the corresponding one of the control modules.

8. The apparatus of claim 7 wherein the portion of the logical address space has a size defined as a multiple of the native page size of the storage system.

9. The apparatus of claim 1 wherein at least a subset of the portions of the logical address space comprise respective equal-size portions each comprising a designated number of pages in a native page size of the storage system.

10. The apparatus of claim 9 wherein the designated number of pages in each of the equal-sized portions is at least four pages.

11. The apparatus of claim 1 wherein multiple distinct portions of the logical address space are assigned to at least one of the control modules.

12. The apparatus of claim 1 wherein in conjunction with storing a given one of the tracks of data records in the count-key-data format in one of the portions of the logical address space assigned to one of the control modules, count and key information of the data records is stored in a designated page of a set of pages of the portion and data of the data records is stored in one or more other pages of the set of pages.

13. The apparatus of claim 1 wherein the storage devices comprise respective non-volatile memory devices.

14. A method comprising:
configuring a content addressable storage system to include a plurality of storage nodes each comprising at least one storage device, the content addressable storage system comprising separate logical and physical storage layers, each of the storage nodes further comprising a set of processing modules configured to communicate over one or more networks with corresponding sets of processing modules on other ones of the storage nodes, the sets of processing modules each comprising at least one control module and at least one data module;
the control modules of the sets of processing modules of the storage nodes collectively implementing the logical storage layer of the content addressable storage system by mapping logical blocks targeted by input-output operations to respective content-based signatures for delivery to the data modules;
the data modules of the sets of processing modules of the storage nodes collectively implementing the physical storage layer of the content addressable storage system by mapping content-based signatures received from the control modules to respective physical blocks in the storage devices of the storage nodes;
the method further comprising:
assigning different portions of a logical address space of the logical storage layer of the content addressable storage system to respective ones of the control modules;
assigning different portions of a content-based signature space of the physical storage layer of the content addressable storage system to respective ones of the data modules;
receiving a plurality of tracks of data records in a count-key-data format; and
storing the tracks in respective ones of the portions of the logical address space assigned to respective ones of the control modules;
wherein each of the tracks is stored in its entirety in the portion of the logical address space assigned to a corresponding one of the control modules;
wherein different data pages of a given one of the tracks are stored in different portions of the content-based signature space assigned to different ones of the data modules in accordance with variations in content between the different data pages; and
wherein the method is performed by at least one processing device comprising a processor coupled to a memory.

15. The method of claim 14 wherein the sets of processing modules each comprise at least one routing module in addition to the control module and the data module.

1
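Claims 1 and 12 together describe how a track's pages are laid out and then spread across data modules by content. The following is a rough sketch of that idea, not the patented implementation: SHA-1 stands in for the unspecified content-based signature, the signature space is divided by a simple modulo, and the 16 KiB page size and the helper names (`split_track`, `data_module_for`, `place_track`) are all assumptions made for illustration.

```python
import hashlib

PAGE_SIZE = 16 * 1024  # assumed native page size in bytes


def signature(page):
    """Content-based signature of one page (SHA-1 is a stand-in choice)."""
    return hashlib.sha1(page).digest()


def data_module_for(sig, num_data_modules):
    """Pick the data module owning this part of the signature space.

    A modulo over the leading signature bytes is an assumed partitioning.
    """
    return int.from_bytes(sig[:4], "big") % num_data_modules


def split_track(count_key, data):
    """Lay out a track as pages: count and key information in a designated
    first page, record data in the following pages (zero-padded)."""
    if len(count_key) > PAGE_SIZE:
        raise ValueError("count/key information exceeds one page")
    pages = [count_key.ljust(PAGE_SIZE, b"\0")]
    for i in range(0, len(data), PAGE_SIZE):
        pages.append(data[i:i + PAGE_SIZE].ljust(PAGE_SIZE, b"\0"))
    return pages


def place_track(count_key, data, num_data_modules):
    """Return (signature hex, data module index) for each page of a track."""
    placement = []
    for page in split_track(count_key, data):
        sig = signature(page)
        placement.append((sig.hex(), data_module_for(sig, num_data_modules)))
    return placement


if __name__ == "__main__":
    for sig_hex, module in place_track(b"count+key", b"x" * 40 * 1024, 4):
        print(sig_hex[:12], "-> data module", module)
```

Because placement depends only on page content, two pages with identical bytes produce the same signature and land on the same data module, while pages with different content may land on different modules even though they belong to the same track.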
Improving I/O performance · CPC title
De-duplication techniques · CPC title
Performance improvement · CPC title
Networked environment · CPC title
Configuration or reconfiguration · CPC title